feat: Fast slow separate by Z-G-H1 · Pull Request #3213 · OpenAtomFoundation/pikiwidb

Z-G-H1 · 2026-01-03T14:19:38Z

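1. Overview

Centered on "command thread-pool borrowing", this PR splits command processing into fast and slow thread pools and introduces a cross-pool borrowing mechanism. It also improves thread-pool monitoring output and online configuration, raising resource utilization and observability under unbalanced load.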

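2. Core features

2.1 Fast/slow command thread-pool separation (Fast / Slow)

During scheduling, each command-processing task is assigned to one of:

Fast pool: handles the vast majority of commands by default
Slow pool: used when slow-cmd-pool is enabled and the command is classified as a slow command (slow-cmd-list)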
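
The default fast/slow routing described above could be sketched roughly as follows. This is only an illustration: PoolKind, ToLower, and ClassifyCommand are hypothetical names, and the sketch assumes the slow-command list is held as a set of lower-case command names, which the PR description does not state.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>

// Hypothetical pool tags. The PR defines a TaskPoolType enum in
// include/pika_define.h, but its members are not shown in this page.
enum class PoolKind { kFast, kSlow };

// Lower-cases a command name so that lookups are case-insensitive.
inline std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Returns the default pool for a command: slow only when the slow pool is
// enabled and the command appears in the configured slow-command list.
// Assumption: slow_cmd_list stores lower-case names.
inline PoolKind ClassifyCommand(const std::string& cmd, bool slow_cmd_pool_enabled,
                                const std::unordered_set<std::string>& slow_cmd_list) {
  if (slow_cmd_pool_enabled && slow_cmd_list.count(ToLower(cmd)) > 0) {
    return PoolKind::kSlow;
  }
  return PoolKind::kFast;
}
```

With the slow pool disabled, every command falls through to the fast pool, matching the "fast pool by default" rule.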

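2.2 Cross-pool borrowing (Borrow) scheduling policy

Borrowing logic is introduced: when one pool is busy and the other is idle, tasks may be submitted to the other pool to share the load.

Fast pool busy, slow pool idle → FAST borrows SLOW
Slow pool busy, fast pool idle → SLOW borrows FAST

The decision entry point is PikaServer::DecidePoolType(...), and the routing entry point is PikaServer::ScheduleClientPool(...).
pool_type is recorded in the task arguments and written into the connection context at the execution entry point, so that:

thread-pool statistics (scheduled/active_tasks/latency) are attributed to the pool that actually executes the task;
behavior can be observed per thread pool.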

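3. Busy/idle indicators

Queue wait time and the occupancy of the thread-pool task queue are used as the indicators.

Queue-wait EMA indicator

Queue wait time: queue_wait_us = dequeue_ts_ - enqueue_ts_
An EMA (exponential moving average) is used to smooth queue-wait.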
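
As a rough illustration of the smoothing step, the sketch below applies one EMA update with a rational smoothing factor, in the spirit of the threadpool-ema-alpha-numerator and threadpool-ema-alpha-denominator options. QueueWaitUs and UpdateEma are hypothetical helpers, and seeding the average with the first sample is an assumption rather than something the PR states.

```cpp
#include <cstdint>

// queue_wait_us = dequeue timestamp - enqueue timestamp (both in microseconds).
// Clamps to 0 if the timestamps are out of order.
inline uint64_t QueueWaitUs(uint64_t enqueue_ts_us, uint64_t dequeue_ts_us) {
  return dequeue_ts_us >= enqueue_ts_us ? dequeue_ts_us - enqueue_ts_us : 0;
}

// One EMA step with alpha = alpha_num / alpha_den:
//   ema' = alpha * sample + (1 - alpha) * ema
// Integer arithmetic only. Assumption: the first sample seeds the average,
// and a degenerate alpha (>= 1 or zero denominator) just returns the sample.
inline uint64_t UpdateEma(uint64_t ema_us, uint64_t sample_us, uint64_t alpha_num,
                          uint64_t alpha_den, bool has_previous) {
  if (!has_previous || alpha_den == 0 || alpha_num >= alpha_den) {
    return sample_us;
  }
  return (alpha_num * sample_us + (alpha_den - alpha_num) * ema_us) / alpha_den;
}
```

For example, with alpha = 1/4, a previous average of 100 us and a new sample of 200 us give (200 + 3 * 100) / 4 = 125 us.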

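Queue-occupancy check for busy/idle:

threadpool-borrow-threshold-percent
threadpool-idle-threshold-percent

Active-task ratio (busy_by_active)

active_tasks tracks, in real time, the number of tasks the pool is currently executing
When active_tasks >= (thread_num * 80%), the pool is considered busy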

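Combined rule (used for borrowing Busy/Idle)

Busy: busy = busy_by_queue || busy_by_ema || busy_by_active
Idle: idle = idle_by_queue && idle_by_ema

Busy check:
If any one signal indicates "busy", the pool is considered busy.

Busy = busy_by_queue || busy_by_ema || busy_by_active

Idle check: more conservative
Both signals must indicate "idle" before the pool is considered idle and borrowing is allowed.

Idle = idle_by_queue && idle_by_ema && idle_by_active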
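
The combined rule can be sketched as below. All struct and field names are hypothetical, the comparison directions (>= for busy, <= for idle) are assumptions, and the idle check uses the two-signal form (queue and EMA); the description also lists a variant that adds idle_by_active.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical snapshot of one pool's load; field names are illustrative.
struct PoolSnapshot {
  std::size_t queue_size;      // tasks currently waiting
  std::size_t max_queue_size;  // queue capacity
  std::size_t active_tasks;    // tasks currently executing
  std::size_t thread_num;      // worker threads in the pool
  uint64_t ema_wait_us;        // EMA of queue-wait time, microseconds
};

// Hypothetical thresholds, loosely mirroring the config names in the description.
struct PoolThresholds {
  std::size_t borrow_threshold_percent;  // queue occupancy at or above => busy
  std::size_t idle_threshold_percent;    // queue occupancy at or below => idle
  uint64_t busy_ema_us;                  // EMA wait at or above => busy
  uint64_t idle_ema_us;                  // EMA wait at or below => idle
};

inline std::size_t Percent(std::size_t part, std::size_t whole) {
  return whole == 0 ? 0 : part * 100 / whole;
}

// Busy: any single signal is enough.
inline bool IsBusy(const PoolSnapshot& p, const PoolThresholds& t) {
  bool busy_by_queue = Percent(p.queue_size, p.max_queue_size) >= t.borrow_threshold_percent;
  bool busy_by_ema = p.ema_wait_us >= t.busy_ema_us;
  // active_tasks >= thread_num * 80%, written without floating point.
  bool busy_by_active = p.thread_num > 0 && p.active_tasks * 100 >= p.thread_num * 80;
  return busy_by_queue || busy_by_ema || busy_by_active;
}

// Idle: more conservative, every signal must agree.
inline bool IsIdle(const PoolSnapshot& p, const PoolThresholds& t) {
  bool idle_by_queue = Percent(p.queue_size, p.max_queue_size) <= t.idle_threshold_percent;
  bool idle_by_ema = p.ema_wait_us <= t.idle_ema_us;
  return idle_by_queue && idle_by_ema;
}
```

Using OR for busy and AND for idle makes the policy asymmetric: a pool is quick to ask for help but slow to be treated as a lender, which limits the risk of pushing work onto a pool that is about to become loaded.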

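4. Observability

Thread-pool runtime information (info threadpool)
The thread-pool information output is extended to include:

fast/slow pool: size, queue_size, max_queue_size, usage
Borrow statistics: borrow_attempts
Scheduling statistics: tasks_scheduled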

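5. Dynamic parameter adjustment

5.1 New/extended CONFIG GET

Supported queries:

threadpool-borrow-enable
threadpool-borrow-threshold-percent
threadpool-idle-threshold-percent

5.2 New/extended CONFIG SET

Can be set online:

thread-pool-size (dynamically resizes the fast pool's thread count; allowed only when the queue is empty)
slow-cmd-thread-pool-size (dynamically resizes the slow pool's thread count; allowed only when the queue is empty)
threadpool-borrow-enable
threadpool-borrow-threshold-percent
threadpool-idle-threshold-percent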

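6. New pika.conf options (thread-pool EMA parameters)

The following parameters are added to the configuration file to control the EMA and its thresholds (unit: microseconds):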

threadpool-ema-alpha-numerator
threadpool-ema-alpha-denominator
threadpool-fast-busy-threshold
threadpool-fast-idle-threshold
threadpool-slow-busy-threshold
threadpool-slow-idle-threshold

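7. Scheduling path (from receiving a request to enqueueing/borrowing)

After the network thread parses a command, it builds the background task arguments (including redis_cmds, pool_type, etc.).
The scheduling entry point ScheduleClientPool calls DecidePoolType to choose the target pool:

Default fast/slow routing
Cross-pool submission when the borrowing conditions are met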
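
A minimal sketch of the pool decision, assuming the busy/idle flags have already been computed for each pool. PoolKind, PoolLoad, and DecidePool are hypothetical stand-ins; the real PikaServer::DecidePoolType signature is not shown in this page and may differ.

```cpp
// Hypothetical pool tags and load flags; not the PR's actual types.
enum class PoolKind { kFast, kSlow };

struct PoolLoad {
  bool busy;
  bool idle;
};

// Picks the pool that will run a task. `preferred` is the default fast/slow
// classification. The task moves to the other pool only when borrowing is
// enabled, the preferred pool is busy, and the other pool is idle.
inline PoolKind DecidePool(PoolKind preferred, bool borrow_enabled,
                           const PoolLoad& fast, const PoolLoad& slow) {
  if (!borrow_enabled) {
    return preferred;
  }
  if (preferred == PoolKind::kFast && fast.busy && slow.idle) {
    return PoolKind::kSlow;  // FAST borrows SLOW
  }
  if (preferred == PoolKind::kSlow && slow.busy && fast.idle) {
    return PoolKind::kFast;  // SLOW borrows FAST
  }
  return preferred;
}
```

When both pools are busy, or the other pool is merely "not busy" without being idle, the task stays in its preferred pool.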

coderabbitai · 2026-01-03T14:19:45Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.


Copilot

Pull request overview

This PR implements a fast/slow thread pool separation mechanism with cross-pool borrowing to improve resource utilization under unbalanced load conditions. Commands are classified and routed to either a fast or slow pool, with intelligent borrowing when one pool is busy and another is idle.

Key changes include:

Thread pool task borrowing mechanism based on queue occupancy and EMA-smoothed wait times
Enhanced observability with thread pool metrics (latency buckets, scheduled/completed task counts, borrow attempts)
Dynamic thread pool resizing via CONFIG SET commands

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 15 comments.


File	Description
conf/pika.conf	Adds thread pool borrowing configuration parameters, EMA thresholds, and enables slow-cmd-pool by default
tools/pika_migrate/conf/pika.conf	Contains an unintended typo in userblacklist and adds blank line
include/pika_define.h	Defines TaskPoolType enum for fast/slow pool classification
include/pika_conf.h	Adds configuration fields and accessors for borrowing thresholds and EMA parameters
include/pika_server.h	Introduces ThreadPoolMetrics struct and methods for pool management, borrowing logic, and EMA tracking
include/pika_admin.h	Extends INFO command to support "threadpool" section
include/pika_client_conn.h	Adds pool_type tracking to BgTaskArg and connection state
include/pika_binlog.h	Fixes printf format specifier from %llu to %lu
src/pika_conf.cc	Loads borrowing and EMA configuration from pika.conf
src/pika_server.cc	Implements core borrowing logic, metrics tracking, EMA calculations, pool resizing, and decision algorithms
src/pika_client_conn.cc	Records queue wait times, updates EMA statistics, and tracks per-pool metrics during task execution
src/pika_admin.cc	Implements CONFIG GET/SET for borrowing parameters and INFO threadpool output


Copilot · 2026-01-03T14:23:37Z

src/pika_server.cc

+}
+
+bool PikaServer::ResizeSlowCmdThreadPool(size_t new_size) {
+  if (new_size == 0 || new_size > 24) {


The upper bound check is inconsistent with the error message. The code checks for 24 but the error message states the limit is 1024. For slow command thread pool, a limit of 1024 would be reasonable, but the validation should be consistent. Consider either changing the check to allow up to 1024 or updating the error message to reflect the actual limit of 24.

Suggested change

-  if (new_size == 0 || new_size > 24) {
+  if (new_size == 0 || new_size > 1024) {

Copilot · 2026-01-03T14:23:38Z

src/pika_server.cc

 // QUEUE_SIZE_THRESHOLD_PERCENTAGE is used to represent a percentage value and should be within the range of 0 to 100.
 const size_t QUEUE_SIZE_THRESHOLD_PERCENTAGE = 75;

+// ThreadPoolMetrics 


The comment has a typo: "threadpool metrics" should be capitalized or include proper punctuation as "ThreadPool Metrics" or "thread pool metrics".

Suggested change

-// ThreadPoolMetrics
+// ThreadPool Metrics

Copilot · 2026-01-03T14:23:38Z

src/pika_server.cc

-      role_(PIKA_ROLE_SINGLE) {
+      role_(PIKA_ROLE_SINGLE),
+      fast_pool_metrics_(new ThreadPoolMetrics()),
+      slow_pool_metrics_(new ThreadPoolMetrics()){


The initialization uses curly braces instead of parentheses which is inconsistent with the previous line (line 110). While both are valid, consistency within the same initialization list is preferred.

Copilot · 2026-01-03T14:23:38Z

tools/pika_migrate/conf/pika.conf

 # [NOTICE] If this user password is the same as admin password (including both being empty),
 # the value of this parameter will be ignored and all users are considered as administrators,
-# in this scenario, users are not subject to the restrictions imposed by the userblacklist.
+# in this scenario, users are not subject to the restrictions imposed by the userb lacklist.


There's an extra space inserted in the middle of "userblacklist", changing it to "userb lacklist". This appears to be an unintentional edit and should be reverted.

Suggested change

-# in this scenario, users are not subject to the restrictions imposed by the userb lacklist.
+# in this scenario, users are not subject to the restrictions imposed by the userblacklist.

Copilot · 2026-01-03T14:23:39Z

src/pika_server.cc

+  while (SlowCmdThreadPoolCurQueueSize() != 0) {
+    std::this_thread::sleep_for(std::chrono::milliseconds(100));
+  }


Blocking wait in resize operation could cause performance issues. The code waits in a tight loop checking queue size every 100ms, which blocks the CONFIG SET command execution. If the queue never drains (e.g., due to continuous load), this could hang indefinitely. Consider adding a timeout or a maximum retry count to prevent indefinite blocking.
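
One way to bound the wait, as the comment suggests, is to poll against a deadline and report failure on timeout. The helper below is a hypothetical sketch, not code from the PR; the caller would pass something like SlowCmdThreadPoolCurQueueSize as the queue-size callback and choose its own timeout.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Polls `queue_size` until it reports zero or `timeout` elapses.
// Returns true if the queue drained in time, false on timeout.
inline bool WaitForDrain(const std::function<std::size_t()>& queue_size,
                         std::chrono::milliseconds timeout,
                         std::chrono::milliseconds poll_interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (queue_size() != 0) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // give up instead of blocking the caller indefinitely
    }
    std::this_thread::sleep_for(poll_interval);
  }
  return true;
}
```

On a false return, the resize path could reply with an error to the CONFIG SET caller rather than hanging under continuous load.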

Copilot · 2026-01-03T14:23:40Z

src/pika_admin.cc

+      res_.AppendStringRaw("-ERR Can't resize slow-cmd-pool-size when there are tasks in the queue\r\n");
+      return;
+    }
+    long int slow_cmd_thread_pool_size = (1 > ival || 24 < ival) ? 1 : ival;


The validation bound (24) is inconsistent with the upper bound check in line 2733 which allows values up to 1024. This creates a mismatch where ConfigSet allows larger values than ResizeSlowCmdThreadPool can actually handle. The limits should be consistent across both validation points.

Copilot · 2026-01-03T14:23:40Z

src/pika_server.cc

+  if (last > 0) {
+    uint64_t elapsed = now - last;
+    if (elapsed > 1000) { // If last update was more than 1ms ago, increase alpha to adapt faster
+      alpha_num = std::min(alpha_num * 2, alpha_den / 2);
+    }
+  }


The adaptive alpha calculation could cause issues. When elapsed > 1000 microseconds (1ms), alpha_num is doubled, but this modification is only applied locally and doesn't update the global EMA parameters. This means the adaptation effect is inconsistent - it only affects this one update. Additionally, if multiple threads call this simultaneously with large elapsed times, they could all compute different alpha values leading to unpredictable EMA behavior. Consider either removing this adaptive logic or implementing it more carefully with proper synchronization.

Suggested change

-  if (last > 0) {
-    uint64_t elapsed = now - last;
-    if (elapsed > 1000) { // If last update was more than 1ms ago, increase alpha to adapt faster
-      alpha_num = std::min(alpha_num * 2, alpha_den / 2);
-    }
-  }
+  (void)last; // last is currently unused; retained for potential future logic
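
If the EMA value is shared between worker threads, one possible way to keep updates consistent with a fixed alpha is a compare-and-swap loop. The sketch below is an assumption about how that could look, not the PR's implementation; AtomicEmaUpdate is a hypothetical name, and it treats a stored value of 0 as "no sample yet".

```cpp
#include <atomic>
#include <cstdint>

// Applies one EMA step to a shared value with a compare-and-swap loop, so a
// concurrent update is retried rather than overwritten. The smoothing factor
// alpha = alpha_num / alpha_den is the same for every caller.
// Assumption: a stored value of 0 means no sample has been recorded yet.
inline void AtomicEmaUpdate(std::atomic<uint64_t>& ema, uint64_t sample,
                            uint64_t alpha_num, uint64_t alpha_den) {
  uint64_t old_val = ema.load(std::memory_order_relaxed);
  uint64_t new_val = 0;
  do {
    if (old_val == 0 || alpha_den == 0 || alpha_num >= alpha_den) {
      new_val = sample;
    } else {
      new_val = (alpha_num * sample + (alpha_den - alpha_num) * old_val) / alpha_den;
    }
    // On failure, compare_exchange_weak reloads old_val and the loop recomputes.
  } while (!ema.compare_exchange_weak(old_val, new_val, std::memory_order_relaxed));
}
```

Because every thread uses the same alpha, the result depends only on the order in which samples are applied, not on how long ago the previous update happened.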

Copilot · 2026-01-03T14:23:41Z

include/pika_server.h

+  std::atomic<uint64_t> tasks_completed{0};
+  std::atomic<uint64_t> borrow_attempts{0};
+
+  // latency（1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s, >5s）


The comment is unclear and has a typo. "latency（1ms" uses Chinese parentheses instead of English ones, and the description "latency" should be more specific like "Latency buckets in microseconds".

Suggested change

-  // latency（1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s, >5s）
+  // Latency buckets (1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s, >5s)
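
To make the bucket layout concrete, here is a small sketch that maps a latency to one of the nine buckets named in the comment. kLatencyBoundsUs and LatencyBucket are hypothetical names; measuring in microseconds and treating each upper bound as inclusive are both assumptions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Upper bounds, in microseconds, for the first eight buckets:
// 1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s, 5s. The ninth bucket is ">5s".
constexpr std::array<uint64_t, 8> kLatencyBoundsUs = {
    1000, 5000, 10000, 50000, 100000, 500000, 1000000, 5000000};

// Maps a latency to its bucket index in [0, 8].
// Assumption: each bucket includes its upper bound.
inline std::size_t LatencyBucket(uint64_t latency_us) {
  for (std::size_t i = 0; i < kLatencyBoundsUs.size(); ++i) {
    if (latency_us <= kLatencyBoundsUs[i]) {
      return i;
    }
  }
  return kLatencyBoundsUs.size();  // the ">5s" bucket
}
```

A per-pool histogram would then be an array of nine atomic counters indexed by this function.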

Copilot · 2026-01-03T14:23:41Z

src/pika_server.cc

  }
-  pika_client_processor_->SchedulePool(func, arg);
+  // if slow cmd thread pool disabled
+  if(!g_pika_conf->slow_cmd_pool()) {


Missing space after 'if'. The code has 'if(!g_pika_conf->slow_cmd_pool())' but should be 'if (!g_pika_conf->slow_cmd_pool())' for consistent code style.

Suggested change

-  if(!g_pika_conf->slow_cmd_pool()) {
+  if (!g_pika_conf->slow_cmd_pool()) {

Copilot · 2026-01-03T14:23:41Z

src/pika_admin.cc

+    }
+    size_t current_queue_size = g_pika_server->SlowCmdThreadPoolCurQueueSize();
+    if (current_queue_size > 0) {
+      res_.AppendStringRaw("-ERR Can't resize slow-cmd-pool-size when there are tasks in the queue\r\n");


The error message uses inconsistent terminology. Earlier in the code and function name, it's referred to as "slow-cmd-thread-pool-size", but the error message says "slow-cmd-pool-size". The error message should match the actual configuration parameter name for clarity.

Suggested change

-      res_.AppendStringRaw("-ERR Can't resize slow-cmd-pool-size when there are tasks in the queue\r\n");
+      res_.AppendStringRaw("-ERR Can't resize slow-cmd-thread-pool-size when there are tasks in the queue\r\n");

brother-jin and others added 11 commits November 27, 2025 15:24

feat:Automated fast/slow command separation, help me

2ac82cb

fix_fast

c3797a7

fix fast

c8a029a

fix

699c9bb

fix

67cb8fe

temp commit

39b70af

rm slots

6d11752

test

8e5edb7

fast-slow-separate

a9b4953

modify

077d5ea

fast-slow-separate

21ce137

Copilot AI review requested due to automatic review settings January 3, 2026 14:19

github-actions bot added the Invalid PR Title label Jan 3, 2026

Copilot started reviewing on behalf of Z-G-H1 January 3, 2026 14:20 View session

Copilot AI reviewed Jan 3, 2026


Z-G-H1 changed the title ~~Fast slow separate~~ feat: Fast slow separate Jan 4, 2026

github-actions bot added ✏️ Feature New feature or request and removed Invalid PR Title labels Jan 4, 2026

Z-G-H1 force-pushed the fast-slow-separate branch 7 times, most recently from aa9a567 to cb1696b Compare January 5, 2026 06:22

modify

f030ae5

Z-G-H1 force-pushed the fast-slow-separate branch from cb1696b to f030ae5 Compare January 6, 2026 01:56

fast-slow-separate

3ad1fc0

chejinge changed the base branch from 3.5 to 3.6 January 19, 2026 02:11

Mixficsol approved these changes Jan 26, 2026


Merge branch '3.6' into fast-slow-separate

776e33e

chejinge merged commit d1a606e into OpenAtomFoundation:3.6 Jan 26, 2026
2 checks passed



5 participants
