Skip to content

pkg/executor, pkg/kv, pkg/store: propagate SQL digest for MPP task metadata#66763

Open
JaySon-Huang wants to merge 4 commits intopingcap:masterfrom
JaySon-Huang:feat/mpp-sql-digest-propagation
Open

pkg/executor, pkg/kv, pkg/store: propagate SQL digest for MPP task metadata#66763
JaySon-Huang wants to merge 4 commits intopingcap:masterfrom
JaySon-Huang:feat/mpp-sql-digest-propagation

Conversation

@JaySon-Huang
Copy link
Copy Markdown
Contributor

@JaySon-Huang JaySon-Huang commented Mar 7, 2026

What problem does this PR solve?

Issue Number: ref #66762

Problem Summary:
Short MPP queries (below slow log threshold) are hard to correlate between TiDB dispatch logs and TiFlash task logs because SQL digest is not propagated in mpp.TaskMeta.

What changed and how does it work?

  • Add SQLDigest to kv.MPPDispatchRequest.
  • Extract digest from StmtCtx.SQLDigest() in local MPP coordinator, log it in dispatch/retry/error paths, and carry it in dispatch requests.
  • Populate TaskMeta.SqlDigest in TiDB MPP client for dispatch/cancel/establish connection request metadata.
  • Update module wiring to consume the kvproto change (temporary replace to the branch containing TaskMeta.sql_digest) and refresh Bazel dependency metadata via make bazel_prepare.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Validation commands run:

  • go test ./pkg/kv -run TestNonExistent
  • go test ./pkg/store/copr -run TestNonExistent
  • go test ./pkg/executor/internal/mpp -run TestNonExistent
  • make lint
  • make bazel_prepare (with writable local Bazel cache/output paths in this environment)

Before this PR

# tidb.log
[2026/03/07 17:01:43.356 +08:00] [INFO] [local_mpp_coordinator.go:222] ["Dispatch mpp task"] [timestamp=464748308917911554] [ID=1] [QueryTs=1772874103272248530] [LocalQueryId=2] [ServerID=1445] [address=10.2.12.81:3930] [plan="Table(generic_entity)->Sel([eq(test.generic_entity.type_code, CNY) or(ne(test.generic_entity.attr_id_1, ), ne(test.generic_entity.attr_id_2, ))])->Send(-1, )"] [mpp-version=2] [exchange-compression-mode=NONE] [GatherID=1] [resource_group=default]
# tiflash.log
[2026/03/07 17:01:43.369 +08:00] [INFO] [MPPTaskStatistics.cpp:139] ["{\"query_tso\":464748308917911554,\"task_id\":1,\"is_root\":true,\"sender_executor_id\":\"ExchangeSender_18\",\"executors\":[...],...,"status\":\"FINISHED\",\"error_message\":\"\",\"cpu_ru\":0.6666666666666666,\"read_ru\":0.1273956298828125,\"memory_peak\":3059904,..."] [source="mpp_task_tracing MPP<gather_id:1, query_ts:1772874103272248530, local_query_id:2, server_id:1445, start_ts:464748308917911554,task_id:1>"] [thread_id=191]

After this PR, the sql digest is added to the logging

# tidb.log
[2026/03/08 14:41:45.458 +08:00] [INFO] [local_mpp_coordinator.go:244] ["Dispatch mpp task"] [timestamp=464768756710113282] [ID=1] [QueryTs=1772952105388643615] [LocalQueryId=2] [ServerID=1727] [address=10.2.12.81:3930] [plan="Table(generic_entity)->Sel([eq(test.generic_entity.type_code, CNY) or(ne(test.generic_entity.attr_id_1, ), ne(test.generic_entity.attr_id_2, ))])->Send(-1, )"] [mpp-version=3] [exchange-compression-mode=NONE] [GatherID=1] [resource_group=default] [sqlDigest=212e2a5dbe2a960ed1eb83e179cf830acfa105fe40a941c49699002fca309e74]
# tiflash.log
[2026/03/08 14:41:45.619 +08:00] [INFO] [MPPTaskStatistics.cpp:146] ["{\"query_tso\":464768756710113282,\"task_id\":1,\"is_root\":true,\"sender_executor_id\":\"ExchangeSender_19\",\"executors\":[...],...,\"connection_id\":3621781510,\"connection_alias\":\"\",\"sql_digest\":\"212e2a5dbe2a960ed1eb83e179cf830acfa105fe40a941c49699002fca309e74\",...\"status\":\"FINISHED\",\"error_message\":\"\",\"cpu_ru\":1,\"read_ru\":0.1273956298828125,\"memory_peak\":3060307,..."] [source="mpp_task_tracing MPP<gather_id:1, query_ts:1772952105388643615, local_query_id:2, server_id:1727, start_ts:464768756710113282,task_id:1>"] [thread_id=172]

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Enhance MPP observability by propagating SQL digest in TiDB MPP task metadata and related dispatch logs.

Summary by CodeRabbit

  • New Features

    • Query execution now propagates SQL and plan digests across distributed execution paths for improved tracking and diagnostics.
    • Logs expanded to include these diagnostic identifiers for better troubleshooting and visibility.
  • Chores

    • Updated a vendored dependency to a newer version with refreshed source metadata and checksum.

@ti-chi-bot ti-chi-bot bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Mar 7, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai bot commented Mar 7, 2026

Review Complete

Findings: 1 issues
Posted: 0
Duplicates/Skipped: 1

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 7, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot bot commented Mar 7, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cfzjywxk for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tiprow
Copy link
Copy Markdown

tiprow bot commented Mar 7, 2026

Hi @JaySon-Huang. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Bumps vendored github.com/pingcap/kvproto (DEPS.bzl and go.mod) and threads SQLDigest/PlanDigest through the MPP pipeline: session → kv.MPPDispatchRequest → mpp.TaskMeta → dispatch/cancel/connection RPCs, with related logging and minor comment fixes; no other public APIs changed.

Changes

Cohort / File(s) Summary
MPP SQL/Plan digest propagation
pkg/executor/internal/mpp/local_mpp_coordinator.go, pkg/kv/mpp.go, pkg/store/copr/mpp.go
Adds SQLDigest/PlanDigest to kv.MPPDispatchRequest and SqlDigest/PlanDigest to mpp.TaskMeta; extracts digests from session and propagates them into dispatch, cancel, and establish-connection requests; logging updated; minor comment fixes.
Vendored dependency metadata
DEPS.bzl
Updates com_github_pingcap_kvproto go_repository entry: new sha256, updated strip_prefix, and updated urls pointing to the newer pseudo-version zip.
Go modules
go.mod
Bumps github.com/pingcap/kvproto pseudo-version to v0.0.0-20260317140554-2bc1b358068b.

Sequence Diagram(s)

sequenceDiagram
  participant Session as "Session\n(SessionVars/StmtsCtx)"
  participant Coordinator as "LocalMPPCoordinator"
  participant KV as "KV Dispatcher\n(kv package)"
  participant Store as "Store / Coprocessor\n(mpp package)"

  Session->>Coordinator: Start MPP execution (includes SQLDigest, PlanDigest)
  Coordinator->>KV: Build & send MPPDispatchRequest\n(include SQLDigest, PlanDigest)
  KV->>Store: DispatchTask / EstablishConn / Cancel\n(TaskMeta/ReceiverMeta includes SqlDigest, PlanDigest)
  Store->>Store: Execute task / log using digests
  Store-->>KV: Task responses / errors (digests preserved)
  KV-->>Coordinator: Dispatch response
  Coordinator-->>Session: Report results / errors
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

approved, lgtm

Poem

🐰 I found the query's secret thread,
I tuck the digests in each task's bed,
From session burrow to coprocessor log,
They hop along through dispatch and fog,
Small stamps of truth on every sled.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: propagating SQL digest for MPP task metadata across multiple packages (executor, kv, store).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Warning

Tools execution failed with the following error:

Failed to run tools: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
DEPS.bzl (1)

6816-6826: Track upstream merge for kvproto fork.

The kvproto dependency is temporarily pointing to a forked version (JaySon-Huang/kvproto) to consume the TaskMeta.sql_digest field before it's merged upstream. Ensure this is tracked (e.g., via a follow-up issue or the referenced #66762) so the dependency can be switched back to github.com/pingcap/kvproto once the upstream kvproto PR is merged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DEPS.bzl` around lines 6816 - 6826, The DEPS.bzl entry for
name="com_github_pingcap_kvproto" currently points to the JaySon-Huang fork to
consume TaskMeta.sql_digest; create a tracked follow-up (issue or link to
`#66762`) and add a TODO comment in the same DEPS.bzl entry referencing that
tracking issue/PR; when upstream kvproto is merged, update importpath back to
"github.com/pingcap/kvproto", replace strip_prefix, sha256 and the urls to the
official upstream release, and document the change in the follow-up so we can
revert the forked dependency cleanly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Around line 378-379: The go.mod contains a temporary replace directive
pointing to a personal fork (github.com/JaySon-Huang/kvproto) which must be
removed before merge; once the upstream pingcap/kvproto contains the
TaskMeta.SqlDigest field, delete the replace line and update the module
dependency to the canonical github.com/pingcap/kvproto module (ensuring the
version includes TaskMeta.SqlDigest) so imports that reference
TaskMeta.SqlDigest resolve to the official upstream package rather than the
fork.

---

Nitpick comments:
In `@DEPS.bzl`:
- Around line 6816-6826: The DEPS.bzl entry for
name="com_github_pingcap_kvproto" currently points to the JaySon-Huang fork to
consume TaskMeta.sql_digest; create a tracked follow-up (issue or link to
`#66762`) and add a TODO comment in the same DEPS.bzl entry referencing that
tracking issue/PR; when upstream kvproto is merged, update importpath back to
"github.com/pingcap/kvproto", replace strip_prefix, sha256 and the urls to the
official upstream release, and document the change in the follow-up so we can
revert the forked dependency cleanly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 645f4ad1-6b39-4415-a206-04e789ae722a

📥 Commits

Reviewing files that changed from the base of the PR and between 999e8f4 and 528b1a7.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • DEPS.bzl
  • go.mod
  • pkg/executor/internal/mpp/local_mpp_coordinator.go
  • pkg/kv/mpp.go
  • pkg/store/copr/mpp.go

@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

/cc @windtalker @gengliqi PTAL. Can we push forward the kvproto changes and these tidb changes?

@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

Copy link
Copy Markdown
Contributor

@windtalker windtalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this pr has many unrelated changes?

@JaySon-Huang JaySon-Huang force-pushed the feat/mpp-sql-digest-propagation branch from 7b82377 to 223dbcb Compare March 17, 2026 13:09
@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

JaySon-Huang commented Mar 17, 2026

why this pr has many unrelated changes?

@windtalker I think the "many unrelated changes" you means is so many line changes in the file go.sum? I think it is caused by kvproto refer to a non merged commit in kvproto. I see the same situation in this PR when using a temporary branch too: e9fe056

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@DEPS.bzl`:
- Around line 6819-6825: Remove the temporary replace directive that pins
github.com/JaySon-Huang/kvproto in go.mod (the replace at or around go.mod:379),
revert it to the official github.com/pingcap/kvproto module, then regenerate
Bazel metadata by running make bazel_prepare to update DEPS.bzl; do not manually
edit DEPS.bzl—ensure the replace is removed from go.mod first so the generated
DEPS.bzl references the official kvproto.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 222190c5-bec1-42f8-b1bc-04cb4a0bc74f

📥 Commits

Reviewing files that changed from the base of the PR and between 7b82377 and 369d3a1.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • DEPS.bzl
  • go.mod
  • pkg/executor/internal/mpp/local_mpp_coordinator.go
  • pkg/kv/mpp.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod

@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 17, 2026
@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

kvproto has been upgraded to a commit on pingcap/kvproto. @windtalker @gengliqi PTAL

@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow bot commented Mar 17, 2026

@JaySon-Huang: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 77.5755%. Comparing base (b52c9b8) to head (8e8bd74).
⚠️ Report is 16 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #66763        +/-   ##
================================================
- Coverage   77.7082%   77.5755%   -0.1328%     
================================================
  Files          2016       1939        -77     
  Lines        552461     543568      -8893     
================================================
- Hits         429308     421676      -7632     
- Misses       121411     121890       +479     
+ Partials       1742          2      -1740     
Flag Coverage Δ
integration 41.0055% <0.0000%> (-7.1172%) ⬇️
unit 76.7063% <96.1538%> (+0.4771%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 63.6984% <ø> (+6.6885%) ⬆️
parser ∅ <ø> (∅)
br 49.4598% <ø> (-11.3927%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@JaySon-Huang JaySon-Huang force-pushed the feat/mpp-sql-digest-propagation branch from 9ca1e48 to 8e8bd74 Compare March 19, 2026 04:46
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/store/copr/mpp.go (1)

204-206: Consider multi-line formatting for readability.

This single-line struct initialization is quite long and harder to scan compared to the multi-line style used in DispatchMPPTask (lines 105-116). Consider aligning with the existing style for consistency.

💅 Proposed formatting
 	firstReq := reqs[0]
 	killReq := &mpp.CancelTaskRequest{
-		Meta: &mpp.TaskMeta{StartTs: firstReq.StartTs, GatherId: firstReq.GatherID, QueryTs: firstReq.MppQueryID.QueryTs, LocalQueryId: firstReq.MppQueryID.LocalQueryID, ServerId: firstReq.MppQueryID.ServerID, MppVersion: firstReq.MppVersion.ToInt64(), ResourceGroupName: firstReq.ResourceGroupName, SqlDigest: firstReq.SQLDigest, PlanDigest: firstReq.PlanDigest},
+		Meta: &mpp.TaskMeta{
+			StartTs:           firstReq.StartTs,
+			GatherId:          firstReq.GatherID,
+			QueryTs:           firstReq.MppQueryID.QueryTs,
+			LocalQueryId:      firstReq.MppQueryID.LocalQueryID,
+			ServerId:          firstReq.MppQueryID.ServerID,
+			MppVersion:        firstReq.MppVersion.ToInt64(),
+			ResourceGroupName: firstReq.ResourceGroupName,
+			SqlDigest:         firstReq.SQLDigest,
+			PlanDigest:        firstReq.PlanDigest,
+		},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/store/copr/mpp.go` around lines 204 - 206, The one-line initialization of
killReq with mpp.CancelTaskRequest and its nested mpp.TaskMeta is hard to read;
refactor the creation of killReq into multi-line, aligned field assignments (one
field per line) similar to DispatchMPPTask, breaking out Meta: &mpp.TaskMeta{
... } across multiple lines and listing StartTs, GatherId, QueryTs,
LocalQueryId, ServerId, MppVersion, ResourceGroupName, SqlDigest, and PlanDigest
each on its own line using firstReq and firstReq.MppQueryID where appropriate to
improve readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/store/copr/mpp.go`:
- Around line 204-206: The one-line initialization of killReq with
mpp.CancelTaskRequest and its nested mpp.TaskMeta is hard to read; refactor the
creation of killReq into multi-line, aligned field assignments (one field per
line) similar to DispatchMPPTask, breaking out Meta: &mpp.TaskMeta{ ... } across
multiple lines and listing StartTs, GatherId, QueryTs, LocalQueryId, ServerId,
MppVersion, ResourceGroupName, SqlDigest, and PlanDigest each on its own line
using firstReq and firstReq.MppQueryID where appropriate to improve readability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 20b20043-f554-4e5a-8860-32efceaf355d

📥 Commits

Reviewing files that changed from the base of the PR and between 9ca1e48 and 8e8bd74.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • DEPS.bzl
  • go.mod
  • pkg/executor/internal/mpp/local_mpp_coordinator.go
  • pkg/kv/mpp.go
  • pkg/store/copr/mpp.go
✅ Files skipped from review due to trivial changes (1)
  • go.mod
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/kv/mpp.go
  • DEPS.bzl

@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow bot commented Mar 19, 2026

@JaySon-Huang: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@JaySon-Huang
Copy link
Copy Markdown
Contributor Author

/test unit-test

@tiprow
Copy link
Copy Markdown

tiprow bot commented Mar 22, 2026

@JaySon-Huang: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/test unit-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants