[Feat][Shuffle] Add a tool that can parse shuffle data #200

Open
wangxinshuo-bolt wants to merge 3 commits into bytedance:main from wangxinshuo-bolt:print_shuffle

wangxinshuo-bolt (Collaborator) commented Feb 4, 2026

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

When troubleshooting data inconsistencies, we often suspect that the shuffle write data is incorrect, but there is currently no way to inspect the shuffle data directly and verify whether it is corrupted. This PR therefore adds a tool that parses shuffle data.

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

    Click to view Benchmark Results
    Paste your google-benchmark or TPC-H results here.
    Before: 10.5s
    After:   8.2s  (+20%)
    
  • Negative Impact: Explained below (e.g., trade-off for correctness).

Checklist (For Author)

  • I have added/updated unit tests (ctest).
  • I have verified the code with local build (Release/Debug).
  • I have run clang-format / linters.
  • (Optional) I have run Sanitizers (ASAN/TSAN) locally for complex C++ changes.
  • No need to test or manual test.

Breaking Changes

  • No

  • Yes (Description: ...)

    Click to view Breaking Changes
    Breaking Changes:
    - Description of the breaking change.
    - Possible solutions or workarounds.
    - Any other relevant information.
    

wangxinshuo-bolt changed the title from "Add printShuffleFile function" to "[Feat][Shuffle] Add printShuffleFile function" on Feb 9, 2026
wangxinshuo-bolt changed the title from "[Feat][Shuffle] Add printShuffleFile function" to "[Feat][Shuffle] Add a tool that can parse shuffle data" on Feb 9, 2026
wangxinshuo-bolt marked this pull request as ready for review on February 9, 2026, 06:40