
feat: sort file groups by statistics during sort pushdown (Sort pushdown phase 2) #21182

Open
zhuqi-lucas wants to merge 6 commits into apache:main from zhuqi-lucas:feat/sort-file-groups-by-statistics


@zhuqi-lucas (Contributor) commented Mar 26, 2026

Which issue does this PR close?

Closes #17348
Closes #19329

Rationale for this change

This PR implements the core optimization described in the EPIC Sort pushdown / partially sorted scans: using file-level min/max statistics to optimize scan order and eliminate unnecessary sort operators.

Currently, when a query has ORDER BY, DataFusion always inserts a SortExec even when the data is already sorted across files. This PR enables:

  1. Sort elimination when files are non-overlapping and internally sorted
  2. Statistics-based file reordering to approximate the requested order
  3. Automatic ordering inference from Parquet sorting_columns metadata (no WITH ORDER needed)

What changes are included in this PR?

Architecture

Query: SELECT ... ORDER BY col ASC [LIMIT N]

  PushdownSort optimizer
        │
        ▼
  FileScanConfig::try_pushdown_sort()
        │
        ├─► FileSource::try_pushdown_sort()
        │     │
        │     ├─ natural ordering matches? ──► Exact
        │     │   (Parquet WITH ORDER or              │
        │     │    inferred from metadata)             ▼
        │     │                           rebuild_with_source(exact=true)
        │     │                             ├─ sort files by min/max stats
        │     │                             ├─ verify non-overlapping
        │     │                             ├─ redistribute across groups
        │     │                             └─► keep output_ordering
        │     │                                  → SortExec removed
        │     │
        │     ├─ reversed ordering? ──► Inexact
        │     │   (reverse_row_groups)        │
        │     │                                ▼
        │     │                    rebuild_with_source(exact=false)
        │     │                      └─► clear output_ordering
        │     │                           → SortExec kept
        │     │
        │     └─ neither ──► Unsupported
        │
        └─► try_sort_file_groups_by_statistics()
              (best-effort: reorder files by stats)
              └─► Inexact if reordered

Three Optimization Paths

Path 1: Sort Elimination (Exact) — removes SortExec entirely

When the file source's natural ordering satisfies the query (e.g., Parquet files with sorting_columns metadata), and files within each group are non-overlapping, the SortExec is completely eliminated.

Before:                              After:
  SortExec [col ASC]                   DataSourceExec [files sorted]
    DataSourceExec [files]             (output_ordering=[col ASC])
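
The non-overlap condition behind Path 1 can be illustrated with a small standalone sketch. This is not the code in file_scan_config.rs: the `FileRange` type, the single integer sort column, and the strict `max < min` comparison are assumptions made for illustration only.

```rust
/// Hypothetical per-file summary: min and max of the sort column.
#[derive(Debug, Clone, PartialEq)]
pub struct FileRange {
    pub name: &'static str,
    pub min: i64,
    pub max: i64,
}

/// Sorts the files by their minimum value, then reports whether every
/// file starts strictly after the previous file ends.
/// Only when this holds can the files be read back to back and still
/// produce globally ascending output (given each file is sorted internally).
pub fn sort_and_check_non_overlapping(files: &mut [FileRange]) -> bool {
    files.sort_by_key(|f| f.min);
    files.windows(2).all(|pair| pair[0].max < pair[1].min)
}
```

With ranges 20-29, 0-9 and 10-19 the files are reordered to f1, f2, f3 and the check passes. With ranges 0-15 and 10-19 the check fails, so a sort would still be needed.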

Path 2: Reverse Scan (Inexact) — existing optimization, enhanced

When the requested order is the reverse of the natural ordering, reverse_row_groups=true is set. SortExec stays but benefits from approximate ordering.
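
A rough illustration of why the reverse scan is only Inexact. It assumes (as the name reverse_row_groups suggests) that the order of row groups is flipped while the rows inside each row group keep their original order. The function names are hypothetical and the row groups are modelled as plain vectors.

```rust
/// Emits row groups in reverse order, leaving rows inside each group untouched.
pub fn reverse_row_groups(row_groups: &[Vec<i64>]) -> Vec<i64> {
    row_groups.iter().rev().flatten().copied().collect()
}

/// True if the values are in non-increasing order.
pub fn is_descending(values: &[i64]) -> bool {
    values.windows(2).all(|w| w[0] >= w[1])
}
```

For an ascending file with row groups [1, 2, 3] and [4, 5, 6], the reversed scan yields 4, 5, 6, 1, 2, 3. Large values arrive first, which helps a descending TopK, but the stream is not fully descending, so the SortExec has to stay.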

Path 3: Statistics-Based File Reordering — new fallback

When the FileSource returns Unsupported, files are reordered by their min/max statistics to approximate the requested order. This benefits TopK queries via better dynamic filter pruning.
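
A minimal sketch of this fallback under simplifying assumptions: one integer sort column, and hypothetical `FileStats`, `Direction` and `Outcome` types that only mirror the Inexact and Unsupported results named in this description. Sorting by min for ASC and by max for DESC is an illustrative choice, not necessarily what the PR does.

```rust
use std::cmp::Reverse;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Direction {
    Asc,
    Desc,
}

/// Hypothetical per-file statistics for the sort column.
#[derive(Debug, Clone, PartialEq)]
pub struct FileStats {
    pub name: &'static str,
    pub min: i64,
    pub max: i64,
}

/// Hypothetical result of the best-effort path.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Files were reordered; output is only approximately sorted.
    Inexact,
    /// Files were already in the best order; nothing changed.
    Unsupported,
}

/// Reorders files so the most promising file for the requested
/// direction is read first. Overlapping files are allowed.
pub fn reorder_by_stats(files: &mut [FileStats], dir: Direction) -> Outcome {
    let before: Vec<&'static str> = files.iter().map(|f| f.name).collect();
    match dir {
        Direction::Asc => files.sort_by_key(|f| f.min),
        Direction::Desc => files.sort_by_key(|f| Reverse(f.max)),
    }
    let after: Vec<&'static str> = files.iter().map(|f| f.name).collect();
    if before != after {
        Outcome::Inexact
    } else {
        Outcome::Unsupported
    }
}
```

Reading the file with the smallest minimum first lets an ascending TopK establish a tight threshold early, so later files are more likely to be pruned by the dynamic filter.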

Multi-Partition Design

For multiple execution partitions, the optimization works per-partition:

Multi-partition (each partition's SortExec eliminated):
  SortPreservingMergeExec [col ASC]        ← O(n) merge, cheap
    DataSourceExec [group 0: f1, f2]       ← no SortExec, parallel I/O
    DataSourceExec [group 1: f3, f4]       ← no SortExec, parallel I/O
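
The merge step in the plan above can be pictured with a generic k-way merge over sorted partitions. This is a textbook heap-based sketch, not the SortPreservingMergeExec implementation; the function name and the use of plain `Vec<i64>` partitions are assumptions.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges partitions that are each already sorted ascending into one
/// ascending sequence. Each row is pushed and popped once, so the cost is
/// O(n log k) for n rows and k partitions, i.e. linear in n for fixed k.
pub fn merge_sorted_partitions(partitions: &[Vec<i64>]) -> Vec<i64> {
    // Heap entries are (value, partition index, row index); Reverse turns
    // the max-heap into a min-heap.
    let mut heap = BinaryHeap::new();
    for (p, rows) in partitions.iter().enumerate() {
        if let Some(&first) = rows.first() {
            heap.push(Reverse((first, p, 0usize)));
        }
    }
    let mut out = Vec::with_capacity(partitions.iter().map(Vec::len).sum());
    while let Some(Reverse((value, p, idx))) = heap.pop() {
        out.push(value);
        // Refill from the partition that just produced a row.
        if let Some(&next) = partitions[p].get(idx + 1) {
            heap.push(Reverse((next, p, idx + 1)));
        }
    }
    out
}
```

The merge produces the same sorted result whether the partitions hold consecutive or interleaved value ranges; no per-partition sort is involved as long as each input is already ordered.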

When bin-packing interleaves file ranges across groups, files are redistributed using consecutive assignment to ensure groups are ordered relative to each other:

Before (bin-packed, interleaved):
  Group 0: [f1(0-9),  f3(20-29)]     groups overlap!
  Group 1: [f2(10-19), f4(30-39)]

After (consecutive assignment):
  Group 0: [f1(0-9),  f2(10-19)]     max=19
  Group 1: [f3(20-29), f4(30-39)]    min=20 > 19 ✓ ordered!
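
Consecutive assignment as drawn above can be sketched as chunking an already sorted file list. The function name is hypothetical, and giving the leftover files to the first groups is an assumption; the PR's redistribution function may split differently.

```rust
/// Splits `sorted_files` (already ordered by their statistics) into
/// `num_groups` consecutive chunks whose sizes differ by at most one.
pub fn assign_consecutively<T: Clone>(sorted_files: &[T], num_groups: usize) -> Vec<Vec<T>> {
    if num_groups == 0 {
        return Vec::new();
    }
    let base = sorted_files.len() / num_groups;
    let extra = sorted_files.len() % num_groups;
    let mut groups = Vec::with_capacity(num_groups);
    let mut start = 0;
    for g in 0..num_groups {
        // The first `extra` groups take one additional file.
        let len = base + usize::from(g < extra);
        groups.push(sorted_files[start..start + len].to_vec());
        start += len;
    }
    groups
}
```

Four files over two groups give [f1, f2] and [f3, f4], matching the diagram; five files over two groups give sizes 3 and 2.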

Automatic Ordering Inference

DataFusion already infers ordering from Parquet sorting_columns metadata (via ordering_from_parquet_metadata). With this PR, the inferred ordering flows through sort pushdown automatically — users don't need WITH ORDER for sorted Parquet files.

Files Changed

| File | Change |
| --- | --- |
| datasource-parquet/src/source.rs | ParquetSource returns Exact when natural ordering satisfies request |
| datasource/src/file_scan_config.rs | Core sort pushdown logic: statistics sorting, non-overlapping detection, multi-group redistribution |
| physical-optimizer/src/pushdown_sort.rs | Module documentation update |
| core/tests/physical_optimizer/pushdown_sort.rs | Updated prefix match test |
| sqllogictest/test_files/sort_pushdown.slt | 5 new test groups (A-E) + updated existing tests |
| benchmarks/src/sort_pushdown.rs | New benchmark for sort elimination |
| benchmarks/{lib,bin/dfbench,bench}.{rs,sh} | Benchmark registration |

Benchmark Results

300k rows, 8 non-overlapping sorted parquet files, single partition:

| Query | Description | Baseline (ms) | Sort Eliminated (ms) | Speedup |
| --- | --- | --- | --- | --- |
| Q1 | ORDER BY col ASC (full scan) | 159 | 91 | 43% |
| Q2 | ORDER BY col ASC LIMIT 100 | 36 | 12 | 67% |
| Q3 | ORDER BY col ASC (wide, SELECT *) | 487 | 333 | 31% |
| Q4 | ORDER BY col ASC LIMIT 100 (wide) | 119 | 30 | 74% |

LIMIT queries benefit most (67-74%) because sort elimination + limit pushdown means only the first few rows are read.
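
The Speedup column reads as a reduction in elapsed time relative to the baseline. A small sketch of that arithmetic, with hypothetical function names, using the rounded millisecond values from the table:

```rust
/// Percentage of baseline time saved: (baseline - optimized) / baseline * 100.
pub fn time_reduction_percent(baseline_ms: f64, optimized_ms: f64) -> f64 {
    (baseline_ms - optimized_ms) / baseline_ms * 100.0
}

/// The same comparison expressed as a ratio (2.0 means twice as fast).
pub fn speedup_factor(baseline_ms: f64, optimized_ms: f64) -> f64 {
    baseline_ms / optimized_ms
}
```

Recomputed from the table, the reductions come out at roughly 42.8%, 66.7%, 31.6% and 74.8% for Q1 to Q4, each within one point of the reported figure. As ratios, that is about 1.75x, 3.0x, 1.46x and 3.97x.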

Tests

Unit Tests (12 new)

  • Unsupported/Inexact/Exact source × sorted/unsorted/overlapping/non-overlapping
  • Multi-group consecutive redistribution (even and uneven distribution)
  • Partial statistics, single-file groups, descending sort

SLT Integration Tests (5 new groups)

  • Test A: Non-overlapping files + WITH ORDER → Sort eliminated (single partition)
  • Test B: Overlapping files → statistics reorder, SortExec retained
  • Test C: LIMIT queries (ASC sort elimination + DESC reverse scan)
  • Test D: target_partitions=2 → SPM + per-partition sort elimination
  • Test E: Inferred ordering from Parquet metadata (no WITH ORDER) — single and multi partition

Integration Tests

  • Updated prefix match test for Exact pushdown behavior
  • All 919 core integration tests pass, all existing SLT tests pass

Test plan

  • cargo test -p datafusion-datasource (111 tests pass)
  • cargo test -p datafusion-datasource-parquet (96 tests pass)
  • cargo test -p datafusion-physical-optimizer (27 tests pass)
  • cargo test -p datafusion --test core_integration (919 tests pass)
  • cargo test -p datafusion (all tests, 1997+ pass)
  • SLT sort/order/topk tests pass
  • SLT window/union/joins tests pass (no regressions)
  • cargo clippy — 0 warnings
  • Benchmark runs and shows expected speedups

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 26, 2026 15:54
@zhuqi-lucas
Contributor Author

cc @alamb @adriangb — this implements the sort pushdown phase 2 from #17348. Would appreciate your review when you get a chance.

@zhuqi-lucas changed the title from "feat: sort file groups by statistics during sort pushdown" to "feat: sort file groups by statistics during sort pushdown (Sort pushdown phase 2)" on Mar 26, 2026
Contributor

Copilot AI left a comment

Pull request overview

Implements statistics-driven file group ordering as part of sort pushdown, enabling sort elimination when within-file ordering matches and files are non-overlapping, plus a best-effort stats-based reorder fallback when exact pushdown isn’t possible.

Changes:

  • Add file-group reordering by min/max statistics (and non-overlap validation) to enable SortExec elimination for exactly ordered, non-overlapping files.
  • Extend Parquet sort pushdown to return Exact when Parquet ordering metadata satisfies the requested ordering.
  • Add/adjust SLT + Rust tests and a new benchmark to validate and measure the optimization.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| datafusion/sqllogictest/test_files/sort_pushdown.slt | Adds SLT coverage for stats reorder, sort elimination, LIMIT behavior, multi-partition behavior, and inferred ordering from Parquet metadata. |
| datafusion/physical-optimizer/src/pushdown_sort.rs | Updates module docs to reflect new capabilities (Exact elimination + stats-based ordering). |
| datafusion/datasource/src/file_scan_config.rs | Implements core stats-based reordering, non-overlap validation, “exact” preservation logic, and cross-group redistribution. Adds unit tests. |
| datafusion/datasource-parquet/src/source.rs | Returns Exact when Parquet natural ordering satisfies the requested sort. |
| datafusion/core/tests/physical_optimizer/pushdown_sort.rs | Updates a prefix-match test to reflect Exact pushdown / sort elimination behavior. |
| benchmarks/src/sort_pushdown.rs | Adds a benchmark to measure sort elimination and LIMIT benefits on sorted, non-overlapping parquet files. |
| benchmarks/src/lib.rs | Registers the new sort_pushdown benchmark module. |
| benchmarks/src/bin/dfbench.rs | Exposes sort-pushdown as a new dfbench subcommand. |
| benchmarks/bench.sh | Adds bench.sh targets to run the new sort pushdown benchmarks. |


@github-actions bot added labels on Mar 26, 2026: optimizer (Optimizer rules), core (Core DataFusion crate), sqllogictest (SQL Logic Tests (.slt)), datasource (Changes to the datasource crate)
@adriangb
Contributor

Very exciting! I hope I have wifi on the plane later today so I can review.

Contributor

@adriangb left a comment

Just some comments for now. My flight ended up being a 4-hour-delay-on-the-tarmac debacle.


  use datafusion_benchmarks::{
-     cancellation, clickbench, h2o, hj, imdb, nlj, smj, sort_tpch, tpcds, tpch,
+     cancellation, clickbench, h2o, hj, imdb, nlj, smj, sort_pushdown, sort_tpch, tpcds,
Contributor

Adding sort_pushdown. To keep the PR smaller and so we can run comparison benchmarks, could you split the benchmarks out into their own PR?

Contributor Author

Thanks for the review! Good idea — will split the benchmarks into a follow-up PR to keep this one focused on the core optimization.

inner: Arc::new(new_source) as Arc<dyn FileSource>,
})

// TODO Phase 2: Add support for other optimizations:
Contributor

I do think there's one more trick we could have up our sleeves: instead of only reversing row group orders we could pass the desired sort order into the opener and have it re-sort the row groups based on stats to try to match the scan's desired ordering. This might be especially effective once we have morselized scans since we could terminate after a single row group for TopK queries.

Contributor Author

Great idea! Row-group-level statistics reordering would be a natural extension of our file-level reordering but at finer granularity. Especially powerful with morselized scans where TopK could terminate after a single row group. Will track as a follow-up.

Comment on lines +1650 to +1652
###############################################################
# Statistics-based file sorting and sort elimination tests
###############################################################
Contributor

We could also commit these alongside the benchmarks so we can then look at just the diff.

Contributor Author

Makes sense. The SLT changes to existing tests (updated EXPLAIN outputs showing reordered files) need to stay with the core PR since they validate the new behavior. I will split just the benchmark code into its own PR as suggested above.

///
/// # Sort Pushdown Architecture
///
/// ```text
Contributor

This diagram is amazing, thank you so much for the detailed docs!

Contributor Author

Thank you for taking the time to review! Really appreciate it.

/// │
/// └─► try_sort_file_groups_by_statistics()
/// (best-effort: reorder files by min/max stats)
/// └─► Inexact if reordered, Unsupported if already in order
Contributor

"Unsupported if already in order"

I didn't understand this part

Contributor Author

Good catch — let me clarify. When FileSource returns Unsupported, we fall back to try_sort_file_groups_by_statistics() which reorders files by min/max stats. But if the files are already in the correct order (any_reordered = false), we return Unsupported rather than Inexact — because we did not actually change anything. Returning Inexact would make the optimizer think we optimized the plan, but it is identical to the original. Will improve the wording in the comment.

Comment on lines +1418 to +1427
// When there are multiple groups, redistribute files using consecutive
// assignment so that each group remains non-overlapping AND groups are
// ordered relative to each other. This enables:
// - No SortExec per partition (files in each group are sorted & non-overlapping)
// - SPM cheaply merges ordered streams (O(n) merge)
// - Parallel I/O across partitions
//
// Before (bin-packing may interleave):
// Group 0: [file_01(1-10), file_03(21-30)] ← gap, interleaved with group 1
// Group 1: [file_02(11-20), file_04(31-40)]
Contributor

Are there scenarios where ending up with lopsided partitions negates the benefits?

Contributor Author

In practice the impact is minimal: file count is typically much larger than partition count, so the imbalance is at most 1 extra file (e.g. 51 vs 50 files). Even with some imbalance, parallel I/O across partitions still beats single-partition sequential reads. For LIMIT queries it matters even less since the first partition hits the limit and stops early regardless of size.

Contributor Author

After further analysis, I am considering removing the redistribution logic entirely. The three benefits listed in the comment are not actually unique to redistribution:

  1. No SortExec per partition — true regardless of redistribution, as long as files within each group are non-overlapping
  2. SPM cheaply merges ordered streams — SPM is O(n) merge whether groups are interleaved or consecutive
  3. Parallel I/O across partitions — actually better with interleaved groups, since SPM alternates pulling from both partitions, keeping both I/O streams active

The only real difference is that consecutive assignment makes each partition's file reads more sequential (fewer open/close alternations). But interleaved groups give better I/O parallelism because both partitions are actively scanning simultaneously.

Given the marginal benefit vs added complexity (new function + tests), I think we should remove redistribute_files_across_groups_by_statistics and just keep the core optimization: per-partition sort elimination via statistics-based non-overlapping detection.

What do you think?

@adriangb
Contributor

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4141601401-576-882hz 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/sort-file-groups-by-statistics (a79cbdf) to 7cbc6b4 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4141601401-577-k58hp 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux


Comparing feat/sort-file-groups-by-statistics (a79cbdf) to 7cbc6b4 (merge-base) diff using: tpcds
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4141601401-578-ndf9j 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux


Comparing feat/sort-file-groups-by-statistics (a79cbdf) to 7cbc6b4 (merge-base) diff using: tpch
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and feat_sort-file-groups-by-statistics
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃ feat_sort-file-groups-by-statistics ┃    Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 1  │ 45.25 / 45.93 ±0.93 / 47.76 ms │      45.34 / 45.75 ±0.65 / 47.05 ms │ no change │
│ QQuery 2  │ 21.14 / 21.63 ±0.62 / 22.83 ms │      21.14 / 21.55 ±0.47 / 22.45 ms │ no change │
│ QQuery 3  │ 31.68 / 31.92 ±0.13 / 32.04 ms │      31.99 / 32.54 ±0.81 / 34.09 ms │ no change │
│ QQuery 4  │ 20.32 / 21.44 ±0.96 / 22.85 ms │      20.42 / 21.08 ±0.53 / 21.99 ms │ no change │
│ QQuery 5  │ 47.55 / 49.00 ±2.12 / 53.20 ms │      50.20 / 51.12 ±0.47 / 51.54 ms │ no change │
│ QQuery 6  │ 17.10 / 17.22 ±0.14 / 17.48 ms │      17.15 / 17.34 ±0.17 / 17.63 ms │ no change │
│ QQuery 7  │ 54.14 / 55.47 ±1.34 / 57.58 ms │      53.89 / 54.61 ±0.45 / 55.21 ms │ no change │
│ QQuery 8  │ 48.09 / 48.50 ±0.25 / 48.85 ms │      48.06 / 48.59 ±0.28 / 48.85 ms │ no change │
│ QQuery 9  │ 54.15 / 55.30 ±1.17 / 56.82 ms │      53.57 / 55.15 ±2.09 / 59.28 ms │ no change │
│ QQuery 10 │ 71.19 / 72.23 ±0.91 / 73.62 ms │      71.46 / 72.22 ±0.72 / 73.52 ms │ no change │
│ QQuery 11 │ 14.01 / 14.51 ±0.83 / 16.16 ms │      13.98 / 14.23 ±0.17 / 14.42 ms │ no change │
│ QQuery 12 │ 27.75 / 28.20 ±0.70 / 29.58 ms │      27.86 / 28.87 ±1.35 / 31.54 ms │ no change │
│ QQuery 13 │ 37.80 / 38.71 ±0.58 / 39.46 ms │      38.74 / 39.39 ±0.43 / 40.03 ms │ no change │
│ QQuery 14 │ 28.58 / 28.91 ±0.33 / 29.51 ms │      28.49 / 29.07 ±0.56 / 30.14 ms │ no change │
│ QQuery 15 │ 33.29 / 34.47 ±1.06 / 36.28 ms │      33.69 / 34.42 ±0.54 / 34.97 ms │ no change │
│ QQuery 16 │ 15.73 / 16.10 ±0.31 / 16.63 ms │      15.93 / 16.10 ±0.17 / 16.31 ms │ no change │
│ QQuery 17 │ 72.73 / 73.35 ±0.48 / 73.84 ms │      72.65 / 73.29 ±0.63 / 74.15 ms │ no change │
│ QQuery 18 │ 76.51 / 77.81 ±0.95 / 79.25 ms │      77.26 / 78.34 ±0.80 / 79.11 ms │ no change │
│ QQuery 19 │ 37.26 / 37.75 ±0.46 / 38.41 ms │      37.19 / 37.70 ±0.42 / 38.20 ms │ no change │
│ QQuery 20 │ 40.35 / 40.58 ±0.20 / 40.81 ms │      39.93 / 40.83 ±0.75 / 42.02 ms │ no change │
│ QQuery 21 │ 63.20 / 64.31 ±0.90 / 65.87 ms │      63.67 / 65.55 ±1.15 / 66.76 ms │ no change │
│ QQuery 22 │ 17.84 / 18.46 ±0.63 / 19.55 ms │      17.74 / 18.09 ±0.27 / 18.41 ms │ no change │
└───────────┴────────────────────────────────┴─────────────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                                  ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                                  │ 891.79ms │
│ Total Time (feat_sort-file-groups-by-statistics)   │ 895.87ms │
│ Average Time (HEAD)                                │  40.54ms │
│ Average Time (feat_sort-file-groups-by-statistics) │  40.72ms │
│ Queries Faster                                     │        0 │
│ Queries Slower                                     │        0 │
│ Queries with No Change                             │       22 │
│ Queries with Failure                               │        0 │
└────────────────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 4.7s |
| Peak memory | 4.0 GiB |
| Avg memory | 3.6 GiB |
| CPU user | 33.1s |
| CPU sys | 2.9s |
| Disk read | 0 B |
| Disk write | 136.0 KiB |

tpch — branch

| Metric | Value |
| --- | --- |
| Wall time | 4.7s |
| Peak memory | 4.1 GiB |
| Avg memory | 3.6 GiB |
| CPU user | 33.2s |
| CPU sys | 3.0s |
| Disk read | 0 B |
| Disk write | 72.0 KiB |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing HEAD and feat_sort-file-groups-by-statistics
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃      feat_sort-file-groups-by-statistics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           44.02 / 44.88 ±0.68 / 46.00 ms │           43.49 / 44.07 ±0.80 / 45.60 ms │     no change │
│ QQuery 2  │        145.00 / 146.83 ±1.64 / 149.04 ms │        146.23 / 147.31 ±0.77 / 148.21 ms │     no change │
│ QQuery 3  │        114.94 / 116.37 ±1.06 / 117.51 ms │        114.99 / 115.68 ±0.65 / 116.85 ms │     no change │
│ QQuery 4  │    1341.58 / 1361.40 ±15.96 / 1387.53 ms │    1315.35 / 1341.73 ±22.34 / 1379.48 ms │     no change │
│ QQuery 5  │        172.28 / 173.70 ±0.86 / 174.63 ms │        174.38 / 176.65 ±1.49 / 178.75 ms │     no change │
│ QQuery 6  │    1021.69 / 1050.40 ±26.62 / 1094.54 ms │    1028.33 / 1065.22 ±22.58 / 1098.29 ms │     no change │
│ QQuery 7  │        357.39 / 362.84 ±4.75 / 370.20 ms │        355.20 / 357.14 ±1.53 / 358.45 ms │     no change │
│ QQuery 8  │        117.02 / 118.32 ±0.84 / 119.63 ms │        116.28 / 117.48 ±0.93 / 118.70 ms │     no change │
│ QQuery 9  │        101.91 / 106.97 ±6.16 / 118.31 ms │       102.49 / 110.09 ±10.34 / 130.28 ms │     no change │
│ QQuery 10 │        108.44 / 110.12 ±0.86 / 110.79 ms │        109.28 / 111.17 ±1.23 / 113.04 ms │     no change │
│ QQuery 11 │      970.60 / 984.53 ±10.65 / 1002.55 ms │       900.92 / 916.07 ±12.11 / 932.59 ms │ +1.07x faster │
│ QQuery 12 │           46.33 / 47.74 ±1.27 / 49.62 ms │           45.74 / 46.11 ±0.26 / 46.46 ms │     no change │
│ QQuery 13 │        399.74 / 406.08 ±4.56 / 412.16 ms │        401.23 / 405.03 ±2.24 / 407.75 ms │     no change │
│ QQuery 14 │     1025.97 / 1032.56 ±3.83 / 1036.90 ms │     1024.30 / 1036.16 ±7.79 / 1047.17 ms │     no change │
│ QQuery 15 │           15.96 / 16.83 ±0.80 / 18.33 ms │           15.61 / 16.88 ±1.26 / 19.29 ms │     no change │
│ QQuery 16 │           41.68 / 42.37 ±0.62 / 43.47 ms │           41.34 / 42.60 ±1.36 / 45.23 ms │     no change │
│ QQuery 17 │        241.81 / 243.68 ±1.16 / 245.16 ms │        242.54 / 243.88 ±1.33 / 245.61 ms │     no change │
│ QQuery 18 │        129.12 / 131.11 ±1.42 / 133.16 ms │        129.08 / 130.17 ±0.59 / 130.74 ms │     no change │
│ QQuery 19 │        155.65 / 158.64 ±1.70 / 160.63 ms │        158.98 / 160.08 ±1.04 / 161.68 ms │     no change │
│ QQuery 20 │           14.09 / 14.64 ±0.45 / 15.32 ms │           14.15 / 14.81 ±0.69 / 16.12 ms │     no change │
│ QQuery 21 │           19.80 / 20.29 ±0.48 / 21.16 ms │           20.23 / 20.88 ±0.77 / 22.29 ms │     no change │
│ QQuery 22 │        490.50 / 495.75 ±4.75 / 504.62 ms │        481.56 / 486.06 ±4.01 / 492.21 ms │     no change │
│ QQuery 23 │       899.07 / 911.20 ±10.71 / 924.34 ms │        894.57 / 898.55 ±3.01 / 903.73 ms │     no change │
│ QQuery 24 │        420.82 / 423.90 ±2.41 / 427.52 ms │        414.21 / 418.97 ±3.22 / 423.87 ms │     no change │
│ QQuery 25 │        359.15 / 361.11 ±1.47 / 362.88 ms │        355.09 / 357.18 ±2.17 / 360.04 ms │     no change │
│ QQuery 26 │           80.92 / 83.54 ±2.22 / 87.26 ms │           82.96 / 84.21 ±1.30 / 85.93 ms │     no change │
│ QQuery 27 │        350.83 / 353.79 ±2.57 / 357.70 ms │        350.86 / 352.55 ±1.36 / 354.47 ms │     no change │
│ QQuery 28 │        149.60 / 150.42 ±0.69 / 151.68 ms │        149.03 / 151.68 ±1.97 / 154.36 ms │     no change │
│ QQuery 29 │        301.58 / 303.32 ±1.71 / 306.48 ms │        299.18 / 302.15 ±2.38 / 305.11 ms │     no change │
│ QQuery 30 │           43.14 / 45.33 ±1.67 / 47.60 ms │           43.70 / 45.48 ±1.45 / 47.86 ms │     no change │
│ QQuery 31 │        170.73 / 174.14 ±2.65 / 178.49 ms │        173.92 / 175.20 ±0.72 / 176.12 ms │     no change │
│ QQuery 32 │           58.51 / 58.80 ±0.31 / 59.36 ms │           57.73 / 58.09 ±0.31 / 58.48 ms │     no change │
│ QQuery 33 │        141.23 / 143.57 ±1.65 / 145.76 ms │        142.02 / 143.39 ±1.32 / 145.86 ms │     no change │
│ QQuery 34 │        105.68 / 109.16 ±2.34 / 112.98 ms │        107.45 / 108.37 ±0.67 / 109.51 ms │     no change │
│ QQuery 35 │        108.31 / 110.76 ±1.60 / 112.99 ms │        104.69 / 109.91 ±3.35 / 113.80 ms │     no change │
│ QQuery 36 │        219.25 / 224.16 ±3.52 / 228.59 ms │        211.95 / 218.93 ±4.59 / 225.52 ms │     no change │
│ QQuery 37 │        178.14 / 181.78 ±3.07 / 186.48 ms │        179.23 / 181.10 ±1.12 / 182.55 ms │     no change │
│ QQuery 38 │           86.93 / 89.92 ±2.76 / 94.83 ms │           84.63 / 89.30 ±3.28 / 92.83 ms │     no change │
│ QQuery 39 │        128.96 / 130.99 ±1.08 / 132.20 ms │        127.22 / 129.67 ±1.26 / 130.67 ms │     no change │
│ QQuery 40 │        114.88 / 119.88 ±6.93 / 133.55 ms │        113.90 / 118.59 ±5.25 / 128.03 ms │     no change │
│ QQuery 41 │           14.63 / 15.50 ±0.82 / 16.68 ms │           14.39 / 15.63 ±0.90 / 16.68 ms │     no change │
│ QQuery 42 │        108.95 / 111.09 ±1.78 / 114.13 ms │        108.38 / 111.04 ±1.87 / 113.63 ms │     no change │
│ QQuery 43 │           84.89 / 85.89 ±0.90 / 87.38 ms │           84.87 / 85.12 ±0.22 / 85.47 ms │     no change │
│ QQuery 44 │           11.93 / 12.43 ±0.57 / 13.49 ms │           11.58 / 12.36 ±1.01 / 14.34 ms │     no change │
│ QQuery 45 │           52.88 / 54.82 ±1.11 / 56.03 ms │           51.43 / 53.29 ±1.23 / 55.07 ms │     no change │
│ QQuery 46 │        231.56 / 235.66 ±3.68 / 240.49 ms │        231.04 / 235.07 ±2.54 / 238.09 ms │     no change │
│ QQuery 47 │        699.32 / 708.94 ±9.38 / 724.80 ms │        682.58 / 696.11 ±7.01 / 702.30 ms │     no change │
│ QQuery 48 │        286.29 / 288.94 ±1.49 / 290.59 ms │        285.58 / 291.70 ±3.86 / 296.60 ms │     no change │
│ QQuery 49 │        256.45 / 259.33 ±2.88 / 264.42 ms │        254.71 / 259.25 ±2.56 / 261.77 ms │     no change │
│ QQuery 50 │        231.40 / 237.33 ±4.06 / 243.71 ms │        228.61 / 235.93 ±4.79 / 240.64 ms │     no change │
│ QQuery 51 │        180.77 / 185.21 ±2.98 / 189.43 ms │        178.87 / 183.95 ±3.23 / 187.53 ms │     no change │
│ QQuery 52 │        109.34 / 111.06 ±1.13 / 112.85 ms │        108.81 / 111.30 ±2.29 / 115.35 ms │     no change │
│ QQuery 53 │        103.70 / 105.12 ±0.93 / 106.56 ms │        104.22 / 105.23 ±0.92 / 106.58 ms │     no change │
│ QQuery 54 │        149.29 / 150.69 ±0.91 / 151.98 ms │        148.47 / 150.54 ±1.79 / 153.22 ms │     no change │
│ QQuery 55 │        108.39 / 109.13 ±0.91 / 110.53 ms │        109.10 / 110.29 ±1.72 / 113.65 ms │     no change │
│ QQuery 56 │        142.65 / 143.71 ±0.78 / 144.94 ms │        143.57 / 145.04 ±0.78 / 145.87 ms │     no change │
│ QQuery 57 │        173.64 / 177.69 ±2.53 / 181.29 ms │        172.27 / 174.91 ±1.49 / 176.79 ms │     no change │
│ QQuery 58 │        302.41 / 305.31 ±2.35 / 308.98 ms │        307.18 / 312.25 ±5.09 / 321.73 ms │     no change │
│ QQuery 59 │        198.81 / 202.17 ±3.11 / 207.62 ms │        197.42 / 201.13 ±2.44 / 204.45 ms │     no change │
│ QQuery 60 │        144.61 / 146.75 ±1.67 / 148.47 ms │        147.19 / 150.76 ±5.75 / 162.23 ms │     no change │
│ QQuery 61 │        173.46 / 174.20 ±0.82 / 175.25 ms │        170.42 / 173.07 ±1.55 / 175.07 ms │     no change │
│ QQuery 62 │      917.35 / 976.62 ±38.37 / 1018.85 ms │       914.54 / 934.46 ±14.57 / 948.84 ms │     no change │
│ QQuery 63 │        106.24 / 107.94 ±1.73 / 110.22 ms │        106.00 / 108.74 ±2.15 / 111.80 ms │     no change │
│ QQuery 64 │        707.02 / 713.96 ±4.26 / 719.13 ms │        698.01 / 705.85 ±4.08 / 709.96 ms │     no change │
│ QQuery 65 │        250.04 / 257.16 ±3.92 / 260.70 ms │        250.29 / 256.38 ±3.34 / 259.06 ms │     no change │
│ QQuery 66 │        245.16 / 251.25 ±6.23 / 261.55 ms │       243.65 / 257.30 ±14.72 / 283.65 ms │     no change │
│ QQuery 67 │        313.26 / 319.89 ±6.19 / 331.07 ms │        311.28 / 315.51 ±3.42 / 321.14 ms │     no change │
│ QQuery 68 │        279.74 / 284.48 ±2.74 / 288.19 ms │        279.59 / 283.35 ±3.33 / 289.44 ms │     no change │
│ QQuery 69 │        102.53 / 104.40 ±1.03 / 105.51 ms │        104.22 / 105.23 ±1.24 / 107.61 ms │     no change │
│ QQuery 70 │        348.61 / 352.06 ±5.01 / 361.75 ms │       350.96 / 366.09 ±15.00 / 392.96 ms │     no change │
│ QQuery 71 │        136.19 / 138.97 ±2.18 / 141.54 ms │        135.78 / 138.94 ±2.01 / 141.80 ms │     no change │
│ QQuery 72 │        710.85 / 718.42 ±7.67 / 732.93 ms │        710.49 / 716.02 ±4.11 / 721.62 ms │     no change │
│ QQuery 73 │        102.72 / 105.57 ±1.76 / 107.55 ms │        103.63 / 104.99 ±0.98 / 106.64 ms │     no change │
│ QQuery 74 │        549.96 / 556.53 ±4.53 / 561.74 ms │        555.32 / 563.21 ±8.57 / 576.93 ms │     no change │
│ QQuery 75 │        276.31 / 278.53 ±1.95 / 281.63 ms │        275.59 / 280.25 ±2.79 / 283.29 ms │     no change │
│ QQuery 76 │        133.97 / 135.19 ±1.45 / 138.01 ms │        134.75 / 135.34 ±0.67 / 136.56 ms │     no change │
│ QQuery 77 │        185.93 / 189.10 ±2.00 / 191.68 ms │        191.16 / 191.75 ±0.38 / 192.05 ms │     no change │
│ QQuery 78 │        351.01 / 352.90 ±2.18 / 356.25 ms │        354.79 / 358.47 ±2.57 / 361.92 ms │     no change │
│ QQuery 79 │        230.33 / 234.65 ±4.26 / 240.00 ms │        234.45 / 237.20 ±3.03 / 242.81 ms │     no change │
│ QQuery 80 │        330.32 / 333.77 ±2.04 / 335.85 ms │        331.71 / 335.27 ±1.99 / 337.79 ms │     no change │
│ QQuery 81 │           26.55 / 27.10 ±0.49 / 27.84 ms │           26.42 / 27.98 ±1.49 / 30.13 ms │     no change │
│ QQuery 82 │        199.18 / 201.17 ±1.34 / 202.82 ms │        201.61 / 205.28 ±2.70 / 208.85 ms │     no change │
│ QQuery 83 │           38.69 / 40.41 ±1.58 / 42.97 ms │           40.01 / 41.13 ±1.09 / 43.20 ms │     no change │
│ QQuery 84 │           49.51 / 50.17 ±0.45 / 50.71 ms │           48.07 / 50.51 ±1.98 / 54.08 ms │     no change │
│ QQuery 85 │        147.71 / 150.31 ±2.00 / 153.14 ms │        150.11 / 151.04 ±1.39 / 153.77 ms │     no change │
│ QQuery 86 │           38.61 / 39.85 ±1.16 / 41.74 ms │           38.90 / 40.01 ±1.10 / 41.94 ms │     no change │
│ QQuery 87 │           85.62 / 89.43 ±3.72 / 96.14 ms │           88.62 / 91.05 ±1.81 / 93.18 ms │     no change │
│ QQuery 88 │        100.49 / 101.33 ±0.55 / 102.04 ms │        102.18 / 103.60 ±1.25 / 105.88 ms │     no change │
│ QQuery 89 │        118.53 / 119.33 ±0.57 / 120.20 ms │        117.01 / 119.80 ±1.72 / 122.41 ms │     no change │
│ QQuery 90 │           23.82 / 24.92 ±1.12 / 26.83 ms │           23.91 / 24.93 ±1.14 / 27.08 ms │     no change │
│ QQuery 91 │           62.58 / 64.14 ±1.01 / 65.41 ms │           61.43 / 64.97 ±1.95 / 67.12 ms │     no change │
│ QQuery 92 │           56.88 / 58.18 ±1.28 / 60.06 ms │           56.96 / 58.78 ±1.42 / 61.32 ms │     no change │
│ QQuery 93 │        191.67 / 193.63 ±1.46 / 195.60 ms │        191.97 / 195.01 ±1.75 / 197.34 ms │     no change │
│ QQuery 94 │           61.48 / 62.39 ±0.66 / 63.23 ms │           62.26 / 63.94 ±1.19 / 65.26 ms │     no change │
│ QQuery 95 │        135.24 / 136.01 ±0.58 / 136.80 ms │        136.14 / 138.48 ±1.50 / 140.47 ms │     no change │
│ QQuery 96 │           72.21 / 74.70 ±2.03 / 78.25 ms │           75.20 / 76.70 ±0.93 / 78.01 ms │     no change │
│ QQuery 97 │        127.84 / 131.72 ±2.24 / 134.68 ms │        130.92 / 133.13 ±1.53 / 135.40 ms │     no change │
│ QQuery 98 │        152.50 / 154.94 ±1.90 / 157.73 ms │        153.70 / 156.80 ±1.90 / 158.83 ms │     no change │
│ QQuery 99 │ 10732.85 / 10766.55 ±18.72 / 10789.45 ms │ 10874.79 / 10909.05 ±18.77 / 10930.81 ms │     no change │
└───────────┴──────────────────────────────────────────┴──────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                  │ 33880.49ms │
│ Total Time (feat_sort-file-groups-by-statistics)   │ 33909.12ms │
│ Average Time (HEAD)                                │   342.23ms │
│ Average Time (feat_sort-file-groups-by-statistics) │   342.52ms │
│ Queries Faster                                     │          1 │
│ Queries Slower                                     │          0 │
│ Queries with No Change                             │         98 │
│ Queries with Failure                               │          0 │
└────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric       Value
Wall time    169.7s
Peak memory  5.7 GiB
Avg memory   4.6 GiB
CPU user     272.0s
CPU sys      18.9s
Disk read    0 B
Disk write   707.3 MiB

tpcds — branch

Metric       Value
Wall time    169.8s
Peak memory  5.5 GiB
Avg memory   4.4 GiB
CPU user     271.0s
CPU sys      19.1s
Disk read    0 B
Disk write   776.0 KiB


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and feat_sort-file-groups-by-statistics
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃   feat_sort-file-groups-by-statistics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.35 / 4.67 ±6.49 / 17.65 ms │          1.31 / 4.54 ±6.32 / 17.18 ms │     no change │
│ QQuery 1  │        14.38 / 14.78 ±0.21 / 14.96 ms │        14.24 / 14.64 ±0.20 / 14.78 ms │     no change │
│ QQuery 2  │        43.97 / 44.30 ±0.19 / 44.49 ms │        44.89 / 45.24 ±0.26 / 45.58 ms │     no change │
│ QQuery 3  │        43.63 / 44.81 ±0.99 / 46.48 ms │        43.91 / 44.41 ±0.62 / 45.62 ms │     no change │
│ QQuery 4  │     294.40 / 298.97 ±3.74 / 305.25 ms │     290.63 / 298.92 ±7.74 / 313.18 ms │     no change │
│ QQuery 5  │     352.42 / 357.91 ±5.25 / 366.00 ms │     344.57 / 348.65 ±2.69 / 351.92 ms │     no change │
│ QQuery 6  │           5.53 / 7.01 ±1.39 / 9.24 ms │           4.88 / 6.43 ±1.10 / 8.20 ms │ +1.09x faster │
│ QQuery 7  │        17.04 / 17.20 ±0.15 / 17.47 ms │        16.83 / 17.78 ±1.07 / 19.72 ms │     no change │
│ QQuery 8  │     425.90 / 436.12 ±6.63 / 443.24 ms │     422.01 / 432.82 ±6.91 / 442.22 ms │     no change │
│ QQuery 9  │    663.35 / 680.08 ±15.03 / 705.76 ms │     652.38 / 656.82 ±5.48 / 667.28 ms │     no change │
│ QQuery 10 │        92.26 / 96.14 ±2.91 / 99.68 ms │       91.60 / 95.74 ±3.62 / 101.46 ms │     no change │
│ QQuery 11 │     104.19 / 107.29 ±2.15 / 110.94 ms │     104.41 / 105.28 ±0.95 / 106.89 ms │     no change │
│ QQuery 12 │     351.75 / 356.30 ±3.53 / 359.94 ms │     345.87 / 353.56 ±5.43 / 361.43 ms │     no change │
│ QQuery 13 │    486.97 / 512.17 ±27.14 / 555.93 ms │    461.81 / 481.11 ±12.80 / 499.00 ms │ +1.06x faster │
│ QQuery 14 │     352.77 / 358.48 ±5.57 / 367.99 ms │     353.41 / 361.30 ±5.06 / 367.41 ms │     no change │
│ QQuery 15 │    370.93 / 408.33 ±20.31 / 428.45 ms │    364.80 / 381.27 ±12.26 / 396.42 ms │ +1.07x faster │
│ QQuery 16 │    732.23 / 768.55 ±40.37 / 832.55 ms │    729.00 / 751.30 ±14.35 / 768.89 ms │     no change │
│ QQuery 17 │    718.26 / 727.93 ±11.23 / 749.89 ms │     718.67 / 725.62 ±5.29 / 732.67 ms │     no change │
│ QQuery 18 │ 1380.34 / 1431.09 ±51.53 / 1521.24 ms │ 1472.19 / 1496.11 ±28.04 / 1549.89 ms │     no change │
│ QQuery 19 │        37.27 / 38.26 ±1.18 / 39.92 ms │        36.69 / 38.30 ±0.97 / 39.72 ms │     no change │
│ QQuery 20 │    701.66 / 718.37 ±17.71 / 746.84 ms │    715.84 / 726.31 ±12.61 / 749.81 ms │     no change │
│ QQuery 21 │     751.75 / 754.26 ±2.95 / 759.78 ms │     760.74 / 762.76 ±1.61 / 765.19 ms │     no change │
│ QQuery 22 │  1121.24 / 1131.92 ±9.38 / 1147.90 ms │  1132.79 / 1137.91 ±3.11 / 1141.01 ms │     no change │
│ QQuery 23 │ 3088.40 / 3105.46 ±12.82 / 3121.82 ms │  3088.31 / 3098.41 ±7.23 / 3110.47 ms │     no change │
│ QQuery 24 │     100.47 / 103.83 ±2.04 / 105.88 ms │      96.57 / 100.78 ±2.64 / 104.15 ms │     no change │
│ QQuery 25 │     140.42 / 141.71 ±1.07 / 142.86 ms │     136.43 / 138.60 ±1.87 / 142.01 ms │     no change │
│ QQuery 26 │      98.76 / 102.77 ±3.33 / 107.40 ms │     100.86 / 103.39 ±2.21 / 105.89 ms │     no change │
│ QQuery 27 │    841.69 / 854.95 ±13.92 / 879.72 ms │     848.63 / 855.26 ±5.22 / 861.60 ms │     no change │
│ QQuery 28 │ 7723.29 / 7755.32 ±24.29 / 7795.51 ms │ 7718.14 / 7794.40 ±40.10 / 7832.70 ms │     no change │
│ QQuery 29 │        50.45 / 54.30 ±3.51 / 60.90 ms │        50.47 / 56.85 ±7.54 / 71.51 ms │     no change │
│ QQuery 30 │     357.85 / 363.89 ±4.15 / 370.30 ms │     364.53 / 371.59 ±3.85 / 374.57 ms │     no change │
│ QQuery 31 │    375.85 / 390.40 ±13.93 / 416.33 ms │     370.93 / 382.94 ±7.22 / 392.45 ms │     no change │
│ QQuery 32 │ 1031.92 / 1059.40 ±28.32 / 1111.91 ms │ 1049.04 / 1079.45 ±30.37 / 1128.60 ms │     no change │
│ QQuery 33 │  1456.64 / 1463.45 ±5.34 / 1471.80 ms │ 1489.84 / 1510.22 ±20.02 / 1547.16 ms │     no change │
│ QQuery 34 │ 1457.74 / 1479.70 ±14.62 / 1497.78 ms │  1496.16 / 1505.29 ±7.77 / 1519.55 ms │     no change │
│ QQuery 35 │     384.08 / 392.72 ±8.10 / 405.96 ms │     400.53 / 409.65 ±6.87 / 417.46 ms │     no change │
│ QQuery 36 │     111.95 / 120.51 ±4.57 / 124.52 ms │     116.30 / 123.86 ±4.86 / 130.86 ms │     no change │
│ QQuery 37 │        49.42 / 50.86 ±1.41 / 53.44 ms │        48.54 / 51.28 ±1.92 / 54.06 ms │     no change │
│ QQuery 38 │        75.44 / 77.73 ±1.87 / 79.67 ms │        77.94 / 79.70 ±1.82 / 82.62 ms │     no change │
│ QQuery 39 │     208.41 / 218.77 ±8.09 / 230.52 ms │     216.98 / 226.86 ±6.30 / 236.52 ms │     no change │
│ QQuery 40 │        23.66 / 26.32 ±2.19 / 29.99 ms │        25.32 / 27.64 ±1.52 / 29.93 ms │  1.05x slower │
│ QQuery 41 │        19.78 / 21.12 ±0.91 / 22.40 ms │        19.89 / 21.84 ±1.11 / 23.34 ms │     no change │
│ QQuery 42 │        19.30 / 20.68 ±1.38 / 23.28 ms │        20.40 / 21.06 ±0.71 / 22.27 ms │     no change │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                  │ 27118.84ms │
│ Total Time (feat_sort-file-groups-by-statistics)   │ 27245.93ms │
│ Average Time (HEAD)                                │   630.67ms │
│ Average Time (feat_sort-file-groups-by-statistics) │   633.63ms │
│ Queries Faster                                     │          3 │
│ Queries Slower                                     │          1 │
│ Queries with No Change                             │         39 │
│ Queries with Failure                               │          0 │
└────────────────────────────────────────────────────┴────────────┘
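The Change column in these tables looks consistent with a ratio of the mean times and a noise band of roughly 5%: Query 6 goes from 7.01 ms to 6.43 ms (7.01 / 6.43 ≈ 1.09, reported as faster), while Query 18 goes from 1431.09 ms to 1496.11 ms (≈ 1.045, reported as no change). A minimal sketch of that rule follows; the function name `classify` and the 5% threshold are assumptions inferred from the rows above, not taken from the benchmark runner.

```rust
/// Hypothetical helper: label a change in mean time the way the tables above appear to.
/// `threshold` is an assumed noise band (0.05 = 5%), inferred from the rows, not from the runner.
fn classify(base_ms: f64, branch_ms: f64, threshold: f64) -> String {
    if branch_ms < base_ms {
        // Branch is quicker: report how many times faster it is than the base.
        let ratio = base_ms / branch_ms;
        if ratio > 1.0 + threshold {
            return format!("+{:.2}x faster", ratio);
        }
    } else {
        // Branch is slower (or equal): report how many times slower it is.
        let ratio = branch_ms / base_ms;
        if ratio > 1.0 + threshold {
            return format!("{:.2}x slower", ratio);
        }
    }
    "no change".to_string()
}

fn main() {
    // Mean times (ms) copied from the clickbench_partitioned table above.
    println!("Query 6:  {}", classify(7.01, 6.43, 0.05));
    println!("Query 18: {}", classify(1431.09, 1496.11, 0.05));
    println!("Query 40: {}", classify(26.32, 27.64, 0.05));
}
```

Under this reading, the single "slower" query (Query 40) sits right at the edge of the assumed band, with ±2.19 ms and ±1.52 ms spreads on a ~27 ms mean, so it is plausibly noise rather than a regression.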

Resource Usage

clickbench_partitioned — base (merge-base)

Metric       Value
Wall time    136.8s
Peak memory  41.3 GiB
Avg memory   33.4 GiB
CPU user     1286.4s
CPU sys      91.0s
Disk read    0 B
Disk write   3.3 GiB

clickbench_partitioned — branch

Metric       Value
Wall time    137.5s
Peak memory  45.1 GiB
Avg memory   32.3 GiB
CPU user     1289.5s
CPU sys      93.0s
Disk read    0 B
Disk write   748.0 KiB


@zhuqi-lucas zhuqi-lucas force-pushed the feat/sort-file-groups-by-statistics branch from a79cbdf to 5e3eaac Compare March 27, 2026 15:33
@zhuqi-lucas
Contributor Author

zhuqi-lucas commented Mar 27, 2026

Thanks for the review @adriangb! I've addressed all feedback:

  • Benchmark split: Removed benchmark code from this PR, will submit as a follow-up PR after this PR done
  • "Unsupported if already in order": Clarified the comment. It now explains that Unsupported means no change was made to the plan (the files were already in the correct order, so there was nothing to optimize)
  • Row-group-level stats reordering: Great idea for a follow-up. It is the same concept as file-level reordering but at a finer granularity, and should be especially powerful with morselized scans
  • Lopsided partitions: Minimal impact in practice. The file count is usually much larger than the partition count, and the imbalance is at most 1 extra file per partition
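The "at most 1 extra file" claim can be illustrated with a small sketch. It assumes the files are already sorted by their statistics and are dealt out to partitions as consecutive chunks. The function name `redistribute` is hypothetical and this is not the PR's actual code.

```rust
/// Hypothetical sketch: split an already-sorted list of files into `n` consecutive
/// chunks whose sizes differ by at most one. Not the PR's implementation.
fn redistribute<T: Clone>(files: &[T], n: usize) -> Vec<Vec<T>> {
    assert!(n > 0, "need at least one partition");
    let base = files.len() / n; // every group gets at least this many files
    let extra = files.len() % n; // the first `extra` groups get one more
    let mut groups = Vec::with_capacity(n);
    let mut start = 0;
    for i in 0..n {
        let len = base + usize::from(i < extra);
        groups.push(files[start..start + len].to_vec());
        start += len;
    }
    groups
}

fn main() {
    // Ten files (identified by their position in sorted order) across four partitions.
    let files: Vec<u32> = (1..=10).collect();
    let groups = redistribute(&files, 4);
    for (i, g) in groups.iter().enumerate() {
        println!("partition {i}: {g:?}");
    }
}
```

Because each chunk is a consecutive run of the sorted list, every partition keeps its files in order, and the largest and smallest partitions differ by one file at most.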

@adriangb
Contributor

Benchmark split: Removed benchmark code from this PR, will submit as a follow-up PR after this PR done

I'd like to do the opposite: commit the benchmarks and SLT tests as a precursor PR, then rebase this branch so we can see just the diff in benchmarks / SLT.

@zhuqi-lucas
Contributor Author

Benchmark split: Removed benchmark code from this PR, will submit as a follow-up PR after this PR done

I'd like to do the opposite: commit the benchmarks and SLT tests as a precursor PR, then rebase this branch so we can see just the diff in benchmarks / SLT.

Good point, I agree.

zhuqi-lucas and others added 6 commits March 27, 2026 23:53
Sort files within each file group by min/max statistics during sort
pushdown to better align with the requested ordering. When files are
non-overlapping and within-file ordering is guaranteed (e.g. Parquet
with sorting_columns metadata), the SortExec is completely eliminated.

Key changes:
- ParquetSource::try_pushdown_sort returns Exact when natural ordering
  satisfies the request, enabling sort elimination
- FileScanConfig sorts files within groups by statistics and verifies
  non-overlapping property to determine Exact vs Inexact
- Multi-group files are redistributed consecutively to preserve both
  sort elimination and I/O parallelism across partitions
- Statistics-based file reordering as fallback when FileSource returns
  Unsupported (benefits TopK via better dynamic filter pruning)
- New sort_pushdown benchmark for measuring sort elimination speedup
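The sort-then-verify step in the list above can be sketched as follows. This is a simplified illustration for a single ascending integer sort column with no nulls. The `FileStats` and `Pushdown` types and the `sort_and_classify` function are hypothetical stand-ins, not DataFusion types. Treating touching ranges (`max == min`) as non-overlapping is also an assumption that only holds for a single-column, non-strict ordering.

```rust
/// Hypothetical stand-in for one file's min/max statistics on the sort column.
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    name: &'static str,
    min: i64,
    max: i64,
}

/// Hypothetical mirror of the Exact / Inexact outcome described in the commit message.
#[derive(Debug, PartialEq)]
enum Pushdown {
    Exact,
    Inexact,
}

/// Sort files by their minimum value, then check that each file ends
/// before (or exactly where) the next one begins.
fn sort_and_classify(files: &mut [FileStats]) -> Pushdown {
    files.sort_by_key(|f| (f.min, f.max));
    let non_overlapping = files.windows(2).all(|w| w[0].max <= w[1].min);
    if non_overlapping {
        Pushdown::Exact // reading the files in this order yields sorted output
    } else {
        Pushdown::Inexact // order is only approximate; a sort is still required
    }
}

fn main() {
    let mut disjoint = vec![
        FileStats { name: "c", min: 20, max: 29 },
        FileStats { name: "a", min: 0, max: 9 },
        FileStats { name: "b", min: 10, max: 19 },
    ];
    println!("{:?}", sort_and_classify(&mut disjoint));

    let mut overlapping = vec![
        FileStats { name: "y", min: 5, max: 15 },
        FileStats { name: "x", min: 0, max: 9 },
    ];
    println!("{:?}", sort_and_classify(&mut overlapping));
}
```

Even in the Inexact case the reordering is still useful according to the commit message, since a TopK operator sees the most promising files first.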

Closes apache#17348

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… improve docs

- Remove dead stats computation in reverse_file_groups branch
  (reverse path is always Inexact, so all_non_overlapping is unused)
- Add reverse prefix matching documentation to pushdown_sort module

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ring

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The existing doc comment explains that multi-file partitions break
output ordering. Add a note about the exception: when sort pushdown
verifies files are non-overlapping via statistics, output_ordering
is preserved and SortExec can be eliminated.
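To see why multi-file partitions normally break output ordering, and why the non-overlapping case is an exception, consider a partition that reads its files back to back. The sketch below uses plain integer rows and hypothetical helper names (`scan_partition`, `is_sorted_asc`). It illustrates the idea only and does not use DataFusion APIs.

```rust
/// Returns true if `rows` is in non-decreasing order.
fn is_sorted_asc(rows: &[i64]) -> bool {
    rows.windows(2).all(|w| w[0] <= w[1])
}

/// Hypothetical model of a partition scan: emit each file's rows in turn.
fn scan_partition(files: &[Vec<i64>]) -> Vec<i64> {
    files.iter().flatten().copied().collect()
}

fn main() {
    // Each file is sorted internally, but their value ranges overlap.
    let overlapping = vec![vec![1, 5, 9], vec![3, 4, 12]];
    // Each file is sorted internally and the ranges do not overlap.
    let disjoint = vec![vec![1, 5, 9], vec![10, 11, 12]];

    println!("overlapping sorted: {}", is_sorted_asc(&scan_partition(&overlapping)));
    println!("disjoint sorted:    {}", is_sorted_asc(&scan_partition(&disjoint)));
}
```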

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… comment

- Remove benchmark code (sort_pushdown.rs, bench.sh, dfbench.rs changes)
  to be submitted as a separate follow-up PR per reviewer request
- Clarify "Unsupported if already in order" in architecture diagram:
  explain that Unsupported means no change was made to the plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zhuqi-lucas zhuqi-lucas force-pushed the feat/sort-file-groups-by-statistics branch from 5e3eaac to f9de9be Compare March 27, 2026 15:54

Labels

core Core DataFusion crate datasource Changes to the datasource crate optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)


4 participants