shuf: Use rustc-hash for performance#10648

Merged
sylvestre merged 1 commit into uutils:main from oech3:shufxxh3
Feb 2, 2026

@oech3
Contributor

@oech3 oech3 commented Feb 2, 2026

No description provided.

@oech3 oech3 force-pushed the shufxxh3 branch 2 times, most recently from 3b92963 to bc3c780, February 2, 2026 06:40
@codspeed-hq

codspeed-hq bot commented Feb 2, 2026

CodSpeed Performance Report

Merging this PR will improve performance by 31.56%

Comparing oech3:shufxxh3 (f856a46) with main (61da637)

Summary

⚡ 1 improved benchmark
✅ 141 untouched benchmarks
⏩ 180 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | shuf_input_range[1000000] | 126.4 ms | 96.1 ms | +31.56% |

Footnotes

  1. 180 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@oech3 oech3 force-pushed the shufxxh3 branch 3 times, most recently from 672c409 to eee274c, February 2, 2026 06:55
@github-actions

github-actions bot commented Feb 2, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)

@oech3 oech3 marked this pull request as ready for review February 2, 2026 07:31
@oech3 oech3 changed the title from "shuf: Use xxh3 for performanice" to "shuf: Use rustc-hash for performanice" Feb 2, 2026

@sylvestre
Contributor

Please fix the typo in the commit message: it should be 'performance'.

@sylvestre sylvestre changed the title from "shuf: Use rustc-hash for performanice" to "shuf: Use rustc-hash for performance" Feb 2, 2026
@oech3
Contributor Author

oech3 commented Feb 2, 2026

Fixed. Sorry...

@sylvestre sylvestre merged commit daf8fb5 into uutils:main Feb 2, 2026
130 checks passed
@oech3 oech3 deleted the shufxxh3 branch February 2, 2026 12:47
@ChrisDryden
Collaborator

IIRC the shuf/shuf-reservoir GNU test is failing because the implementation was too slow for the test and would sometimes time out under high CPU load. It would be interesting to see whether this removes the flakiness for that test.

@oech3
Contributor Author

oech3 commented Feb 2, 2026

That test does not run shuf -i 1-HUGE_NUM.

@oech3
Contributor Author

oech3 commented Feb 2, 2026

Do you think we should change the hash algorithm for (t)sort to xxh3 or something else?

@xtqqczze
Contributor

xtqqczze commented Feb 2, 2026

rustc-hash is a "non-cryptographic hashing algorithm". Is resistance against collisions a concern for shuf?
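For context on what swapping the hasher means, here is a minimal std-only sketch of plugging a fast, non-cryptographic hasher into `HashMap`. The names `SketchHasher`, `SketchMap`, and `count_occurrences` are hypothetical, and the multiply-rotate mixing is only in the spirit of Fx-style hashers: it is not the rustc-hash algorithm and not the code from this PR. Such a hasher is unseeded, so anyone who controls the keys can construct collisions.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical multiply-rotate hasher, for illustration only.
/// This is NOT the rustc-hash algorithm and it is not collision resistant.
#[derive(Default)]
struct SketchHasher {
    state: u64,
}

/// Arbitrary odd 64-bit constant (assumption: any well-mixed odd value works here).
const SEED: u64 = 0x9E37_79B9_7F4A_7C15;

impl Hasher for SketchHasher {
    // Generic path: fold the input in one byte at a time.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state = (self.state.rotate_left(5) ^ u64::from(b)).wrapping_mul(SEED);
        }
    }

    // Fast path for u64 keys: a single rotate, xor, and multiply.
    fn write_u64(&mut self, n: u64) {
        self.state = (self.state.rotate_left(5) ^ n).wrapping_mul(SEED);
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// A HashMap that uses the sketch hasher instead of the default SipHash-based one.
type SketchMap<K, V> = HashMap<K, V, BuildHasherDefault<SketchHasher>>;

/// Counts how often each value occurs, using the custom-hasher map.
fn count_occurrences(values: &[u64]) -> SketchMap<u64, usize> {
    let mut counts = SketchMap::default();
    for &v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_occurrences(&[3, 1, 3, 7, 3]);
    println!("3 appears {} times", counts[&3]);
}
```

The map API is unchanged; only the third type parameter differs, which is why such a swap tends to be a small diff.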

@oech3
Contributor Author

oech3 commented Feb 2, 2026

Is HashMap used for file names directly? It might be able to fall back to the default hash function for files on a network.

@xtqqczze

This comment was marked as outdated.

@oech3
Contributor Author

oech3 commented Feb 2, 2026

not SipHash?

@xtqqczze

This comment was marked as outdated.

@xtqqczze
Contributor

xtqqczze commented Feb 2, 2026

> not SipHash?

Yes, the standard library does in fact use SipHasher13; the documentation for hashbrown is misleading (rust-lang/hashbrown#153).
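To make the trade-off concrete, the following std-only sketch contrasts the randomly seeded `RandomState` that `HashMap` uses by default with a fixed-key `DefaultHasher`. The helper `hash_with` is a hypothetical name introduced here. The sketch relies on `DefaultHasher::default()` starting from fixed keys, so it hashes the same input to the same value in every map, while each `RandomState::new()` carries its own keys.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hashes `key` with a hasher produced by `build` (hypothetical helper).
fn hash_with<S: BuildHasher>(build: &S, key: &str) -> u64 {
    let mut hasher = build.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Default HashMap state: every RandomState gets its own keys,
    // so the same input is expected to hash differently per instance.
    let a = RandomState::new();
    let b = RandomState::new();
    println!(
        "seeded: {:#018x} vs {:#018x}",
        hash_with(&a, "shuf"),
        hash_with(&b, "shuf")
    );

    // Fixed-key state: same algorithm, no per-map seed, fully deterministic.
    let fixed_a = BuildHasherDefault::<DefaultHasher>::default();
    let fixed_b = BuildHasherDefault::<DefaultHasher>::default();
    println!(
        "fixed:  {:#018x} vs {:#018x}",
        hash_with(&fixed_a, "shuf"),
        hash_with(&fixed_b, "shuf")
    );
}
```

The per-instance seed is what makes precomputed collision sets impractical against the default map; an unseeded fast hasher gives that up in exchange for speed.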

4 participants