8386911: Crypto benchmark regressions after JDK-8384353#31648
8386911: Crypto benchmark regressions after JDK-8384353#31648vpaprotsk wants to merge 8 commits into
Conversation
|
👋 Welcome back vpaprotski! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
@vpaprotsk The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
| throw new InvalidAlgorithmParameterException("Bad parallel parameter."); | ||
| } | ||
|
|
||
| for (int i = 0; i < NRPAR; i++) { |
There was a problem hiding this comment.
nr iterations is enough here.
|
I ran ML-KEM and ML-DSA benchmarks on a Macbook Pro M3 36GB Memory, see attached. Everything is improved with the set of changes except for ML-KEM-512 (encapsulation:-0.92%/decapsulation:-2.25%) and ML-DSA-44 (key generation:-1.03%). |
Reran the specific benchmarks that showed regression on a quieter system (MacbookAir M3 8GB Memory) and averaged the results, which showed negligible differences of the afflicted benchmarks as follows: ML-KEM-512 (encapsulation:-0.38%/decapsulation:-0.21%) and ML-DSA-44 (key generation: -0.18%). In other words, the current fix resolves the regression in performance by either negligible differences or better performance, where the high observed +6.55% (ML-KEM-768, which would be one of the algorithms that would gain the most from this delta). |
With
quadKeccak, it is possible to have 3 extrakeccaks calls that are really noops.. teachSHA3Parallel.squeeze()how many operations it really should be doing.While it is possible to match precisely the number of keccak calls required.. the previous implementation always called doubleKeccak, so collapsing the odd numbers to the closest even.
PS:
make install-hsdis test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics;FORK=1;ITER=3;TIME=10;WARMUP_ITER=7;WARMUP_TIME=10;OPTIONS=-prof perfasm -p algorithm=ML-DSA-65"Progress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/31648/head:pull/31648$ git checkout pull/31648Update a local copy of the PR:
$ git checkout pull/31648$ git pull https://git.openjdk.org/jdk.git pull/31648/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 31648View PR using the GUI difftool:
$ git pr show -t 31648Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/31648.diff
Using Webrev
Link to Webrev Comment