RFC: Add incremental encaps API to support ML-KEM Braid #1619

Draft
mkannwischer wants to merge 12 commits into main from incremental-enc-api

@mkannwischer

This PR splits ML-KEM encapsulation into two phases (mlk_kem_enc_derand_u and mlk_kem_enc_v) to support protocols such as Braid that need to interleave other operations between computing the u- and v-components of the ciphertext. The first phase requires only the public seed and H(pk), not the full public key vector.

Internally, K-PKE.Encrypt is refactored into mlk_indcpa_enc_u followed by mlk_indcpa_enc_v. The non-incremental KEM path still calls mlk_indcpa_enc directly to avoid serialization overhead.

The intermediate noise polynomial epp is serialized as 4-bit nibbles (two coefficients per byte, 128 bytes in total). This encoding is chosen primarily so that no precondition on the allowed values is required.
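
To illustrate the nibble encoding, here is a minimal sketch. It assumes each of the 256 coefficients is stored as a 4-bit two's-complement value with the even-indexed coefficient in the low nibble; the description does not state the actual bit layout, and `pack_nibbles` / `unpack_nibbles` are hypothetical names, not functions from this PR. The property it is meant to show is that every possible input byte decodes to coefficients in [-8, 7], so the decoder needs no precondition on its input.

```c
#include <assert.h>
#include <stdint.h>

#define POLY_N 256       /* coefficients per ML-KEM polynomial */
#define PACKED_BYTES 128 /* POLY_N * 4 bits / 8 */

/* Hypothetical helper: pack POLY_N coefficients in [-8, 7] as 4-bit
 * two's-complement nibbles, even index in the low nibble. */
static void pack_nibbles(uint8_t out[PACKED_BYTES], const int16_t in[POLY_N])
{
  for (unsigned i = 0; i < PACKED_BYTES; i++)
  {
    /* The encoder is the only side with a range requirement. */
    assert(in[2 * i] >= -8 && in[2 * i] <= 7);
    assert(in[2 * i + 1] >= -8 && in[2 * i + 1] <= 7);
    /* Converting to unsigned first makes the masking well-defined. */
    uint8_t lo = (uint8_t)((uint16_t)in[2 * i] & 0x0F);
    uint8_t hi = (uint8_t)((uint16_t)in[2 * i + 1] & 0x0F);
    out[i] = (uint8_t)(lo | (hi << 4));
  }
}

/* Hypothetical helper: sign-extend each nibble back to int16_t.
 * Every byte value 0x00..0xFF is a valid input. */
static void unpack_nibbles(int16_t out[POLY_N], const uint8_t in[PACKED_BYTES])
{
  for (unsigned i = 0; i < PACKED_BYTES; i++)
  {
    int16_t lo = (int16_t)(in[i] & 0x0F);
    int16_t hi = (int16_t)(in[i] >> 4);
    out[2 * i] = (int16_t)(lo >= 8 ? lo - 16 : lo);
    out[2 * i + 1] = (int16_t)(hi >= 8 ? hi - 16 : hi);
  }
}
```

A denser encoding would save bytes but would leave some bit patterns outside the valid range, which the decoder would then have to reject or assume away.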

mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 325ab51 to 285fc8a (March 12, 2026 05:37)
mkannwischer added the "benchmark" label (this PR should be benchmarked in CI) on Mar 12, 2026

oqs-bot left a comment

Intel Xeon 4th gen (c7i)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 11989 cycles | 11971 cycles | 1.00 |
| ML-KEM-512 encaps | 13728 cycles | 13745 cycles | 1.00 |
| ML-KEM-512 decaps | 17966 cycles | 17771 cycles | 1.01 |
| ML-KEM-768 keypair | 21321 cycles | 21010 cycles | 1.01 |
| ML-KEM-768 encaps | 22292 cycles | 22095 cycles | 1.01 |
| ML-KEM-768 decaps | 28060 cycles | 28300 cycles | 0.99 |
| ML-KEM-1024 keypair | 29955 cycles | 29866 cycles | 1.00 |
| ML-KEM-1024 encaps | 31948 cycles | 31758 cycles | 1.01 |
| ML-KEM-1024 decaps | 40366 cycles | 39591 cycles | 1.02 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

ppc64le (POWER10) benchmarks

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59214 cycles | 60047 cycles | 0.99 |
| ML-KEM-512 encaps | 72322 cycles | 72930 cycles | 0.99 |
| ML-KEM-512 decaps | 91754 cycles | 92987 cycles | 0.99 |
| ML-KEM-768 keypair | 98445 cycles | 98984 cycles | 0.99 |
| ML-KEM-768 encaps | 115372 cycles | 115469 cycles | 1.00 |
| ML-KEM-768 decaps | 141100 cycles | 141322 cycles | 1.00 |
| ML-KEM-1024 keypair | 148781 cycles | 149075 cycles | 1.00 |
| ML-KEM-1024 encaps | 167999 cycles | 167651 cycles | 1.00 |
| ML-KEM-1024 decaps | 199548 cycles | 198842 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 14284 cycles | 14291 cycles | 1.00 |
| ML-KEM-512 encaps | 16843 cycles | 16019 cycles | 1.05 |
| ML-KEM-512 decaps | 21421 cycles | 21507 cycles | 1.00 |
| ML-KEM-768 keypair | 24344 cycles | 24715 cycles | 0.98 |
| ML-KEM-768 encaps | 25386 cycles | 25491 cycles | 1.00 |
| ML-KEM-768 decaps | 34652 cycles | 33275 cycles | 1.04 |
| ML-KEM-1024 keypair | 37236 cycles | 37264 cycles | 1.00 |
| ML-KEM-1024 encaps | 36923 cycles | 36892 cycles | 1.00 |
| ML-KEM-1024 decaps | 49514 cycles | 46772 cycles | 1.06 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 16843 cycles | 16019 cycles | 1.05 |
| ML-KEM-768 decaps | 34652 cycles | 33275 cycles | 1.04 |
| ML-KEM-1024 decaps | 49514 cycles | 46772 cycles | 1.06 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28980 cycles | 28131 cycles | 1.03 |
| ML-KEM-512 encaps | 35927 cycles | 36568 cycles | 0.98 |
| ML-KEM-512 decaps | 45329 cycles | 45119 cycles | 1.00 |
| ML-KEM-768 keypair | 46748 cycles | 46297 cycles | 1.01 |
| ML-KEM-768 encaps | 55800 cycles | 55658 cycles | 1.00 |
| ML-KEM-768 decaps | 69061 cycles | 69941 cycles | 0.99 |
| ML-KEM-1024 keypair | 70953 cycles | 70194 cycles | 1.01 |
| ML-KEM-1024 encaps | 84236 cycles | 82475 cycles | 1.02 |
| ML-KEM-1024 decaps | 99959 cycles | 98796 cycles | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

AMD EPYC 4th gen (c7a)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 13155 cycles | 12768 cycles | 1.03 |
| ML-KEM-512 encaps | 15653 cycles | 14254 cycles | 1.10 |
| ML-KEM-512 decaps | 19144 cycles | 19131 cycles | 1.00 |
| ML-KEM-768 keypair | 22752 cycles | 22414 cycles | 1.02 |
| ML-KEM-768 encaps | 23130 cycles | 23041 cycles | 1.00 |
| ML-KEM-768 decaps | 30221 cycles | 30089 cycles | 1.00 |
| ML-KEM-1024 keypair | 34312 cycles | 33024 cycles | 1.04 |
| ML-KEM-1024 encaps | 33636 cycles | 33006 cycles | 1.02 |
| ML-KEM-1024 decaps | 49127 cycles | 42430 cycles | 1.16 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 13155 cycles | 12768 cycles | 1.03 |
| ML-KEM-512 encaps | 15653 cycles | 14254 cycles | 1.10 |
| ML-KEM-1024 keypair | 34312 cycles | 33024 cycles | 1.04 |
| ML-KEM-1024 decaps | 49127 cycles | 42430 cycles | 1.16 |

This comment was automatically generated by workflow using github-action-benchmark.
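
The alert condition can be read as a simple ratio check. The sketch below assumes the ratio is the current cycle count divided by the previous one and that an alert fires only when the unrounded ratio strictly exceeds the threshold; `exceeds_threshold` is a hypothetical helper, not part of github-action-benchmark. Under that assumption, the ML-KEM-512 keypair row (13155 vs 12768 cycles, ratio about 1.0303) is flagged even though it is displayed as 1.03.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if current/previous is strictly above
 * the threshold (e.g. 1.03), and 0 otherwise. */
static int exceeds_threshold(uint64_t current, uint64_t previous,
                             double threshold)
{
  assert(previous > 0); /* avoid division by zero */
  double ratio = (double)current / (double)previous;
  return ratio > threshold;
}
```

For example, `exceeds_threshold(13155, 12768, 1.03)` is true, while `exceeds_threshold(22752, 22414, 1.03)` (the ML-KEM-768 keypair row on the same machine) is false.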

oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17486 cycles | 17485 cycles | 1.00 |
| ML-KEM-512 encaps | 20539 cycles | 19875 cycles | 1.03 |
| ML-KEM-512 decaps | 27163 cycles | 26415 cycles | 1.03 |
| ML-KEM-768 keypair | 31001 cycles | 31874 cycles | 0.97 |
| ML-KEM-768 encaps | 32072 cycles | 31109 cycles | 1.03 |
| ML-KEM-768 decaps | 42651 cycles | 41545 cycles | 1.03 |
| ML-KEM-1024 keypair | 44383 cycles | 46137 cycles | 0.96 |
| ML-KEM-1024 encaps | 45973 cycles | 45137 cycles | 1.02 |
| ML-KEM-1024 decaps | 61116 cycles | 58253 cycles | 1.05 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 20539 cycles | 19875 cycles | 1.03 |
| ML-KEM-768 encaps | 32072 cycles | 31109 cycles | 1.03 |
| ML-KEM-1024 decaps | 61116 cycles | 58253 cycles | 1.05 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 40146 cycles | 40253 cycles | 1.00 |
| ML-KEM-512 encaps | 48727 cycles | 48411 cycles | 1.01 |
| ML-KEM-512 decaps | 62931 cycles | 62600 cycles | 1.01 |
| ML-KEM-768 keypair | 64509 cycles | 63756 cycles | 1.01 |
| ML-KEM-768 encaps | 75410 cycles | 74947 cycles | 1.01 |
| ML-KEM-768 decaps | 93707 cycles | 93618 cycles | 1.00 |
| ML-KEM-1024 keypair | 95205 cycles | 94982 cycles | 1.00 |
| ML-KEM-1024 encaps | 109859 cycles | 109167 cycles | 1.01 |
| ML-KEM-1024 decaps | 132630 cycles | 131931 cycles | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 36601 cycles | 36584 cycles | 1.00 |
| ML-KEM-512 encaps | 43203 cycles | 43043 cycles | 1.00 |
| ML-KEM-512 decaps | 55847 cycles | 55694 cycles | 1.00 |
| ML-KEM-768 keypair | 58695 cycles | 58618 cycles | 1.00 |
| ML-KEM-768 encaps | 67803 cycles | 67618 cycles | 1.00 |
| ML-KEM-768 decaps | 84624 cycles | 84427 cycles | 1.00 |
| ML-KEM-1024 keypair | 89078 cycles | 88963 cycles | 1.00 |
| ML-KEM-1024 encaps | 99677 cycles | 99133 cycles | 1.01 |
| ML-KEM-1024 decaps | 121227 cycles | 120720 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28272 cycles | 28233 cycles | 1.00 |
| ML-KEM-512 encaps | 34307 cycles | 34105 cycles | 1.01 |
| ML-KEM-512 decaps | 44554 cycles | 44329 cycles | 1.01 |
| ML-KEM-768 keypair | 47644 cycles | 47627 cycles | 1.00 |
| ML-KEM-768 encaps | 54197 cycles | 53956 cycles | 1.00 |
| ML-KEM-768 decaps | 68663 cycles | 68377 cycles | 1.00 |
| ML-KEM-1024 keypair | 70252 cycles | 70255 cycles | 1.00 |
| ML-KEM-1024 encaps | 79150 cycles | 78806 cycles | 1.00 |
| ML-KEM-1024 decaps | 98773 cycles | 98442 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Graviton4

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17678 cycles | 17650 cycles | 1.00 |
| ML-KEM-512 encaps | 20697 cycles | 20601 cycles | 1.00 |
| ML-KEM-512 decaps | 27162 cycles | 27068 cycles | 1.00 |
| ML-KEM-768 keypair | 29932 cycles | 29899 cycles | 1.00 |
| ML-KEM-768 encaps | 32943 cycles | 32776 cycles | 1.01 |
| ML-KEM-768 decaps | 42137 cycles | 41967 cycles | 1.00 |
| ML-KEM-1024 keypair | 43751 cycles | 43750 cycles | 1.00 |
| ML-KEM-1024 encaps | 48899 cycles | 48727 cycles | 1.00 |
| ML-KEM-1024 decaps | 61543 cycles | 61390 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 46126 cycles | 45656 cycles | 1.01 |
| ML-KEM-512 encaps | 55040 cycles | 54451 cycles | 1.01 |
| ML-KEM-512 decaps | 70160 cycles | 69753 cycles | 1.01 |
| ML-KEM-768 keypair | 73388 cycles | 74171 cycles | 0.99 |
| ML-KEM-768 encaps | 86586 cycles | 85948 cycles | 1.01 |
| ML-KEM-768 decaps | 106701 cycles | 106520 cycles | 1.00 |
| ML-KEM-1024 keypair | 111782 cycles | 112098 cycles | 1.00 |
| ML-KEM-1024 encaps | 126133 cycles | 124601 cycles | 1.01 |
| ML-KEM-1024 decaps | 151974 cycles | 150531 cycles | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Graviton4 (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 35450 cycles | 35416 cycles | 1.00 |
| ML-KEM-512 encaps | 41585 cycles | 40122 cycles | 1.04 |
| ML-KEM-512 decaps | 51571 cycles | 51145 cycles | 1.01 |
| ML-KEM-768 keypair | 56738 cycles | 56670 cycles | 1.00 |
| ML-KEM-768 encaps | 65173 cycles | 65151 cycles | 1.00 |
| ML-KEM-768 decaps | 79386 cycles | 79295 cycles | 1.00 |
| ML-KEM-1024 keypair | 88008 cycles | 87866 cycles | 1.00 |
| ML-KEM-1024 encaps | 97739 cycles | 96879 cycles | 1.01 |
| ML-KEM-1024 decaps | 116607 cycles | 115822 cycles | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Graviton3

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 18670 cycles | 18639 cycles | 1.00 |
| ML-KEM-512 encaps | 21918 cycles | 21876 cycles | 1.00 |
| ML-KEM-512 decaps | 28893 cycles | 28861 cycles | 1.00 |
| ML-KEM-768 keypair | 31593 cycles | 31540 cycles | 1.00 |
| ML-KEM-768 encaps | 34945 cycles | 34771 cycles | 1.01 |
| ML-KEM-768 decaps | 44893 cycles | 44775 cycles | 1.00 |
| ML-KEM-1024 keypair | 46070 cycles | 46082 cycles | 1.00 |
| ML-KEM-1024 encaps | 51716 cycles | 51501 cycles | 1.00 |
| ML-KEM-1024 decaps | 65317 cycles | 65036 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Graviton2

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28281 cycles | 28256 cycles | 1.00 |
| ML-KEM-512 encaps | 34335 cycles | 34110 cycles | 1.01 |
| ML-KEM-512 decaps | 44593 cycles | 44398 cycles | 1.00 |
| ML-KEM-768 keypair | 47645 cycles | 47665 cycles | 1.00 |
| ML-KEM-768 encaps | 54337 cycles | 53940 cycles | 1.01 |
| ML-KEM-768 decaps | 68766 cycles | 68363 cycles | 1.01 |
| ML-KEM-1024 keypair | 70355 cycles | 70328 cycles | 1.00 |
| ML-KEM-1024 encaps | 79143 cycles | 78757 cycles | 1.00 |
| ML-KEM-1024 decaps | 98854 cycles | 98529 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Graviton3 (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 38936 cycles | 38887 cycles | 1.00 |
| ML-KEM-512 encaps | 47075 cycles | 44596 cycles | 1.06 |
| ML-KEM-512 decaps | 57101 cycles | 56673 cycles | 1.01 |
| ML-KEM-768 keypair | 62374 cycles | 62294 cycles | 1.00 |
| ML-KEM-768 encaps | 71641 cycles | 72330 cycles | 0.99 |
| ML-KEM-768 decaps | 87319 cycles | 87696 cycles | 1.00 |
| ML-KEM-1024 keypair | 96361 cycles | 96160 cycles | 1.00 |
| ML-KEM-1024 encaps | 107186 cycles | 106135 cycles | 1.01 |
| ML-KEM-1024 decaps | 127584 cycles | 126583 cycles | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot left a comment

Graviton2 (no-opt)

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59167 cycles | 59049 cycles | 1.00 |
| ML-KEM-512 encaps | 69657 cycles | 68578 cycles | 1.02 |
| ML-KEM-512 decaps | 87765 cycles | 87314 cycles | 1.01 |
| ML-KEM-768 keypair | 95627 cycles | 95479 cycles | 1.00 |
| ML-KEM-768 encaps | 111277 cycles | 109908 cycles | 1.01 |
| ML-KEM-768 decaps | 135008 cycles | 134361 cycles | 1.00 |
| ML-KEM-1024 keypair | 147782 cycles | 147876 cycles | 1.00 |
| ML-KEM-1024 encaps | 164356 cycles | 163805 cycles | 1.00 |
| ML-KEM-1024 decaps | 196183 cycles | 195456 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-512)

⚠️ Attention Required

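| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mlk_enc_derand_u | | - | - | - |
| mlk_enc_v | | - | - | - |
| mlk_indcpa_enc | | - | 152s | - |
| mlk_indcpa_enc_u | | - | - | - |
| mlk_indcpa_enc_v | | - | - | - |
| mlk_indcpa_keypair_derand | ⚠️ | 411s | 99s | +315% |
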
Full Results (191 proofs)
Proof Status Current Previous Change
**TOTAL** 1424s 1186s +20.1%
mlk_indcpa_keypair_derand ⚠️ 411s 99s +315%
mlk_keccak_squeezeblocks_x4 137s 129s +6%
mlk_polyvec_basemul_acc_montgomery_cached_c 74s 64s +16%
mlk_rej_uniform_c 68s 79s -14%
mlk_poly_rej_uniform 42s 38s +11%
poly_ntt_native 36s 29s +24%
polyvec_basemul_acc_montgomery_cached_native 22s 19s +16%
keccakf1600x4_permute_native_x4 21s 19s +11%
mlk_polyvec_add 20s 17s +18%
mlk_poly_reduce_native 17s 14s +21%
mlk_ntt_layer 14s 13s +8%
mlk_poly_decompress_d10_native 14s 12s +17%
mlk_poly_decompress_d4_native 14s 11s +27%
mlk_indcpa_dec 10s 11s -9%
mlk_poly_frommsg 10s 8s +25%
mlk_polymat_permute_bitrev_to_custom 10s 8s +25%
mlk_fqmul 9s 7s +29%
mlk_ntt_butterfly_block 9s 8s +12%
mlk_poly_frombytes_native 9s 8s +12%
mlk_poly_rej_uniform_x4 8s 8s +0%
keccakf1600_permute_native 7s 8s -12%
mlk_invntt_layer 7s 4s +75%
mlk_keccak_absorb_once_x4 7s 8s -12%
mlk_keccak_squeeze_once 7s 6s +17%
mlk_keccak_squeezeblocks 7s 5s +40%
mlk_poly_mulcache_compute_c 7s 3s +133%
rej_uniform_native 7s 4s +75%
mlk_check_pct 6s 2s +200%
mlk_polyvec_frombytes 6s 1s +500%
poly_frombytes_native_x86_64 6s 4s +50%
kem_enc_derand 5s 4s +25%
mlk_keccak_absorb_once 5s 6s -17%
mlk_poly_add 5s 5s +0%
mlk_poly_cbd_eta2 5s 4s +25%
mlk_poly_decompress_d10_c 5s 5s +0%
mlk_poly_invntt_tomont_c 5s 2s +150%
mlk_scalar_compress_d1 5s 3s +67%
poly_getnoise_eta1122_4x_native 5s 2s +150%
poly_reduce_native_x86_64 5s 3s +67%
poly_tobytes_native_x86_64 5s 2s +150%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 5s 4s +25%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 5s 2s +150%
kem_check_pk 4s 4s +0%
kem_dec 4s 5s -20%
kem_keypair_derand 4s 3s +33%
mlk_poly_compress_d10_c 4s 3s +33%
mlk_poly_compress_d11 4s 1s +300%
mlk_poly_getnoise_eta1_4x 4s 3s +33%
mlk_poly_getnoise_eta1_4x_native 4s 3s +33%
mlk_poly_mulcache_compute 4s 2s +100%
mlk_poly_mulcache_compute_native 4s 1s +300%
mlk_poly_reduce 4s 3s +33%
mlk_poly_tomont 4s 1s +300%
mlk_polyvec_reduce 4s 3s +33%
mlk_polyvec_tobytes 4s 1s +300%
mlk_scalar_compress_d11 4s 2s +100%
mlk_scalar_decompress_d11 4s 2s +100%
poly_compress_d10_native_x86_64 4s 3s +33%
poly_compress_d11_native_x86_64 4s 2s +100%
poly_compress_d4_native_x86_64 4s 3s +33%
poly_decompress_d10_native_x86_64 4s 5s -20%
poly_decompress_d4_native_x86_64 4s 5s -20%
poly_decompress_d5_native_x86_64 4s 2s +100%
poly_tobytes_native_aarch64 4s 2s +100%
rej_uniform_native_aarch64 4s 3s +33%
keccak_f1600_x1_native_aarch64 3s 1s +200%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccakf1600x4_xor_bytes_native 3s 2s +50%
mlk_barrett_reduce 3s 2s +50%
mlk_gen_matrix 3s 3s +0%
mlk_keccakf1600_extract_bytes (big endian) 3s 3s +0%
mlk_matvec_mul 3s 2s +50%
mlk_poly_compress_d10_native 3s 2s +50%
mlk_poly_compress_d4 3s 3s +0%
mlk_poly_decompress_d10 3s 3s +0%
mlk_poly_decompress_d11 3s 2s +50%
mlk_poly_decompress_d5 3s 2s +50%
mlk_poly_decompress_d5_native 3s 1s +200%
mlk_poly_frombytes_c 3s 4s -25%
mlk_poly_getnoise_eta1122_4x 3s 2s +50%
mlk_poly_ntt_c 3s 1s +200%
mlk_poly_tobytes_c 3s 3s +0%
mlk_poly_tobytes_native 3s 4s -25%
mlk_polyvec_basemul_acc_montgomery_cached 3s 2s +50%
mlk_polyvec_permute_bitrev_to_custom_native 3s 4s -25%
mlk_scalar_decompress_d5 3s 4s -25%
mlk_sha3_512 3s 2s +50%
mlk_shake128x4_squeezeblocks 3s 1s +200%
mlk_shake256 3s 3s +0%
mlk_shake256x4 3s 5s -40%
nttunpack_native_x86_64 3s 4s -25%
poly_compress_d5_native_x86_64 3s 1s +200%
poly_mulcache_compute_native_aarch64 3s 4s -25%
poly_mulcache_compute_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 3s 4s -25%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
sys_check_capability 3s 1s +200%
intt_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 3s -33%
kem_check_sk 2s 2s +0%
kem_keypair 2s 4s -50%
mlk_ct_cmask_nonzero_u8 2s 3s -33%
mlk_ct_cmov_zero 2s 3s -33%
mlk_ct_get_optblocker_i32 2s 1s +100%
mlk_ct_get_optblocker_u32 2s 4s -50%
mlk_ct_get_optblocker_u8 2s 3s -33%
mlk_ct_sel_uint8 2s 1s +100%
mlk_gen_matrix_serial 2s 2s +0%
mlk_keccakf1600_permute 2s 4s -50%
mlk_keccakf1600_xor_bytes 2s 3s -33%
mlk_keccakf1600_xor_bytes (big endian) 2s 2s +0%
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_xor_bytes 2s 2s +0%
mlk_keypair_getnoise 2s 2s +0%
mlk_montgomery_reduce 2s 1s +100%
mlk_poly_cbd_eta1 2s 1s +100%
mlk_poly_compress_d10 2s 4s -50%
mlk_poly_compress_d11_native 2s 1s +100%
mlk_poly_compress_d4_c 2s 2s +0%
mlk_poly_compress_d5_c 2s 2s +0%
mlk_poly_compress_d5_native 2s 2s +0%
mlk_poly_compress_du 2s 4s -50%
mlk_poly_compress_dv 2s 1s +100%
mlk_poly_decompress_d11_c 2s 3s -33%
mlk_poly_decompress_d11_native 2s 1s +100%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_du 2s 2s +0%
mlk_poly_decompress_dv 2s 2s +0%
mlk_poly_frombytes 2s 2s +0%
mlk_poly_getnoise_eta2 2s 2s +0%
mlk_poly_invntt_tomont 2s 1s +100%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_tobytes 2s 3s -33%
mlk_poly_tomont_c 2s 3s -33%
mlk_poly_tomont_native 2s 2s +0%
mlk_poly_tomsg 2s 3s -33%
mlk_polyvec_decompress_du 2s 2s +0%
mlk_polyvec_invntt_tomont 2s 1s +100%
mlk_polyvec_mulcache_compute 2s 3s -33%
mlk_polyvec_ntt 2s 5s -60%
mlk_polyvec_permute_bitrev_to_custom 2s 2s +0%
mlk_rej_uniform 2s 1s +100%
mlk_scalar_compress_d10 2s 1s +100%
mlk_scalar_compress_d4 2s 1s +100%
mlk_scalar_compress_d5 2s 1s +100%
mlk_scalar_decompress_d10 2s 1s +100%
mlk_scalar_decompress_d4 2s 2s +0%
mlk_scalar_signed_to_unsigned_q 2s 2s +0%
mlk_sha3_256 2s 2s +0%
mlk_shake128_absorb_once 2s 3s -33%
mlk_shake128_squeezeblocks 2s 3s -33%
mlk_value_barrier_u8 2s 2s +0%
ntt_native_x86_64 2s 4s -50%
poly_decompress_d11_native_x86_64 2s 1s +100%
poly_reduce_native_aarch64 2s 1s +100%
poly_tomont_native_aarch64 2s 1s +100%
poly_tomont_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 3s -33%
rej_uniform_native_x86_64 2s 4s -50%
mlk_enc_derand_u - - -
mlk_enc_v - - -
mlk_indcpa_enc - 152s -
mlk_indcpa_enc_u - - -
mlk_indcpa_enc_v - - -
intt_native_x86_64 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 1s +0%
keccak_f1600_x4_native_avx2 1s 3s -67%
kem_enc 1s 2s -50%
mlk_ct_cmask_neg_i16 1s 1s +0%
mlk_ct_cmask_nonzero_u16 1s 1s +0%
mlk_ct_memcmp 1s 2s -50%
mlk_ct_sel_int16 1s 3s -67%
mlk_keccakf1600_extract_bytes 1s 3s -67%
mlk_keccakf1600x4_permute 1s 1s +0%
mlk_poly_compress_d11_c 1s 4s -75%
mlk_poly_compress_d4_native 1s 3s -67%
mlk_poly_compress_d5 1s 3s -67%
mlk_poly_decompress_d4_c 1s 3s -67%
mlk_poly_decompress_d5_c 1s 1s +0%
mlk_poly_ntt 1s 3s -67%
mlk_poly_sub 1s 3s -67%
mlk_polyvec_compress_du 1s 2s -50%
mlk_polyvec_tomont 1s 1s +0%
mlk_shake128x4_absorb_once 1s 2s -50%
mlk_value_barrier_i32 1s 4s -75%
mlk_value_barrier_u32 1s 2s -50%
ntt_native_aarch64 1s 4s -75%
poly_invntt_tomont_native 1s 4s -75%

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-768)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mlk_enc_derand_u | | - | - | - |
| mlk_enc_v | | - | - | - |
| mlk_indcpa_enc | | - | 243s | - |
| mlk_indcpa_enc_u | | - | - | - |
| mlk_indcpa_enc_v | | - | - | - |

Full Results (191 proofs)
Proof Status Current Previous Change
**TOTAL** 1189s 1413s -15.9%
mlk_indcpa_keypair_derand 223s 231s -3%
mlk_keccak_squeezeblocks_x4 123s 118s +4%
mlk_rej_uniform_c 72s 73s -1%
polyvec_basemul_acc_montgomery_cached_native 58s 57s +2%
mlk_polyvec_basemul_acc_montgomery_cached_c 48s 46s +4%
mlk_poly_rej_uniform 36s 29s +24%
poly_ntt_native 27s 22s +23%
mlk_polyvec_add 26s 26s +0%
keccakf1600x4_permute_native_x4 20s 20s +0%
mlk_poly_decompress_d4_native 14s 12s +17%
mlk_poly_reduce_native 14s 14s +0%
mlk_indcpa_dec 13s 14s -7%
mlk_poly_decompress_d10_native 13s 11s +18%
mlk_ntt_layer 11s 14s -21%
mlk_poly_frommsg 10s 13s -23%
mlk_ntt_butterfly_block 8s 8s +0%
mlk_poly_rej_uniform_x4 8s 7s +14%
kem_dec 7s 5s +40%
mlk_keccak_absorb_once_x4 7s 9s -22%
mlk_poly_frombytes_native 7s 6s +17%
mlk_fqmul 6s 7s -14%
mlk_keccak_squeezeblocks 6s 6s +0%
mlk_poly_ntt_c 6s 4s +50%
mlk_polyvec_frombytes 6s 2s +200%
poly_compress_d10_native_x86_64 6s 2s +200%
poly_decompress_d4_native_x86_64 6s 4s +50%
keccakf1600_permute_native 5s 5s +0%
kem_enc_derand 5s 2s +150%
mlk_gen_matrix 5s 4s +25%
mlk_invntt_layer 5s 6s -17%
mlk_keccak_squeeze_once 5s 7s -29%
mlk_keccakf1600_permute 5s 4s +25%
mlk_poly_add 5s 6s -17%
mlk_poly_compress_d10_c 5s 6s -17%
mlk_poly_compress_d4_c 5s 3s +67%
mlk_shake256x4 5s 4s +25%
poly_decompress_d10_native_x86_64 5s 3s +67%
intt_native_aarch64 4s 4s +0%
intt_native_x86_64 4s 4s +0%
keccakf1600x4_xor_bytes_native 4s 4s +0%
kem_enc 4s 2s +100%
mlk_ct_memcmp 4s 2s +100%
mlk_gen_matrix_serial 4s 2s +100%
mlk_keccak_absorb_once 4s 5s -20%
mlk_poly_getnoise_eta1122_4x 4s 3s +33%
mlk_poly_getnoise_eta1_4x 4s 3s +33%
mlk_poly_reduce_c 4s 1s +300%
mlk_poly_tobytes_c 4s 3s +33%
mlk_polymat_permute_bitrev_to_custom 4s 5s -20%
mlk_polyvec_tobytes 4s 4s +0%
mlk_scalar_compress_d4 4s 4s +0%
mlk_scalar_decompress_d11 4s 2s +100%
mlk_sha3_512 4s 2s +100%
mlk_shake128x4_absorb_once 4s 4s +0%
ntt_native_x86_64 4s 4s +0%
poly_frombytes_native_x86_64 4s 4s +0%
poly_mulcache_compute_native_x86_64 4s 1s +300%
poly_reduce_native_aarch64 4s 2s +100%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 3s +33%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 4s 3s +33%
kem_check_sk 3s 1s +200%
mlk_barrett_reduce 3s 3s +0%
mlk_check_pct 3s 3s +0%
mlk_ct_cmask_neg_i16 3s 2s +50%
mlk_ct_cmov_zero 3s 3s +0%
mlk_keypair_getnoise 3s 2s +50%
mlk_montgomery_reduce 3s 1s +200%
mlk_poly_cbd_eta1 3s 1s +200%
mlk_poly_compress_d10_native 3s 1s +200%
mlk_poly_compress_d11 3s 2s +50%
mlk_poly_compress_d4_native 3s 3s +0%
mlk_poly_decompress_d10_c 3s 5s -40%
mlk_poly_decompress_d11_c 3s 2s +50%
mlk_poly_decompress_d4_c 3s 2s +50%
mlk_poly_decompress_d5 3s 2s +50%
mlk_poly_decompress_d5_c 3s 2s +50%
mlk_poly_getnoise_eta1_4x_native 3s 2s +50%
mlk_poly_getnoise_eta2 3s 1s +200%
mlk_poly_invntt_tomont 3s 2s +50%
mlk_poly_mulcache_compute_c 3s 3s +0%
mlk_poly_sub 3s 4s -25%
mlk_poly_tobytes 3s 4s -25%
mlk_poly_tomont_c 3s 4s -25%
mlk_poly_tomsg 3s 1s +200%
mlk_polyvec_basemul_acc_montgomery_cached 3s 3s +0%
mlk_polyvec_decompress_du 3s 1s +200%
mlk_polyvec_invntt_tomont 3s 3s +0%
mlk_polyvec_mulcache_compute 3s 3s +0%
mlk_polyvec_ntt 3s 3s +0%
mlk_polyvec_tomont 3s 3s +0%
mlk_rej_uniform 3s 2s +50%
mlk_scalar_decompress_d10 3s 2s +50%
mlk_scalar_decompress_d5 3s 1s +200%
mlk_shake256 3s 2s +50%
mlk_value_barrier_u8 3s 1s +200%
nttunpack_native_x86_64 3s 3s +0%
poly_tobytes_native_aarch64 3s 3s +0%
poly_tomont_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 3s 3s +0%
sys_check_capability 3s 2s +50%
keccak_f1600_x1_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 3s -33%
kem_check_pk 2s 2s +0%
kem_keypair 2s 1s +100%
kem_keypair_derand 2s 3s -33%
mlk_ct_get_optblocker_i32 2s 2s +0%
mlk_keccakf1600_extract_bytes (big endian) 2s 3s -33%
mlk_keccakf1600_xor_bytes 2s 2s +0%
mlk_keccakf1600x4_xor_bytes 2s 1s +100%
mlk_matvec_mul 2s 1s +100%
mlk_poly_compress_d10 2s 2s +0%
mlk_poly_compress_d11_c 2s 3s -33%
mlk_poly_compress_d11_native 2s 2s +0%
mlk_poly_compress_d5_c 2s 2s +0%
mlk_poly_compress_d5_native 2s 2s +0%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d11 2s 1s +100%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_du 2s 3s -33%
mlk_poly_frombytes 2s 1s +100%
mlk_poly_frombytes_c 2s 3s -33%
mlk_poly_invntt_tomont_c 2s 4s -50%
mlk_poly_ntt 2s 3s -33%
mlk_poly_tobytes_native 2s 3s -33%
mlk_poly_tomont 2s 3s -33%
mlk_poly_tomont_native 2s 1s +100%
mlk_polyvec_compress_du 2s 2s +0%
mlk_polyvec_permute_bitrev_to_custom 2s 4s -50%
mlk_polyvec_permute_bitrev_to_custom_native 2s 3s -33%
mlk_polyvec_reduce 2s 2s +0%
mlk_scalar_compress_d1 2s 2s +0%
mlk_scalar_compress_d10 2s 2s +0%
mlk_scalar_compress_d5 2s 2s +0%
mlk_scalar_decompress_d4 2s 2s +0%
mlk_scalar_signed_to_unsigned_q 2s 2s +0%
mlk_sha3_256 2s 2s +0%
mlk_value_barrier_u32 2s 3s -33%
ntt_native_aarch64 2s 3s -33%
poly_compress_d4_native_x86_64 2s 1s +100%
poly_compress_d5_native_x86_64 2s 2s +0%
poly_decompress_d11_native_x86_64 2s 3s -33%
poly_decompress_d5_native_x86_64 2s 2s +0%
poly_getnoise_eta1122_4x_native 2s 4s -50%
poly_invntt_tomont_native 2s 3s -33%
poly_mulcache_compute_native_aarch64 2s 3s -33%
poly_reduce_native_x86_64 2s 1s +100%
poly_tobytes_native_x86_64 2s 2s +0%
poly_tomont_native_x86_64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 4s -50%
rej_uniform_native 2s 2s +0%
rej_uniform_native_aarch64 2s 3s -33%
rej_uniform_native_x86_64 2s 1s +100%
mlk_enc_derand_u - - -
mlk_enc_v - - -
mlk_indcpa_enc - 243s -
mlk_indcpa_enc_u - - -
mlk_indcpa_enc_v - - -
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 3s -67%
keccakf1600x4_extract_bytes_native 1s 2s -50%
mlk_ct_cmask_nonzero_u16 1s 3s -67%
mlk_ct_cmask_nonzero_u8 1s 3s -67%
mlk_ct_get_optblocker_u32 1s 2s -50%
mlk_ct_get_optblocker_u8 1s 3s -67%
mlk_ct_sel_int16 1s 2s -50%
mlk_ct_sel_uint8 1s 1s +0%
mlk_keccakf1600_extract_bytes 1s 1s +0%
mlk_keccakf1600_xor_bytes (big endian) 1s 2s -50%
mlk_keccakf1600x4_extract_bytes 1s 1s +0%
mlk_keccakf1600x4_permute 1s 1s +0%
mlk_poly_cbd_eta2 1s 1s +0%
mlk_poly_compress_d4 1s 2s -50%
mlk_poly_compress_d5 1s 4s -75%
mlk_poly_decompress_d10 1s 1s +0%
mlk_poly_decompress_d11_native 1s 4s -75%
mlk_poly_decompress_d5_native 1s 3s -67%
mlk_poly_decompress_dv 1s 5s -80%
mlk_poly_mulcache_compute 1s 1s +0%
mlk_poly_mulcache_compute_native 1s 2s -50%
mlk_poly_reduce 1s 2s -50%
mlk_scalar_compress_d11 1s 2s -50%
mlk_shake128_absorb_once 1s 4s -75%
mlk_shake128_squeezeblocks 1s 2s -50%
mlk_shake128x4_squeezeblocks 1s 2s -50%
mlk_value_barrier_i32 1s 2s -50%
poly_compress_d11_native_x86_64 1s 3s -67%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 1s 2s -50%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 1s 3s -67%

oqs-bot commented Mar 12, 2026

CBMC Results (ML-KEM-1024)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mlk_enc_derand_u | | - | - | - |
| mlk_enc_v | | - | - | - |
| mlk_indcpa_enc | | - | 275s | - |
| mlk_indcpa_enc_u | | - | - | - |
| mlk_indcpa_enc_v | | - | - | - |

Full Results (191 proofs)
Proof Status Current Previous Change
**TOTAL** 1187s 1744s -31.9%
polyvec_basemul_acc_montgomery_cached_native 122s 127s -4%
mlk_keccak_squeezeblocks_x4 119s 124s -4%
mlk_indcpa_keypair_derand 116s 378s -69%
mlk_rej_uniform_c 71s 70s +1%
mlk_polyvec_basemul_acc_montgomery_cached_c 69s 70s -1%
mlk_poly_rej_uniform 35s 36s -3%
poly_ntt_native 32s 27s +19%
mlk_polyvec_add 25s 25s +0%
keccakf1600x4_permute_native_x4 20s 21s -5%
mlk_indcpa_dec 17s 15s +13%
mlk_poly_reduce_native 15s 15s +0%
mlk_ntt_layer 14s 11s +27%
mlk_poly_decompress_d11_native 14s 12s +17%
mlk_polyvec_ntt 14s 14s +0%
mlk_poly_decompress_d5_native 11s 14s -21%
mlk_polymat_permute_bitrev_to_custom 10s 10s +0%
mlk_poly_compress_d11_c 9s 8s +12%
mlk_poly_frombytes_native 9s 7s +29%
mlk_fqmul 8s 6s +33%
mlk_poly_frommsg 8s 11s -27%
mlk_gen_matrix 7s 9s -22%
mlk_keccak_absorb_once_x4 7s 9s -22%
mlk_keccak_squeeze_once 7s 7s +0%
mlk_ntt_butterfly_block 7s 7s +0%
mlk_poly_mulcache_compute_c 7s 4s +75%
kem_check_pk 6s 4s +50%
kem_dec 6s 6s +0%
mlk_gen_matrix_serial 6s 6s +0%
mlk_keccak_squeezeblocks 6s 6s +0%
mlk_poly_rej_uniform_x4 6s 8s -25%
mlk_shake256x4 6s 5s +20%
keccakf1600_permute_native 5s 7s -29%
mlk_poly_add 5s 4s +25%
mlk_poly_tomont_c 5s 2s +150%
poly_compress_d5_native_x86_64 5s 4s +25%
poly_frombytes_native_x86_64 5s 4s +25%
intt_native_x86_64 4s 1s +300%
keccak_f1600_x1_native_aarch64 4s 2s +100%
keccakf1600x4_xor_bytes_native 4s 4s +0%
mlk_check_pct 4s 3s +33%
mlk_ct_cmask_nonzero_u8 4s 3s +33%
mlk_invntt_layer 4s 4s +0%
mlk_keccak_absorb_once 4s 4s +0%
mlk_keccakf1600_permute 4s 4s +0%
mlk_poly_decompress_d4_c 4s 3s +33%
mlk_poly_frombytes 4s 3s +33%
mlk_poly_getnoise_eta1_4x_native 4s 1s +300%
mlk_poly_getnoise_eta2 4s 3s +33%
mlk_poly_sub 4s 2s +100%
mlk_polyvec_mulcache_compute 4s 2s +100%
mlk_polyvec_permute_bitrev_to_custom_native 4s 4s +0%
mlk_scalar_decompress_d11 4s 1s +300%
mlk_scalar_decompress_d5 4s 2s +100%
mlk_sha3_256 4s 3s +33%
poly_decompress_d4_native_x86_64 4s 3s +33%
poly_decompress_d5_native_x86_64 4s 5s -20%
rej_uniform_native_aarch64 4s 1s +300%
intt_native_aarch64 3s 1s +200%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccak_f1600_x4_native_avx2 3s 1s +200%
kem_check_sk 3s 4s -25%
kem_enc 3s 2s +50%
kem_enc_derand 3s 4s -25%
kem_keypair 3s 1s +200%
mlk_ct_cmask_neg_i16 3s 1s +200%
mlk_ct_get_optblocker_u8 3s 2s +50%
mlk_ct_memcmp 3s 6s -50%
mlk_ct_sel_int16 3s 3s +0%
mlk_keccakf1600_xor_bytes (big endian) 3s 2s +50%
mlk_keccakf1600x4_extract_bytes 3s 1s +200%
mlk_poly_cbd_eta2 3s 4s -25%
mlk_poly_compress_d4 3s 2s +50%
mlk_poly_compress_d5_c 3s 3s +0%
mlk_poly_decompress_d11_c 3s 2s +50%
mlk_poly_decompress_dv 3s 1s +200%
mlk_poly_frombytes_c 3s 1s +200%
mlk_poly_getnoise_eta1_4x 3s 3s +0%
mlk_poly_invntt_tomont_c 3s 5s -40%
mlk_poly_tobytes 3s 2s +50%
mlk_poly_tobytes_c 3s 2s +50%
mlk_poly_tobytes_native 3s 4s -25%
mlk_poly_tomsg 3s 2s +50%
mlk_polyvec_basemul_acc_montgomery_cached 3s 3s +0%
mlk_polyvec_compress_du 3s 3s +0%
mlk_polyvec_frombytes 3s 2s +50%
mlk_polyvec_invntt_tomont 3s 3s +0%
mlk_polyvec_permute_bitrev_to_custom 3s 3s +0%
mlk_polyvec_tobytes 3s 1s +200%
mlk_scalar_compress_d5 3s 3s +0%
mlk_scalar_decompress_d4 3s 2s +50%
mlk_shake128x4_squeezeblocks 3s 2s +50%
poly_compress_d11_native_x86_64 3s 4s -25%
poly_decompress_d11_native_x86_64 3s 4s -25%
poly_mulcache_compute_native_x86_64 3s 4s -25%
poly_reduce_native_aarch64 3s 1s +200%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 1s +200%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 3s 2s +50%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 3s -33%
kem_keypair_derand 2s 5s -60%
mlk_barrett_reduce 2s 3s -33%
mlk_ct_cmask_nonzero_u16 2s 2s +0%
mlk_ct_cmov_zero 2s 4s -50%
mlk_ct_get_optblocker_i32 2s 3s -33%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_sel_uint8 2s 2s +0%
mlk_keccakf1600_extract_bytes 2s 1s +100%
mlk_keccakf1600_extract_bytes (big endian) 2s 1s +100%
mlk_keccakf1600x4_xor_bytes 2s 3s -33%
mlk_matvec_mul 2s 1s +100%
mlk_poly_cbd_eta1 2s 2s +0%
mlk_poly_compress_d10 2s 4s -50%
mlk_poly_compress_d10_c 2s 3s -33%
mlk_poly_compress_d10_native 2s 3s -33%
mlk_poly_compress_d11 2s 1s +100%
mlk_poly_compress_d11_native 2s 3s -33%
mlk_poly_compress_du 2s 3s -33%
mlk_poly_decompress_d10 2s 5s -60%
mlk_poly_decompress_d10_c 2s 2s +0%
mlk_poly_decompress_d10_native 2s 2s +0%
mlk_poly_decompress_d11 2s 3s -33%
mlk_poly_decompress_d4_native 2s 1s +100%
mlk_poly_decompress_d5 2s 2s +0%
mlk_poly_decompress_du 2s 4s -50%
mlk_poly_invntt_tomont 2s 3s -33%
mlk_poly_mulcache_compute 2s 3s -33%
mlk_poly_ntt 2s 3s -33%
mlk_poly_tomont 2s 3s -33%
mlk_poly_tomont_native 2s 2s +0%
mlk_polyvec_decompress_du 2s 3s -33%
mlk_polyvec_reduce 2s 4s -50%
mlk_polyvec_tomont 2s 2s +0%
mlk_rej_uniform 2s 1s +100%
mlk_scalar_compress_d1 2s 3s -33%
mlk_scalar_compress_d10 2s 3s -33%
mlk_scalar_compress_d11 2s 1s +100%
mlk_scalar_decompress_d10 2s 3s -33%
mlk_scalar_signed_to_unsigned_q 2s 2s +0%
mlk_sha3_512 2s 1s +100%
mlk_shake128_absorb_once 2s 2s +0%
mlk_shake128_squeezeblocks 2s 1s +100%
mlk_shake256 2s 2s +0%
mlk_value_barrier_i32 2s 3s -33%
mlk_value_barrier_u8 2s 2s +0%
ntt_native_x86_64 2s 3s -33%
nttunpack_native_x86_64 2s 3s -33%
poly_compress_d10_native_x86_64 2s 2s +0%
poly_compress_d4_native_x86_64 2s 1s +100%
poly_decompress_d10_native_x86_64 2s 3s -33%
poly_invntt_tomont_native 2s 4s -50%
poly_reduce_native_x86_64 2s 1s +100%
poly_tobytes_native_x86_64 2s 2s +0%
poly_tomont_native_aarch64 2s 1s +100%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 2s 4s -50%
rej_uniform_native 2s 3s -33%
sys_check_capability 2s 2s +0%
mlk_enc_derand_u - - -
mlk_enc_v - - -
mlk_indcpa_enc - 275s -
mlk_indcpa_enc_u - - -
mlk_indcpa_enc_v - - -
keccak_f1600_x1_native_aarch64_v84a 1s 1s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 3s -67%
mlk_keccakf1600_xor_bytes 1s 4s -75%
mlk_keccakf1600x4_permute 1s 1s +0%
mlk_keypair_getnoise 1s 1s +0%
mlk_montgomery_reduce 1s 4s -75%
mlk_poly_compress_d4_c 1s 2s -50%
mlk_poly_compress_d4_native 1s 1s +0%
mlk_poly_compress_d5 1s 4s -75%
mlk_poly_compress_d5_native 1s 3s -67%
mlk_poly_compress_dv 1s 3s -67%
mlk_poly_decompress_d4 1s 3s -67%
mlk_poly_decompress_d5_c 1s 2s -50%
mlk_poly_getnoise_eta1122_4x 1s 2s -50%
mlk_poly_mulcache_compute_native 1s 2s -50%
mlk_poly_ntt_c 1s 3s -67%
mlk_poly_reduce 1s 2s -50%
mlk_poly_reduce_c 1s 1s +0%
mlk_scalar_compress_d4 1s 1s +0%
mlk_shake128x4_absorb_once 1s 3s -67%
mlk_value_barrier_u32 1s 2s -50%
ntt_native_aarch64 1s 3s -67%
poly_getnoise_eta1122_4x_native 1s 2s -50%
poly_mulcache_compute_native_aarch64 1s 2s -50%
poly_tobytes_native_aarch64 1s 4s -75%
poly_tomont_native_x86_64 1s 1s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 1s 2s -50%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 1s 4s -75%
rej_uniform_native_x86_64 1s 3s -67%

@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Mar 13, 2026
Contributor

@hanno-becker hanno-becker left a comment

What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.

If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

@mkannwischer
Contributor Author

mkannwischer commented Mar 13, 2026

> What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.
>
> If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

The main question here is whether we want to add the new API to mlkem_native.h. If we don't, we can't test the API in the standard test_mlkem.c, but we could test it in a separate test that includes kem.h instead of mlkem_native.h.
The purpose of 0a01cc4 was to get something to work first, so we can discuss how we want to proceed.

I agree with you that we don't want to keep it as is right now.

@hanno-becker
Contributor

Seeing that you also observed a slowdown on x86, I wonder if we should treat the incremental API as internal by default and only expose it in the public API if some new option MLK_CONFIG_ENABLE_MLKEM_BRAID is set?

@hanno-becker hanno-becker force-pushed the incremental-enc-api branch 3 times, most recently from 30af7b8 to 22448d5 Compare March 16, 2026 12:10
@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Mar 16, 2026
mkannwischer and others added 12 commits March 16, 2026 14:53
Split K-PKE.Encrypt and ML-KEM.Encaps into two phases (u and v) to
support protocols like MLKEMBraid that transmit large KEM components
in parallel over bandwidth-constrained channels.

CPA level (indcpa):
- mlk_indcpa_enc_u: computes ct_u from ek_seed, outputs intermediate
  state (sp, epp)
- mlk_indcpa_enc_v: computes ct_v from ek_vector using intermediate
  state from enc_u

CCA KEM level (kem):
- mlk_kem_enc_derand_u: FO transform + enc_u, outputs shared secret
  and intermediate state; only needs ek_seed and H(pk)
- mlk_kem_enc_v: modulus check on ek_vector + enc_v; only needs
  ek_vector

The test verifies that the incremental API produces the same
ciphertexts and shared secrets as the standard API across all three
parameter sets.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
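
The two-phase split described in the commit message above can be illustrated with a toy model. This is only a sketch: the `toy_*` names, the state layout, and the scalar arithmetic modulo 3329 are hypothetical stand-ins for the polynomial-vector operations, and the real `mlk_indcpa_enc_u` / `mlk_indcpa_enc_v` signatures are not shown in this conversation. In particular, carrying the encoded message in the intermediate state is an assumption made to keep the toy self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_Q 3329 /* ML-KEM modulus; everything else here is a toy */

/* Hypothetical intermediate state handed from phase 1 to phase 2. */
typedef struct
{
  int32_t r;  /* ephemeral secret (plays the role of sp) */
  int32_t e2; /* second noise term (plays the role of epp) */
  int32_t m;  /* encoded message; carrying it here is an assumption */
} toy_state;

/* Reduce to the range [0, TOY_Q). */
int32_t toy_mod(int64_t x)
{
  int64_t r = x % TOY_Q;
  return (int32_t)(r < 0 ? r + TOY_Q : r);
}

/* Toy stand-in for expanding the public matrix from the seed. */
int32_t toy_expand(uint32_t seed)
{
  return (int32_t)((seed * 2654435761u) % TOY_Q);
}

/* Phase 1: needs only the seed. Computes u = a*r + e1 and saves state. */
int32_t toy_enc_u(uint32_t seed, int32_t r, int32_t e1, int32_t e2, int32_t m,
                  toy_state *st)
{
  st->r = r;
  st->e2 = e2;
  st->m = m;
  return toy_mod((int64_t)toy_expand(seed) * r + e1);
}

/* Phase 2: needs only the public vector t and the saved state.
 * Computes v = t*r + e2 + m. */
int32_t toy_enc_v(int32_t t, const toy_state *st)
{
  return toy_mod((int64_t)t * st->r + st->e2 + st->m);
}

/* One-shot reference, written independently of the two phases. */
void toy_enc(uint32_t seed, int32_t t, int32_t r, int32_t e1, int32_t e2,
             int32_t m, int32_t *u, int32_t *v)
{
  *u = toy_mod((int64_t)toy_expand(seed) * r + e1);
  *v = toy_mod((int64_t)t * r + e2 + m);
}
```

Calling `toy_enc_u` followed by `toy_enc_v` with the same inputs yields the same `(u, v)` pair as `toy_enc`. This is the equivalence the PR's test checks for the real API.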
Use mlk_kem_enc_derand_u + mlk_kem_enc_v as the single implementation
for both the standard and incremental encapsulation API. Serialize the
intermediate state (sp, epp) via 16-bit little-endian encoding into
separate buffers sp_serial[MLKEM_POLYVEC16_BYTES] and
epp_serial[MLKEM_POLY16_BYTES].

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
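
For reference, a 16-bit little-endian encoding of one polynomial could look roughly like the sketch below. The `demo_*` names are hypothetical and this is not the code from the commit; it only illustrates why 256 coefficients take 512 bytes in this format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_N 256
#define DEMO_POLY16_BYTES (2 * DEMO_N) /* 512 bytes per polynomial */

/* Store each signed coefficient as two bytes, low byte first. */
void demo_poly_to_le16(uint8_t out[DEMO_POLY16_BYTES],
                       const int16_t coeffs[DEMO_N])
{
  size_t i;
  for (i = 0; i < DEMO_N; i++)
  {
    uint16_t w = (uint16_t)coeffs[i]; /* value modulo 2^16 */
    out[2 * i] = (uint8_t)(w & 0xFF);
    out[2 * i + 1] = (uint8_t)(w >> 8);
  }
}

/* Inverse of demo_poly_to_le16. Any 16-bit pattern is accepted, so the
 * result carries no bound tighter than the int16_t range. */
void demo_poly_from_le16(int16_t coeffs[DEMO_N],
                         const uint8_t in[DEMO_POLY16_BYTES])
{
  size_t i;
  for (i = 0; i < DEMO_N; i++)
  {
    uint16_t w = (uint16_t)(in[2 * i] | (in[2 * i + 1] << 8));
    /* Portable conversion back to a signed value. */
    coeffs[i] = (w < 0x8000) ? (int16_t)w : (int16_t)((int32_t)w - 0x10000);
  }
}
```

Because every byte pattern decodes to some coefficient, a consumer of this format has to state its own precondition on coefficient sizes.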
Add CBMC contracts for mlk_indcpa_enc_u and mlk_indcpa_enc_v, including
an epp coefficient bound postcondition on enc_u (array_abs_bound ETA2+1)
and a matching precondition on enc_v (array_abs_bound 16).

Serialize epp as 4-bit nibbles (ETA2 - x) in 128 bytes instead of
16-bit LE (512 bytes), providing a natural coefficient bound on
deserialization. Revert mlk_kem_enc_derand to call mlk_indcpa_enc
directly, avoiding unnecessary serialization overhead.

Add CBMC proofs for indcpa_enc_u, indcpa_enc_v, kem_enc_derand_u,
and kem_enc_v. Update the indcpa_enc proof to compose enc_u and enc_v.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
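
A minimal sketch of the nibble encoding described above, assuming ETA2 = 2 and low-nibble-first packing (the packing order and the `demo_*` names are assumptions, not taken from the PR). It shows why 256 coefficients fit in 128 bytes and why deserializing an arbitrary buffer always gives coefficients with absolute value below 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_N 256
#define DEMO_ETA2 2
#define DEMO_EPP_BYTES (DEMO_N / 2) /* two 4-bit values per byte: 128 */

/* Pack coefficients x with |x| <= ETA2 as nibbles holding ETA2 - x,
 * which lies in [0, 4]. */
void demo_epp_pack(uint8_t out[DEMO_EPP_BYTES], const int16_t c[DEMO_N])
{
  size_t i;
  for (i = 0; i < DEMO_N / 2; i++)
  {
    uint8_t lo, hi;
    assert(c[2 * i] >= -DEMO_ETA2 && c[2 * i] <= DEMO_ETA2);
    assert(c[2 * i + 1] >= -DEMO_ETA2 && c[2 * i + 1] <= DEMO_ETA2);
    lo = (uint8_t)(DEMO_ETA2 - c[2 * i]);
    hi = (uint8_t)(DEMO_ETA2 - c[2 * i + 1]);
    out[i] = (uint8_t)(lo | (hi << 4));
  }
}

/* Unpack. A nibble is in [0, 15], so ETA2 - nibble is in [-13, 2]:
 * the |x| < 16 bound holds for every possible input byte. */
void demo_epp_unpack(int16_t c[DEMO_N], const uint8_t in[DEMO_EPP_BYTES])
{
  size_t i;
  for (i = 0; i < DEMO_N / 2; i++)
  {
    c[2 * i] = (int16_t)(DEMO_ETA2 - (in[i] & 0x0F));
    c[2 * i + 1] = (int16_t)(DEMO_ETA2 - (in[i] >> 4));
  }
}
```

With this format the bound needed by the second phase follows from the encoding itself, so the caller does not have to promise anything about the buffer contents.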
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Change mlk_kem_enc_derand_u and mlk_kem_enc_v from MLK_INTERNAL_API
to MLK_EXTERNAL_API so they are not static in monolithic builds.
Add -Wno-unused-function to the monolithic_build_multilevel_native
example (matching mldsa-native) since those examples don't exercise
the incremental API.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
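
As a rough illustration of the linkage issue, the sketch below uses hypothetical `DEMO_*` macros. The real `MLK_INTERNAL_API` / `MLK_EXTERNAL_API` definitions are not shown in this conversation and may differ.

```c
#include <assert.h>

/* Assumption for this sketch: a single-translation-unit build is
 * selected by a macro, and internal functions become static in it. */
#define DEMO_MONOLITHIC_BUILD

#if defined(DEMO_MONOLITHIC_BUILD)
#define DEMO_INTERNAL_API static
#else
#define DEMO_INTERNAL_API
#endif
#define DEMO_EXTERNAL_API

/* Internal helper: static in the monolithic build. If nothing in the
 * translation unit called it, compilers would report it under
 * -Wunused-function. */
DEMO_INTERNAL_API int demo_helper(int x) { return x + 1; }

/* External entry point: keeps external linkage, so it stays callable
 * from other translation units and is not reported as unused. */
DEMO_EXTERNAL_API int demo_entry_point(int x) { return demo_helper(x) * 2; }
```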
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Mar 16, 2026
Contributor

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28980 cycles | 28131 cycles | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Graviton4 (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 41585 cycles | 40122 cycles | 1.04 |

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Graviton3 (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 0a0a167 | Previous: 712709d | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 47075 cycles | 44596 cycles | 1.06 |

This comment was automatically generated by workflow using github-action-benchmark.
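
The Ratio column in the three alerts above is the current cycle count divided by the previous one. A small sketch of that check follows, assuming the alert fires when the unrounded ratio is strictly greater than the threshold (the exact comparison used by github-action-benchmark is not shown here, and the `demo_*` names are hypothetical).

```c
#include <assert.h>

#define DEMO_ALERT_THRESHOLD 1.03 /* threshold quoted in the alerts */

/* Ratio of current to previous cycle count; above 1 means slower. */
double demo_ratio(unsigned long current_cycles, unsigned long previous_cycles)
{
  return (double)current_cycles / (double)previous_cycles;
}

/* Assumed rule: flag a regression when the ratio exceeds the threshold. */
int demo_is_regression(unsigned long current_cycles,
                       unsigned long previous_cycles)
{
  return demo_ratio(current_cycles, previous_cycles) > DEMO_ALERT_THRESHOLD;
}
```

Under this assumption the c7i keypair result is flagged even though its ratio is displayed as 1.03, because 28980 / 28131 is about 1.0302 before rounding.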

Labels

benchmark this PR should be benchmarked in CI

3 participants