Add HOL Light pointwise-acc multiplication proofs for AArch64 and x86_64 by jakemas · Pull Request #1010 · pq-code-package/mldsa-native

jakemas · 2026-04-01T21:23:26Z

Resolves:

Summary

Port ML-DSA pointwise multiplication-accumulation (l=4,5,7) HOL Light proofs from s2n-bignum for both AArch64 (NEON) and x86_64 (AVX2). Source s2n-bignum PR: "ML-DSA x86 AVX2 and Aarch64 pointwise_acc_l7 HOL-Light proofs" (awslabs/s2n-bignum#373)
Includes constant-time and memory safety proofs for both architectures
Adds CBMC contracts for all pointwise_acc functions across all header files

Dependencies

Depends on #1006 ("Add HOL Light pointwise multiplication proofs for AArch64 and x86_64") being merged first

oqs-bot · 2026-04-01T22:46:16Z

CBMC Results (ML-DSA-44)

Full Results (181 proofs)

Proof	Status	Current	Previous	Change
`TOTAL`	✅	2167s	2020s	+7.3%
`mld_attempt_signature_generation`	✅	266s	251s	+6%
`polyvecl_pointwise_acc_montgomery_c`	✅	238s	205s	+16%
`sign_verify_internal`	✅	185s	179s	+3%
`poly_pointwise_montgomery_c`	✅	173s	150s	+15%
`rej_uniform_native`	✅	151s	144s	+5%
`mld_invntt_layer`	✅	94s	87s	+8%
`mld_ct_memcmp`	✅	83s	77s	+8%
`mld_ntt_layer`	✅	56s	54s	+4%
`fqmul`	✅	23s	18s	+28%
`poly_chknorm_c`	✅	21s	21s	+0%
`rej_uniform`	✅	21s	21s	+0%
`sign_signature_internal`	✅	20s	19s	+5%
`polyvec_matrix_expand`	✅	19s	20s	-5%
`poly_uniform_eta_4x`	✅	18s	20s	-10%
`polyeta_unpack`	✅	18s	16s	+12%
`polymat_permute_bitrev_to_custom`	✅	17s	14s	+21%
`rej_uniform_c`	✅	17s	14s	+21%
`mld_compute_t0_t1_tr_from_sk_components`	✅	16s	12s	+33%
`poly_uniform_4x`	✅	15s	16s	-6%
`polyt0_unpack`	✅	15s	14s	+7%
`polyvec_matrix_expand_serial`	✅	14s	11s	+27%
`polyz_unpack_c`	✅	14s	11s	+27%
`mld_ntt_butterfly_block`	✅	13s	12s	+8%
`keccakf1600x4_permute_native`	✅	12s	14s	-14%
`poly_add`	✅	12s	14s	-14%
`polyvec_matrix_pointwise_montgomery`	✅	12s	15s	-20%
`polyveck_power2round`	✅	12s	10s	+20%
`keccak_absorb_once_x4`	✅	9s	11s	-18%
`keccak_squeezeblocks_x4`	✅	9s	6s	+50%
`keccakf1600_permute`	✅	9s	8s	+12%
`keccakf1600_permute_native`	✅	9s	9s	+0%
`sign`	✅	9s	5s	+80%
`sign_pk_from_sk`	✅	9s	7s	+29%
`poly_invntt_tomont_c`	✅	8s	7s	+14%
`keccak_absorb`	✅	7s	7s	+0%
`mld_check_pct`	✅	7s	8s	-12%
`mld_prepare_domain_separation_prefix`	✅	7s	7s	+0%
`poly_uniform_gamma1`	✅	7s	3s	+133%
`polyveck_ntt`	✅	7s	5s	+40%
`rej_eta_native`	✅	7s	6s	+17%
`unpack_sk`	✅	7s	6s	+17%
`mld_polyvecl_permute_bitrev_to_custom_native`	✅	6s	8s	-25%
`mld_sample_s1_s2`	✅	6s	4s	+50%
`poly_power2round`	✅	6s	5s	+20%
`polyt1_unpack`	✅	6s	1s	+500%
`polyveck_add`	✅	6s	8s	-25%
`polyveck_sub`	✅	6s	5s	+20%
`polyveck_use_hint`	✅	6s	6s	+0%
`polyvecl_unpack_z`	✅	6s	4s	+50%
`sign_keypair`	✅	6s	3s	+100%
`sign_open`	✅	6s	4s	+50%
`sign_signature`	✅	6s	4s	+50%
`unpack_hints`	✅	6s	5s	+20%
`intt_native_x86_64`	✅	5s	2s	+150%
`mld_compute_pack_z`	✅	5s	4s	+25%
`mld_ct_cmask_nonzero_u32`	✅	5s	2s	+150%
`mld_ct_cmask_nonzero_u8`	✅	5s	5s	+0%
`pack_sk`	✅	5s	2s	+150%
`poly_caddq`	✅	5s	4s	+25%
`poly_caddq_c`	✅	5s	5s	+0%
`poly_challenge`	✅	5s	4s	+25%
`poly_ntt`	✅	5s	4s	+25%
`poly_ntt_c`	✅	5s	2s	+150%
`poly_sub`	✅	5s	4s	+25%
`poly_uniform`	✅	5s	4s	+25%
`poly_uniform_eta`	✅	5s	6s	-17%
`poly_use_hint`	✅	5s	3s	+67%
`polyveck_caddq`	✅	5s	4s	+25%
`polyveck_chknorm`	✅	5s	6s	-17%
`polyveck_decompose`	✅	5s	6s	-17%
`polyveck_pointwise_poly_montgomery`	✅	5s	4s	+25%
`polyveck_shiftl`	✅	5s	7s	-29%
`polyveck_unpack_t0`	✅	5s	4s	+25%
`polyvecl_ntt`	✅	5s	4s	+25%
`polyvecl_permute_bitrev_to_custom`	✅	5s	3s	+67%
`polyz_unpack`	✅	5s	2s	+150%
`rej_eta`	✅	5s	3s	+67%
`shake256_squeeze`	✅	5s	1s	+400%
`sign_signature_extmu`	✅	5s	2s	+150%
`keccak_finalize`	✅	4s	2s	+100%
`keccakf1600_xor_bytes (big endian)`	✅	4s	2s	+100%
`keccakf1600x4_extract_bytes`	✅	4s	3s	+33%
`mld_h`	✅	4s	3s	+33%
`mld_sample_s1_s2_serial`	✅	4s	4s	+0%
`ntt_native_aarch64`	✅	4s	2s	+100%
`poly_chknorm_native_aarch64`	✅	4s	3s	+33%
`poly_decompose_c`	✅	4s	4s	+0%
`poly_pointwise_montgomery`	✅	4s	1s	+300%
`poly_use_hint_c`	✅	4s	6s	-33%
`poly_use_hint_native`	✅	4s	3s	+33%
`polyeta_pack`	✅	4s	3s	+33%
`polyveck_make_hint`	✅	4s	6s	-33%
`polyveck_pack_t0`	✅	4s	4s	+0%
`polyveck_pointwise_poly_montgomery_s2`	✅	4s	5s	-20%
`polyveck_pointwise_poly_montgomery_t0`	✅	4s	5s	-20%
`polyvecl_chknorm`	✅	4s	5s	-20%
`polyvecl_uniform_gamma1_serial`	✅	4s	4s	+0%
`polyz_pack`	✅	4s	4s	+0%
`rej_eta_c`	✅	4s	6s	-33%
`shake256`	✅	4s	2s	+100%
`shake256_absorb`	✅	4s	2s	+100%
`sign_keypair_internal`	✅	4s	5s	-20%
`sign_signature_pre_hash_shake256`	✅	4s	4s	+0%
`sign_verify_pre_hash_shake256`	✅	4s	4s	+0%
`sys_check_capability`	✅	4s	4s	+0%
`unpack_pk`	✅	4s	5s	-20%
`unpack_sig`	✅	4s	4s	+0%
`use_hint`	✅	4s	1s	+300%
`caddq`	✅	3s	4s	-25%
`fqscale`	✅	3s	4s	-25%
`keccak_init`	✅	3s	2s	+50%
`make_hint`	✅	3s	4s	-25%
`mld_ct_get_optblocker_u32`	✅	3s	2s	+50%
`mld_ct_sel_int32`	✅	3s	3s	+0%
`mld_value_barrier_u32`	✅	3s	5s	-40%
`pack_pk`	✅	3s	6s	-50%
`pack_sig_c_h`	✅	3s	4s	-25%
`pack_sig_z`	✅	3s	4s	-25%
`pointwise_native_aarch64`	✅	3s	-	new
`poly_caddq_native_aarch64`	✅	3s	4s	-25%
`poly_chknorm_native`	✅	3s	4s	-25%
`poly_decompose`	✅	3s	2s	+50%
`poly_decompose_native`	✅	3s	5s	-40%
`poly_invntt_tomont_native`	✅	3s	2s	+50%
`poly_ntt_native`	✅	3s	4s	-25%
`poly_pointwise_montgomery_native`	✅	3s	4s	-25%
`poly_shiftl`	✅	3s	5s	-40%
`polyt0_pack`	✅	3s	3s	+0%
`polyveck_invntt_tomont`	✅	3s	3s	+0%
`polyveck_pack_eta`	✅	3s	3s	+0%
`polyveck_reduce`	✅	3s	6s	-50%
`polyveck_unpack_eta`	✅	3s	2s	+50%
`polyvecl_pack_eta`	✅	3s	6s	-50%
`polyvecl_pointwise_acc_montgomery`	✅	3s	4s	-25%
`polyvecl_pointwise_acc_montgomery_native`	✅	3s	3s	+0%
`polyvecl_uniform_gamma1`	✅	3s	4s	-25%
`polyvecl_unpack_eta`	✅	3s	4s	-25%
`polyw1_pack`	✅	3s	3s	+0%
`shake128_absorb`	✅	3s	6s	-50%
`shake128_finalize`	✅	3s	2s	+50%
`shake128_init`	✅	3s	2s	+50%
`shake128_release`	✅	3s	2s	+50%
`shake256_finalize`	✅	3s	2s	+50%
`shake256_init`	✅	3s	2s	+50%
`shake256_release`	✅	3s	2s	+50%
`shake256x4_squeezeblocks`	✅	3s	4s	-25%
`sign_signature_pre_hash_internal`	✅	3s	3s	+0%
`sign_verify`	✅	3s	7s	-57%
`sign_verify_extmu`	✅	3s	4s	-25%
`sign_verify_pre_hash_internal`	✅	3s	6s	-50%
`keccak_squeeze`	✅	2s	1s	+100%
`keccakf1600_extract_bytes (big endian)`	✅	2s	1s	+100%
`keccakf1600_xor_bytes`	✅	2s	4s	-50%
`keccakf1600x4_xor_bytes`	✅	2s	1s	+100%
`mld_ct_abs_i32`	✅	2s	3s	-33%
`mld_ct_cmask_neg_i32`	✅	2s	2s	+0%
`mld_ct_get_optblocker_i64`	✅	2s	3s	-33%
`mld_keccakf1600_extract_bytes`	✅	2s	2s	+0%
`mld_value_barrier_i64`	✅	2s	1s	+100%
`montgomery_reduce`	✅	2s	2s	+0%
`ntt_native_x86_64`	✅	2s	2s	+0%
`pointwise_native_x86_64`	✅	2s	-	new
`poly_caddq_native`	✅	2s	5s	-60%
`poly_chknorm`	✅	2s	4s	-50%
`poly_invntt_tomont`	✅	2s	4s	-50%
`poly_make_hint`	✅	2s	3s	-33%
`poly_reduce`	✅	2s	2s	+0%
`poly_uniform_gamma1_4x`	✅	2s	6s	-67%
`polyt1_pack`	✅	2s	1s	+100%
`polyveck_pack_w1`	✅	2s	2s	+0%
`polyz_unpack_native`	✅	2s	3s	-33%
`power2round`	✅	2s	3s	-33%
`reduce32`	✅	2s	2s	+0%
`shake128x4_absorb_once`	✅	2s	2s	+0%
`shake128x4_squeezeblocks`	✅	2s	1s	+100%
`shake256x4_absorb_once`	✅	2s	2s	+0%
`decompose`	✅	1s	4s	-75%
`keccakf1600x4_permute`	✅	1s	2s	-50%
`mld_ct_get_optblocker_u8`	✅	1s	3s	-67%
`mld_value_barrier_u8`	✅	1s	3s	-67%
`shake128_squeeze`	✅	1s	1s	+0%

oqs-bot · 2026-04-01T22:47:47Z

CBMC Results (ML-DSA-65)

Full Results (181 proofs)

Proof	Status	Current	Previous	Change
`TOTAL`	✅	2421s	2543s	-4.8%
`sign_verify_internal`	✅	331s	350s	-5%
`mld_attempt_signature_generation`	✅	281s	295s	-5%
`polyvecl_pointwise_acc_montgomery_c`	✅	188s	201s	-6%
`poly_pointwise_montgomery_c`	✅	153s	165s	-7%
`rej_uniform_native`	✅	144s	151s	-5%
`polyvec_matrix_expand`	✅	123s	127s	-3%
`mld_invntt_layer`	✅	96s	99s	-3%
`mld_ct_memcmp`	✅	76s	82s	-7%
`polyvec_matrix_expand_serial`	✅	66s	69s	-4%
`mld_ntt_layer`	✅	55s	58s	-5%
`polymat_permute_bitrev_to_custom`	✅	31s	32s	-3%
`sign_signature_internal`	✅	26s	25s	+4%
`mld_compute_t0_t1_tr_from_sk_components`	✅	25s	25s	+0%
`rej_uniform`	✅	23s	23s	+0%
`poly_chknorm_c`	✅	21s	20s	+5%
`fqmul`	✅	18s	20s	-10%
`poly_uniform_eta_4x`	✅	15s	16s	-6%
`polyveck_decompose`	✅	15s	16s	-6%
`rej_uniform_c`	✅	15s	17s	-12%
`poly_uniform_4x`	✅	14s	16s	-12%
`polyt0_unpack`	✅	14s	16s	-12%
`polyvec_matrix_pointwise_montgomery`	✅	13s	14s	-7%
`polyvecl_chknorm`	✅	13s	13s	+0%
`keccakf1600x4_permute_native`	✅	12s	14s	-14%
`polyveck_sub`	✅	12s	12s	+0%
`mld_ntt_butterfly_block`	✅	11s	14s	-21%
`poly_add`	✅	11s	12s	-8%
`polyveck_power2round`	✅	11s	10s	+10%
`keccak_absorb`	✅	10s	6s	+67%
`keccakf1600_permute_native`	✅	10s	8s	+25%
`mld_check_pct`	✅	10s	9s	+11%
`polyveck_pointwise_poly_montgomery`	✅	10s	8s	+25%
`keccak_absorb_once_x4`	✅	9s	9s	+0%
`mld_polyvecl_permute_bitrev_to_custom_native`	✅	9s	9s	+0%
`polyveck_add`	✅	9s	9s	+0%
`polyveck_ntt`	✅	9s	10s	-10%
`keccakf1600_permute`	✅	8s	7s	+14%
`polyveck_make_hint`	✅	8s	6s	+33%
`polyveck_use_hint`	✅	8s	12s	-33%
`unpack_sk`	✅	8s	9s	-11%
`keccak_squeezeblocks_x4`	✅	7s	7s	+0%
`mld_sample_s1_s2`	✅	7s	4s	+75%
`poly_caddq_c`	✅	7s	6s	+17%
`poly_invntt_tomont_c`	✅	7s	10s	-30%
`poly_uniform_gamma1`	✅	7s	4s	+75%
`polyveck_caddq`	✅	7s	7s	+0%
`polyveck_pointwise_poly_montgomery_t0`	✅	7s	6s	+17%
`polyveck_shiftl`	✅	7s	7s	+0%
`sign`	✅	7s	7s	+0%
`mld_compute_pack_z`	✅	6s	7s	-14%
`mld_h`	✅	6s	7s	-14%
`poly_decompose_c`	✅	6s	7s	-14%
`poly_uniform`	✅	6s	7s	-14%
`polyveck_invntt_tomont`	✅	6s	8s	-25%
`polyveck_pointwise_poly_montgomery_s2`	✅	6s	7s	-14%
`polyveck_reduce`	✅	6s	8s	-25%
`polyvecl_ntt`	✅	6s	9s	-33%
`sign_pk_from_sk`	✅	6s	8s	-25%
`intt_native_x86_64`	✅	5s	6s	-17%
`keccakf1600_extract_bytes (big endian)`	✅	5s	3s	+67%
`mld_sample_s1_s2_serial`	✅	5s	5s	+0%
`pack_sig_c_h`	✅	5s	4s	+25%
`pointwise_native_x86_64`	✅	5s	-	new
`poly_challenge`	✅	5s	5s	+0%
`poly_pointwise_montgomery`	✅	5s	2s	+150%
`poly_power2round`	✅	5s	5s	+0%
`poly_uniform_eta`	✅	5s	5s	+0%
`poly_uniform_gamma1_4x`	✅	5s	4s	+25%
`polyeta_pack`	✅	5s	3s	+67%
`polyeta_unpack`	✅	5s	4s	+25%
`polyt0_pack`	✅	5s	4s	+25%
`polyveck_unpack_t0`	✅	5s	4s	+25%
`polyvecl_pointwise_acc_montgomery_native`	✅	5s	7s	-29%
`rej_eta_native`	✅	5s	6s	-17%
`shake256_init`	✅	5s	4s	+25%
`sign_keypair_internal`	✅	5s	5s	+0%
`sign_open`	✅	5s	5s	+0%
`sign_verify_pre_hash_internal`	✅	5s	6s	-17%
`unpack_hints`	✅	5s	4s	+25%
`fqscale`	✅	4s	3s	+33%
`keccak_finalize`	✅	4s	3s	+33%
`keccak_init`	✅	4s	2s	+100%
`keccakf1600x4_permute`	✅	4s	2s	+100%
`mld_ct_abs_i32`	✅	4s	2s	+100%
`mld_ct_cmask_nonzero_u32`	✅	4s	3s	+33%
`mld_value_barrier_i64`	✅	4s	1s	+300%
`mld_value_barrier_u32`	✅	4s	4s	+0%
`pack_sk`	✅	4s	3s	+33%
`poly_make_hint`	✅	4s	3s	+33%
`poly_ntt_c`	✅	4s	3s	+33%
`poly_shiftl`	✅	4s	4s	+0%
`poly_use_hint_c`	✅	4s	4s	+0%
`polyt1_pack`	✅	4s	4s	+0%
`polyt1_unpack`	✅	4s	5s	-20%
`polyveck_chknorm`	✅	4s	4s	+0%
`polyvecl_pointwise_acc_montgomery`	✅	4s	4s	+0%
`polyz_unpack_c`	✅	4s	3s	+33%
`rej_eta`	✅	4s	2s	+100%
`sign_verify`	✅	4s	5s	-20%
`decompose`	✅	3s	2s	+50%
`keccakf1600_xor_bytes (big endian)`	✅	3s	3s	+0%
`keccakf1600x4_xor_bytes`	✅	3s	2s	+50%
`make_hint`	✅	3s	2s	+50%
`mld_ct_get_optblocker_i64`	✅	3s	1s	+200%
`mld_ct_get_optblocker_u32`	✅	3s	2s	+50%
`mld_keccakf1600_extract_bytes`	✅	3s	3s	+0%
`mld_value_barrier_u8`	✅	3s	2s	+50%
`pack_pk`	✅	3s	4s	-25%
`pack_sig_z`	✅	3s	4s	-25%
`poly_caddq`	✅	3s	5s	-40%
`poly_caddq_native`	✅	3s	2s	+50%
`poly_caddq_native_aarch64`	✅	3s	4s	-25%
`poly_chknorm_native`	✅	3s	5s	-40%
`poly_decompose`	✅	3s	4s	-25%
`poly_decompose_native`	✅	3s	4s	-25%
`poly_invntt_tomont`	✅	3s	5s	-40%
`poly_reduce`	✅	3s	2s	+50%
`poly_sub`	✅	3s	2s	+50%
`polyveck_pack_eta`	✅	3s	2s	+50%
`polyveck_pack_t0`	✅	3s	3s	+0%
`polyveck_unpack_eta`	✅	3s	3s	+0%
`polyvecl_pack_eta`	✅	3s	4s	-25%
`polyvecl_uniform_gamma1`	✅	3s	2s	+50%
`polyvecl_unpack_z`	✅	3s	3s	+0%
`polyw1_pack`	✅	3s	3s	+0%
`polyz_pack`	✅	3s	3s	+0%
`rej_eta_c`	✅	3s	3s	+0%
`shake128_finalize`	✅	3s	2s	+50%
`shake128_init`	✅	3s	3s	+0%
`shake128x4_absorb_once`	✅	3s	2s	+50%
`shake128x4_squeezeblocks`	✅	3s	3s	+0%
`shake256`	✅	3s	3s	+0%
`shake256_finalize`	✅	3s	2s	+50%
`sign_keypair`	✅	3s	4s	-25%
`sign_signature`	✅	3s	5s	-40%
`sign_signature_extmu`	✅	3s	5s	-40%
`sign_signature_pre_hash_internal`	✅	3s	5s	-40%
`sign_signature_pre_hash_shake256`	✅	3s	5s	-40%
`sign_verify_extmu`	✅	3s	3s	+0%
`sign_verify_pre_hash_shake256`	✅	3s	3s	+0%
`unpack_pk`	✅	3s	5s	-40%
`use_hint`	✅	3s	3s	+0%
`caddq`	✅	2s	4s	-50%
`keccak_squeeze`	✅	2s	2s	+0%
`keccakf1600_xor_bytes`	✅	2s	4s	-50%
`keccakf1600x4_extract_bytes`	✅	2s	2s	+0%
`mld_ct_cmask_nonzero_u8`	✅	2s	1s	+100%
`mld_ct_get_optblocker_u8`	✅	2s	2s	+0%
`mld_prepare_domain_separation_prefix`	✅	2s	4s	-50%
`ntt_native_aarch64`	✅	2s	3s	-33%
`ntt_native_x86_64`	✅	2s	5s	-60%
`pointwise_native_aarch64`	✅	2s	-	new
`poly_chknorm`	✅	2s	1s	+100%
`poly_chknorm_native_aarch64`	✅	2s	3s	-33%
`poly_invntt_tomont_native`	✅	2s	3s	-33%
`poly_ntt`	✅	2s	3s	-33%
`poly_ntt_native`	✅	2s	4s	-50%
`poly_pointwise_montgomery_native`	✅	2s	3s	-33%
`poly_use_hint`	✅	2s	3s	-33%
`poly_use_hint_native`	✅	2s	3s	-33%
`polyvecl_permute_bitrev_to_custom`	✅	2s	3s	-33%
`polyvecl_unpack_eta`	✅	2s	4s	-50%
`polyz_unpack`	✅	2s	4s	-50%
`polyz_unpack_native`	✅	2s	3s	-33%
`power2round`	✅	2s	1s	+100%
`reduce32`	✅	2s	1s	+100%
`shake128_release`	✅	2s	2s	+0%
`shake128_squeeze`	✅	2s	3s	-33%
`shake256_squeeze`	✅	2s	3s	-33%
`shake256x4_absorb_once`	✅	2s	5s	-60%
`shake256x4_squeezeblocks`	✅	2s	2s	+0%
`unpack_sig`	✅	2s	4s	-50%
`mld_ct_cmask_neg_i32`	✅	1s	1s	+0%
`mld_ct_sel_int32`	✅	1s	2s	-50%
`montgomery_reduce`	✅	1s	4s	-75%
`polyveck_pack_w1`	✅	1s	2s	-50%
`polyvecl_uniform_gamma1_serial`	✅	1s	5s	-80%
`shake128_absorb`	✅	1s	2s	-50%
`shake256_absorb`	✅	1s	3s	-67%
`shake256_release`	✅	1s	3s	-67%
`sys_check_capability`	✅	1s	4s	-75%

oqs-bot · 2026-04-01T22:48:10Z

CBMC Results (ML-DSA-87)

Full Results (181 proofs)

Proof	Status	Current	Previous	Change
`TOTAL`	✅	2609s	2688s	-2.9%
`polyvecl_pointwise_acc_montgomery_c`	✅	290s	310s	-6%
`mld_attempt_signature_generation`	✅	234s	239s	-2%
`sign_verify_internal`	✅	213s	215s	-1%
`polyvec_matrix_expand`	✅	191s	191s	+0%
`poly_pointwise_montgomery_c`	✅	174s	174s	+0%
`rej_uniform_native`	✅	150s	153s	-2%
`mld_invntt_layer`	✅	97s	101s	-4%
`polyvec_matrix_expand_serial`	✅	81s	82s	-1%
`mld_ct_memcmp`	✅	79s	84s	-6%
`polyveck_decompose`	✅	60s	62s	-3%
`mld_ntt_layer`	✅	58s	61s	-5%
`polymat_permute_bitrev_to_custom`	✅	49s	47s	+4%
`sign_signature_internal`	✅	40s	39s	+3%
`mld_compute_t0_t1_tr_from_sk_components`	✅	24s	26s	-8%
`poly_chknorm_c`	✅	23s	22s	+5%
`rej_uniform`	✅	21s	24s	-12%
`poly_uniform_4x`	✅	19s	17s	+12%
`fqmul`	✅	18s	22s	-18%
`polyeta_unpack`	✅	18s	18s	+0%
`poly_uniform_eta_4x`	✅	16s	17s	-6%
`polyt0_unpack`	✅	16s	15s	+7%
`rej_uniform_c`	✅	16s	15s	+7%
`keccakf1600x4_permute_native`	✅	14s	14s	+0%
`poly_add`	✅	14s	14s	+0%
`polyveck_reduce`	✅	13s	10s	+30%
`mld_ntt_butterfly_block`	✅	12s	14s	-14%
`mld_polyvecl_permute_bitrev_to_custom_native`	✅	12s	12s	+0%
`polyvec_matrix_pointwise_montgomery`	✅	12s	14s	-14%
`keccak_absorb_once_x4`	✅	11s	11s	+0%
`polyveck_use_hint`	✅	11s	10s	+10%
`polyveck_power2round`	✅	10s	10s	+0%
`poly_invntt_tomont_c`	✅	9s	8s	+12%
`polyveck_add`	✅	9s	12s	-25%
`polyvecl_ntt`	✅	9s	12s	-25%
`polyz_unpack_c`	✅	9s	9s	+0%
`sign_pk_from_sk`	✅	9s	9s	+0%
`unpack_sk`	✅	9s	8s	+12%
`keccakf1600_permute_native`	✅	8s	9s	-11%
`mld_check_pct`	✅	8s	7s	+14%
`mld_compute_pack_z`	✅	8s	5s	+60%
`mld_sample_s1_s2_serial`	✅	8s	7s	+14%
`poly_decompose_c`	✅	8s	8s	+0%
`polyt0_pack`	✅	8s	2s	+300%
`polyveck_pointwise_poly_montgomery_s2`	✅	8s	7s	+14%
`polyveck_shiftl`	✅	8s	6s	+33%
`sign_keypair_internal`	✅	8s	6s	+33%
`sign_verify_pre_hash_shake256`	✅	8s	4s	+100%
`keccak_absorb`	✅	7s	9s	-22%
`keccakf1600_permute`	✅	7s	7s	+0%
`mld_sample_s1_s2`	✅	7s	6s	+17%
`poly_uniform_gamma1_4x`	✅	7s	5s	+40%
`polyveck_caddq`	✅	7s	10s	-30%
`polyveck_make_hint`	✅	7s	6s	+17%
`polyveck_pointwise_poly_montgomery`	✅	7s	7s	+0%
`polyveck_pointwise_poly_montgomery_t0`	✅	7s	6s	+17%
`sign`	✅	7s	7s	+0%
`sign_signature_pre_hash_shake256`	✅	7s	7s	+0%
`keccak_squeezeblocks_x4`	✅	6s	5s	+20%
`mld_prepare_domain_separation_prefix`	✅	6s	5s	+20%
`poly_caddq_c`	✅	6s	8s	-25%
`poly_invntt_tomont_native`	✅	6s	1s	+500%
`polyveck_ntt`	✅	6s	9s	-33%
`polyvecl_uniform_gamma1_serial`	✅	6s	4s	+50%
`polyvecl_unpack_z`	✅	6s	3s	+100%
`sign_open`	✅	6s	6s	+0%
`sign_verify`	✅	6s	5s	+20%
`sign_verify_extmu`	✅	6s	3s	+100%
`unpack_hints`	✅	6s	6s	+0%
`make_hint`	✅	5s	3s	+67%
`poly_invntt_tomont`	✅	5s	5s	+0%
`polyveck_pack_eta`	✅	5s	2s	+150%
`polyveck_sub`	✅	5s	7s	-29%
`polyveck_unpack_eta`	✅	5s	5s	+0%
`shake128_absorb`	✅	5s	4s	+25%
`shake256x4_absorb_once`	✅	5s	2s	+150%
`sign_verify_pre_hash_internal`	✅	5s	4s	+25%
`unpack_pk`	✅	5s	3s	+67%
`keccak_squeeze`	✅	4s	4s	+0%
`mld_ct_get_optblocker_u32`	✅	4s	3s	+33%
`mld_h`	✅	4s	6s	-33%
`mld_value_barrier_u32`	✅	4s	5s	-20%
`pack_sig_c_h`	✅	4s	3s	+33%
`pack_sk`	✅	4s	4s	+0%
`pointwise_native_x86_64`	✅	4s	-	new
`poly_caddq_native`	✅	4s	6s	-33%
`poly_challenge`	✅	4s	4s	+0%
`poly_chknorm_native`	✅	4s	4s	+0%
`poly_ntt`	✅	4s	5s	-20%
`poly_ntt_native`	✅	4s	2s	+100%
`poly_power2round`	✅	4s	5s	-20%
`poly_sub`	✅	4s	5s	-20%
`poly_uniform_eta`	✅	4s	4s	+0%
`polyveck_invntt_tomont`	✅	4s	6s	-33%
`polyveck_unpack_t0`	✅	4s	5s	-20%
`polyvecl_uniform_gamma1`	✅	4s	4s	+0%
`polyz_pack`	✅	4s	4s	+0%
`reduce32`	✅	4s	2s	+100%
`rej_eta_c`	✅	4s	3s	+33%
`shake128_init`	✅	4s	2s	+100%
`shake128x4_absorb_once`	✅	4s	2s	+100%
`shake256_absorb`	✅	4s	4s	+0%
`sign_signature_pre_hash_internal`	✅	4s	4s	+0%
`sys_check_capability`	✅	4s	3s	+33%
`fqscale`	✅	3s	4s	-25%
`intt_native_x86_64`	✅	3s	3s	+0%
`keccakf1600x4_extract_bytes`	✅	3s	1s	+200%
`keccakf1600x4_permute`	✅	3s	3s	+0%
`keccakf1600x4_xor_bytes`	✅	3s	5s	-40%
`mld_ct_abs_i32`	✅	3s	3s	+0%
`mld_ct_cmask_neg_i32`	✅	3s	2s	+50%
`mld_ct_cmask_nonzero_u32`	✅	3s	6s	-50%
`mld_ct_get_optblocker_i64`	✅	3s	3s	+0%
`mld_keccakf1600_extract_bytes`	✅	3s	2s	+50%
`montgomery_reduce`	✅	3s	4s	-25%
`ntt_native_aarch64`	✅	3s	4s	-25%
`ntt_native_x86_64`	✅	3s	4s	-25%
`pointwise_native_aarch64`	✅	3s	-	new
`poly_chknorm`	✅	3s	4s	-25%
`poly_decompose_native`	✅	3s	4s	-25%
`poly_make_hint`	✅	3s	5s	-40%
`poly_pointwise_montgomery`	✅	3s	2s	+50%
`poly_pointwise_montgomery_native`	✅	3s	5s	-40%
`poly_reduce`	✅	3s	4s	-25%
`poly_uniform_gamma1`	✅	3s	2s	+50%
`poly_use_hint`	✅	3s	2s	+50%
`poly_use_hint_c`	✅	3s	3s	+0%
`polyt1_pack`	✅	3s	3s	+0%
`polyveck_chknorm`	✅	3s	3s	+0%
`polyveck_pack_w1`	✅	3s	5s	-40%
`polyvecl_chknorm`	✅	3s	4s	-25%
`polyvecl_pack_eta`	✅	3s	2s	+50%
`polyvecl_pointwise_acc_montgomery`	✅	3s	4s	-25%
`polyvecl_unpack_eta`	✅	3s	7s	-57%
`polyw1_pack`	✅	3s	3s	+0%
`polyz_unpack_native`	✅	3s	3s	+0%
`power2round`	✅	3s	3s	+0%
`shake128_squeeze`	✅	3s	3s	+0%
`shake256_init`	✅	3s	3s	+0%
`shake256_squeeze`	✅	3s	4s	-25%
`shake256x4_squeezeblocks`	✅	3s	4s	-25%
`sign_keypair`	✅	3s	5s	-40%
`sign_signature`	✅	3s	7s	-57%
`sign_signature_extmu`	✅	3s	4s	-25%
`unpack_sig`	✅	3s	4s	-25%
`caddq`	✅	2s	4s	-50%
`decompose`	✅	2s	3s	-33%
`keccak_finalize`	✅	2s	4s	-50%
`keccak_init`	✅	2s	2s	+0%
`keccakf1600_extract_bytes (big endian)`	✅	2s	3s	-33%
`keccakf1600_xor_bytes (big endian)`	✅	2s	2s	+0%
`mld_ct_cmask_nonzero_u8`	✅	2s	4s	-50%
`mld_ct_get_optblocker_u8`	✅	2s	3s	-33%
`mld_ct_sel_int32`	✅	2s	1s	+100%
`mld_value_barrier_i64`	✅	2s	5s	-60%
`mld_value_barrier_u8`	✅	2s	2s	+0%
`pack_pk`	✅	2s	4s	-50%
`pack_sig_z`	✅	2s	5s	-60%
`poly_caddq`	✅	2s	5s	-60%
`poly_caddq_native_aarch64`	✅	2s	3s	-33%
`poly_chknorm_native_aarch64`	✅	2s	3s	-33%
`poly_decompose`	✅	2s	2s	+0%
`poly_ntt_c`	✅	2s	4s	-50%
`poly_shiftl`	✅	2s	5s	-60%
`poly_uniform`	✅	2s	4s	-50%
`poly_use_hint_native`	✅	2s	3s	-33%
`polyeta_pack`	✅	2s	3s	-33%
`polyt1_unpack`	✅	2s	3s	-33%
`polyveck_pack_t0`	✅	2s	2s	+0%
`polyvecl_permute_bitrev_to_custom`	✅	2s	3s	-33%
`polyvecl_pointwise_acc_montgomery_native`	✅	2s	3s	-33%
`polyz_unpack`	✅	2s	5s	-60%
`rej_eta`	✅	2s	4s	-50%
`rej_eta_native`	✅	2s	8s	-75%
`shake128_finalize`	✅	2s	3s	-33%
`shake128_release`	✅	2s	3s	-33%
`shake128x4_squeezeblocks`	✅	2s	1s	+100%
`shake256_release`	✅	2s	2s	+0%
`use_hint`	✅	2s	2s	+0%
`keccakf1600_xor_bytes`	✅	1s	1s	+0%
`shake256`	✅	1s	4s	-75%
`shake256_finalize`	✅	1s	3s	-67%

mkannwischer

Thanks @jakemas - the HOL-Light proofs look good to me. I checked the spec and everything makes sense to me.
The CBMC contracts, however, do not match the HOL-Light specs - please update.

Do you have an intuition why the proof for 44 is much slower than the one for 65? That does not seem to make sense to me. Would be good to improve this, but that could also be done in a follow-up PR.

mkannwischer · 2026-04-02T07:08:16Z

dev/aarch64_clean/src/arith_native_aarch64.h

+  requires(array_abs_bound(a, 0, 4 * MLDSA_N, 75423753))
+  requires(array_abs_bound(b, 0, 4 * MLDSA_N, 75423753))


This does not match the HOL-Light pre-conditions:

(!i. i < 1024 ==> abs(ival(x i)) <= &8380416) /\ (!i. i < 1024 ==> abs(ival(y i)) <= &75423752) /\

Thank you, fixed!
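
For readers following the bounds discussion: the HOL-Light pre-conditions are written with inclusive bounds (`<=`), while the contract lines quoted elsewhere in this thread pass `8380417` for the first input, which suggests that the bound argument of `array_abs_bound` is exclusive. The sketch below illustrates that reading only. The helper names `abs_bound_exclusive` and `exclusive_from_inclusive` are hypothetical and not part of mldsa-native, and the exclusive interpretation is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MLDSA_Q 8380417 /* ML-DSA modulus, as stated in the commit message */

/* Hypothetical stand-in for the contract predicate (assumed semantics):
 * returns 1 when every a[i] with lo <= i < hi satisfies |a[i]| < bound. */
static int abs_bound_exclusive(const int32_t *a, size_t lo, size_t hi,
                               int64_t bound)
{
  size_t i;
  for (i = lo; i < hi; i++)
  {
    int64_t v = a[i];
    if (v <= -bound || v >= bound)
    {
      return 0;
    }
  }
  return 1;
}

/* Convert an inclusive bound (|x| <= k) into the exclusive form (|x| < k + 1). */
static int64_t exclusive_from_inclusive(int64_t k) { return k + 1; }
```

Under this reading, `<= &8380416` corresponds to an exclusive bound of `8380417` (that is, Q), and `<= &75423752` corresponds to `75423753` (that is, 9Q), so only the bound on the first input needed to change.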

mkannwischer · 2026-04-02T07:12:03Z

mldsa/src/native/x86_64/src/arith_native_x86_64.h

+  requires(array_abs_bound(a, 0, 4 * MLDSA_N, 75423753))
+  requires(array_abs_bound(b, 0, 4 * MLDSA_N, 75423753))


same in the x86 backend

Thank you, fixed!

jakemas · 2026-04-02T20:07:54Z

> Do you have an intuition why the proof for 44 is much slower than the one for 65? That does not seem to make sense to me. Would be good to improve this, but that could also be done in a follow-up PR.

Good catch. The bounds phase (proving congruence + boundedness for each of the 256 output coefficients) was using a slow approach for l4 that repeatedly calls find_term to search the goal term for ival(word_mul ...) subterms and then proves each rewrite via ARITH_TAC. This is quadratic in the goal term size.

I ran into this while developing the l7 proofs, which were too slow without a fix, and optimized the approach there: all the ival_mul rewrite theorems are pre-computed into an array upfront and then indexed directly. I didn't backport that to the smaller sizes at the time. I've now applied the same optimization to the l4 proofs (both architectures) and also to the aarch64 l5/l7 proofs, which were also unoptimized. This should give a significant speedup.

Times improved 3-4x for L4:

Proof	Before	After	Speedup
aarch64 l4	2h21m	42m	3.4x
aarch64 l5	49m	47m	~same
aarch64 l7	1h29m	1h29m	~same
x86_64 l4	2h23m	35m	4.1x
x86_64 l5	47m	49m	~same
x86_64 l7	1h32m	~1h31m	~same
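
The difference between the two approaches can be illustrated outside HOL Light. The following is an analogy in C, not the proof script: `lookups_by_search` scans the whole collection for every output (the repeated `find_term` pattern), while `lookups_by_index` builds a table once and then indexes it directly. Both function names are hypothetical, and the sketch assumes `keys` is a permutation of `0..n-1`.

```c
#include <stddef.h>

/* Slow pattern: for each of the n outputs, scan the collection until the
 * matching entry is found. Returns the number of comparisons performed.
 * Assumes keys[] is a permutation of 0..n-1. */
static size_t lookups_by_search(const int *keys, size_t n)
{
  size_t comparisons = 0;
  size_t i, j;
  for (i = 0; i < n; i++)
  {
    for (j = 0; j < n; j++)
    {
      comparisons++;
      if (keys[j] == (int)i)
      {
        break;
      }
    }
  }
  return comparisons;
}

/* Fast pattern: precompute a position table once, then answer each lookup
 * by direct indexing. Returns the number of steps (n to build, n to use). */
static size_t lookups_by_index(const int *keys, size_t n, size_t *pos)
{
  size_t steps = 0;
  size_t i;
  for (i = 0; i < n; i++)
  {
    pos[keys[i]] = i; /* one-time precomputation */
    steps++;
  }
  for (i = 0; i < n; i++)
  {
    if (keys[pos[i]] == (int)i) /* direct lookup, no scan */
    {
      steps++;
    }
  }
  return steps;
}
```

For 256 entries the search variant performs on the order of n^2/2 comparisons while the indexed variant performs 2n steps, which is the kind of gap the l4 timings reflect.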

mkannwischer · 2026-04-03T03:20:13Z

dev/aarch64_clean/src/arith_native_aarch64.h

+  /* check-magic: off */
+  requires(array_abs_bound(a, 0, 4 * MLDSA_N, 8380417))
+  requires(array_abs_bound(b, 0, 4 * MLDSA_N, 75423753))
+  /* check-magic: on */
+  assigns(memory_slice(r, sizeof(int32_t) * MLDSA_N))
+  /* check-magic: off */
+  ensures(array_abs_bound(r, 0, MLDSA_N, 8380417))
+  /* check-magic: on */


nit: turning off check-magic once instead of twice would be less distracting.

This is also present in #1006. Maybe you can still change that in both before we merge.

Thank you, fixed in both this and #1006

mkannwischer

Thanks @jakemas - this looks great to me. Thank you for the speed improvements for the L4 proof - that is greatly appreciated.
I think this is ready to be merged.

@hanno-becker, could you please double check the HOL-Light specs?

Port the ML-DSA pointwise polynomial multiplication (Montgomery form) and its HOL Light proofs of correctness from s2n-bignum to mldsa-native, for both AArch64 (NEON) and x86_64 (AVX2). The proofs verify the assembly implementations at the object-code level, showing that output coefficients are congruent to the pointwise product of the inputs modulo Q=8380417, with bounded output coefficients (|output| <= Q-1). For AArch64, a constant-time and memory safety proof is also included. Ported from s2n-bignum commit ca6ec31a225a. Signed-off-by: Jake Massimo <jakemas@amazon.com>
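
As a reference point for what the proofs establish, here is a minimal C sketch of pointwise multiplication in Montgomery form, following the reduction used in the public ML-DSA (Dilithium) reference code. It is not the verified assembly. The constant `MLDSA_QINV` (Q^{-1} mod 2^32) and the function names are assumptions of this sketch.

```c
#include <stdint.h>

#define MLDSA_N 256
#define MLDSA_Q 8380417
#define MLDSA_QINV 58728449 /* Q^{-1} mod 2^32 (assumed constant) */

/* Returns r with r * 2^32 == a (mod Q) and |r| < Q,
 * for inputs with |a| <= 2^31 * Q. */
static int32_t montgomery_reduce(int64_t a)
{
  /* t = a * Q^{-1} mod 2^32, reinterpreted as a signed 32-bit value
   * (the narrowing cast relies on two's-complement behaviour). */
  int32_t t = (int32_t)((uint32_t)a * (uint32_t)MLDSA_QINV);
  /* a - t*Q is an exact multiple of 2^32, so the division is exact. */
  return (int32_t)((a - (int64_t)t * MLDSA_Q) / ((int64_t)1 << 32));
}

/* r[i] = a[i] * b[i] * 2^{-32} mod Q for each of the 256 coefficients. */
static void poly_pointwise_montgomery(int32_t r[MLDSA_N],
                                      const int32_t a[MLDSA_N],
                                      const int32_t b[MLDSA_N])
{
  int i;
  for (i = 0; i < MLDSA_N; i++)
  {
    r[i] = montgomery_reduce((int64_t)a[i] * b[i]);
  }
}
```

The two properties named in the commit message map onto this sketch directly: each output is congruent to the product of the inputs (up to the Montgomery factor) modulo Q, and each output has absolute value at most Q-1.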

Port the ML-DSA pointwise multiplication-accumulation (l=4,5,7) and their HOL Light proofs of correctness from s2n-bignum to mldsa-native, for both AArch64 (NEON) and x86_64 (AVX2). Includes constant-time and memory safety proofs for both architectures. Ported from s2n-bignum PR #373. Signed-off-by: Jake Massimo <jakemas@amazon.com>
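
The multiply-accumulate variant can be sketched in the same style. This is an illustrative C model, not the verified assembly: it assumes the products are accumulated in 64 bits and reduced once at the end, and that the two inputs are stored as `l` polynomials back to back (matching the `4 * MLDSA_N` range in the contracts quoted in this thread). The function names and `MLDSA_QINV` are assumptions of the sketch.

```c
#include <stdint.h>

#define MLDSA_N 256
#define MLDSA_Q 8380417
#define MLDSA_QINV 58728449 /* Q^{-1} mod 2^32 (assumed constant) */

/* Montgomery reduction: r * 2^32 == a (mod Q), for |a| <= 2^31 * Q. */
static int32_t montgomery_reduce(int64_t a)
{
  int32_t t = (int32_t)((uint32_t)a * (uint32_t)MLDSA_QINV);
  return (int32_t)((a - (int64_t)t * MLDSA_Q) / ((int64_t)1 << 32));
}

/* r[i] = (sum over j < l of a[j][i] * b[j][i]) * 2^{-32} mod Q.
 * a and b each hold l * MLDSA_N coefficients; l is 4, 5 or 7. */
static void pointwise_acc_montgomery(int32_t r[MLDSA_N], const int32_t *a,
                                     const int32_t *b, unsigned l)
{
  unsigned i, j;
  for (i = 0; i < MLDSA_N; i++)
  {
    int64_t acc = 0;
    for (j = 0; j < l; j++)
    {
      acc += (int64_t)a[j * MLDSA_N + i] * b[j * MLDSA_N + i];
    }
    r[i] = montgomery_reduce(acc); /* single reduction per coefficient */
  }
}

/* Largest accumulator magnitude when |a| <= Q-1 and |b| <= 9Q-1. */
static int64_t worst_case_acc(unsigned l)
{
  return (int64_t)l * (MLDSA_Q - 1) * (9 * (int64_t)MLDSA_Q - 1);
}
```

With the bounds from the review (|a| <= Q-1, |b| <= 9Q-1), even l=7 keeps the accumulator below 2^31 * Q, which is why a single reduction at the end suffices in this model.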

hanno-becker

Thanks a lot @jakemas, this is great work 🎉

Unless I'm missing something, we still need CBMC proofs for the native versions of pointwise_acc_montgomery?

Hanno is right: The CBMC proofs still need to be added.

jakemas requested a review from a team as a code owner April 1, 2026 21:23

jakemas marked this pull request as draft April 1, 2026 21:26

jakemas force-pushed the jakemas/hol-light-pointwise-acc branch 5 times, most recently from c96a621 to b9b5908 Compare April 1, 2026 22:05

jakemas marked this pull request as ready for review April 2, 2026 01:08

mkannwischer self-assigned this Apr 2, 2026

mkannwischer requested changes Apr 2, 2026

jakemas force-pushed the jakemas/hol-light-pointwise-acc branch 2 times, most recently from ba6edd2 to 87d0267 Compare April 2, 2026 19:56

mkannwischer reviewed Apr 3, 2026

mkannwischer previously approved these changes Apr 3, 2026

mkannwischer assigned hanno-becker and unassigned mkannwischer Apr 3, 2026

jakemas force-pushed the jakemas/hol-light-pointwise-acc branch from 87d0267 to 21c0cbc Compare April 3, 2026 17:29

jakemas added 2 commits April 5, 2026 06:10

hanno-becker force-pushed the jakemas/hol-light-pointwise-acc branch from 21c0cbc to 554a957 Compare April 5, 2026 05:10

hanno-becker requested changes Apr 5, 2026
