
Add HOL Light pointwise multiplication proofs for AArch64 and x86_64 #1006

Open
jakemas wants to merge 1 commit into main from jakemas/hol-light-pointwise

Contributor

@jakemas jakemas commented Mar 31, 2026

Summary

Resolves:

Port the ML-DSA pointwise polynomial multiplication (Montgomery form) and its HOL Light proofs from s2n-bignum to mldsa-native for both AArch64 (NEON) and x86_64 (AVX2).

  • AArch64: Correctness proof (MLDSA_POINTWISE_CORRECT), subroutine form (MLDSA_POINTWISE_SUBROUTINE_CORRECT), and constant-time/memory safety (MLDSA_POINTWISE_SUBROUTINE_SAFE)
  • x86_64: Correctness proof (MLDSA_POINTWISE_CORRECT), subroutine form with and without IBT (MLDSA_POINTWISE_SUBROUTINE_CORRECT, MLDSA_POINTWISE_NOIBT_SUBROUTINE_CORRECT), and constant-time/memory safety (MLDSA_POINTWISE_SUBROUTINE_SAFE)
  • Shared specifications added to common/mldsa_specs.ml: mldsa_pointwise, mldsa_pointwise_montred, arm_mldsa_pointwise_montred, ARM_MLDSA_MONTRED_EQ, CONGBOUND_MLDSA_POINTWISE_MONTRED

Includes CBMC proofs.

The proofs verify the assembly at the object-code level, showing output coefficients are congruent to the pointwise product of the inputs modulo Q=8380417 with bounded outputs (|output| ≤ Q−1).
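
As a rough illustration of the property stated above (not the ported assembly and not its HOL Light specification), the following Python sketch models signed Montgomery reduction with R = 2^32 and applies it coefficient-wise. The helper names `to_signed32`, `montgomery_reduce`, and `pointwise_montgomery` are hypothetical, and the sketch assumes inputs small enough that each product stays within 2^31·Q in absolute value.

```python
Q = 8380417           # ML-DSA modulus
R = 1 << 32           # Montgomery radix 2^32
QINV = pow(Q, -1, R)  # Q^-1 mod 2^32 (requires Python 3.8+)


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement int32."""
    x &= 0xFFFFFFFF
    return x - R if x >= (1 << 31) else x


def montgomery_reduce(a):
    """Return r with r ≡ a * 2^-32 (mod Q); assumes |a| <= 2^31 * Q."""
    t = to_signed32(a * QINV)  # t ≡ a * Q^-1 (mod 2^32)
    return (a - t * Q) >> 32   # a - t*Q is divisible by 2^32, so the shift is exact


def pointwise_montgomery(a, b):
    """Coefficient-wise product of two polynomials in Montgomery form."""
    return [montgomery_reduce(x * y) for x, y in zip(a, b)]
```

For inputs with |a[i]|, |b[i]| ≤ Q−1, each result is congruent to a[i]·b[i]·2^−32 modulo Q and satisfies |r[i]| ≤ Q−1, which matches the shape of the postcondition described in the summary.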

@jakemas jakemas requested a review from a team as a code owner March 31, 2026 18:48
@jakemas jakemas force-pushed the jakemas/hol-light-pointwise branch from ae648e4 to 4d3fefe Compare March 31, 2026 18:52
@jakemas jakemas marked this pull request as draft March 31, 2026 19:07
@jakemas jakemas force-pushed the jakemas/hol-light-pointwise branch from 4d3fefe to 86eb0c4 Compare March 31, 2026 19:35
Contributor

oqs-bot commented Mar 31, 2026

CBMC Results (ML-DSA-44)

Full Results (181 proofs)
Proof Status Current Previous Change
**TOTAL** 1955s 2072s -5.6%
mld_attempt_signature_generation 239s 253s -6%
polyvecl_pointwise_acc_montgomery_c 185s 216s -14%
sign_verify_internal 173s 184s -6%
poly_pointwise_montgomery_c 137s 155s -12%
rej_uniform_native 134s 145s -8%
mld_invntt_layer 83s 88s -6%
mld_ct_memcmp 70s 78s -10%
mld_ntt_layer 50s 56s -11%
sign_signature_internal 21s 18s +17%
rej_uniform 20s 21s -5%
fqmul 19s 21s -10%
poly_chknorm_c 19s 20s -5%
polyvec_matrix_expand 17s 19s -11%
poly_uniform_eta_4x 16s 18s -11%
polyeta_unpack 16s 16s +0%
keccakf1600x4_permute_native 15s 14s +7%
poly_uniform_4x 15s 17s -12%
polymat_permute_bitrev_to_custom 15s 17s -12%
polyvec_matrix_expand_serial 15s 16s -6%
mld_compute_t0_t1_tr_from_sk_components 14s 15s -7%
rej_uniform_c 14s 16s -12%
mld_ntt_butterfly_block 13s 12s +8%
polyt0_unpack 13s 14s -7%
polyveck_power2round 13s 13s +0%
polyvec_matrix_pointwise_montgomery 12s 12s +0%
poly_add 11s 12s -8%
keccak_absorb_once_x4 10s 12s -17%
keccakf1600_permute_native 10s 8s +25%
mld_polyvecl_permute_bitrev_to_custom_native 10s 10s +0%
polyz_unpack_c 10s 12s -17%
keccakf1600_permute 9s 8s +12%
mld_check_pct 9s 6s +50%
sign 9s 11s -18%
keccak_absorb 8s 7s +14%
polyveck_add 8s 9s -11%
keccak_squeezeblocks_x4 7s 7s +0%
poly_use_hint_c 7s 3s +133%
polyveck_use_hint 7s 7s +0%
polyvecl_ntt 7s 5s +40%
sign_verify_pre_hash_shake256 7s 5s +40%
unpack_sk 7s 8s -12%
mld_sample_s1_s2 6s 7s -14%
pointwise_native_aarch64 6s - new
poly_invntt_tomont_c 6s 6s +0%
poly_uniform_eta 6s 4s +50%
polyveck_caddq 6s 6s +0%
polyveck_pointwise_poly_montgomery_t0 6s 4s +50%
polyveck_shiftl 6s 6s +0%
polyvecl_unpack_eta 6s 2s +200%
sign_signature_pre_hash_internal 6s 7s -14%
unpack_hints 6s 6s +0%
caddq 5s 5s +0%
mld_compute_pack_z 5s 4s +25%
pointwise_native_x86_64 5s - new
poly_caddq_c 5s 4s +25%
poly_challenge 5s 5s +0%
poly_power2round 5s 4s +25%
poly_uniform 5s 4s +25%
poly_use_hint_native 5s 2s +150%
polyveck_decompose 5s 4s +25%
polyveck_invntt_tomont 5s 4s +25%
polyveck_ntt 5s 5s +0%
polyveck_pointwise_poly_montgomery_s2 5s 4s +25%
polyveck_sub 5s 6s -17%
polyvecl_pointwise_acc_montgomery_native 5s 4s +25%
shake128_release 5s 4s +25%
sign_keypair_internal 5s 6s -17%
sign_pk_from_sk 5s 5s +0%
sign_verify 5s 6s -17%
fqscale 4s 2s +100%
keccakf1600_xor_bytes 4s 1s +300%
keccakf1600x4_extract_bytes 4s 2s +100%
mld_ct_get_optblocker_i64 4s 5s -20%
mld_h 4s 2s +100%
mld_prepare_domain_separation_prefix 4s 2s +100%
montgomery_reduce 4s 3s +33%
ntt_native_aarch64 4s 4s +0%
pack_pk 4s 3s +33%
poly_caddq 4s 5s -20%
poly_caddq_native_aarch64 4s 3s +33%
poly_chknorm_native 4s 5s -20%
poly_chknorm_native_aarch64 4s 4s +0%
poly_invntt_tomont_native 4s 4s +0%
poly_ntt 4s 5s -20%
poly_pointwise_montgomery_native 4s 2s +100%
poly_sub 4s 3s +33%
poly_uniform_gamma1 4s 2s +100%
poly_uniform_gamma1_4x 4s 5s -20%
polyt0_pack 4s 5s -20%
polyveck_chknorm 4s 4s +0%
polyveck_pack_t0 4s 4s +0%
polyveck_pack_w1 4s 3s +33%
polyveck_reduce 4s 5s -20%
polyveck_unpack_eta 4s 6s -33%
polyvecl_unpack_z 4s 3s +33%
polyz_unpack_native 4s 3s +33%
power2round 4s 3s +33%
rej_eta 4s 3s +33%
rej_eta_native 4s 5s -20%
sign_verify_extmu 4s 6s -33%
sign_verify_pre_hash_internal 4s 5s -20%
sys_check_capability 4s 3s +33%
unpack_pk 4s 3s +33%
use_hint 4s 4s +0%
decompose 3s 2s +50%
keccak_finalize 3s 2s +50%
keccakf1600_extract_bytes (big endian) 3s 3s +0%
keccakf1600x4_permute 3s 3s +0%
make_hint 3s 2s +50%
mld_ct_abs_i32 3s 2s +50%
mld_ct_cmask_nonzero_u8 3s 3s +0%
mld_ct_get_optblocker_u32 3s 2s +50%
mld_keccakf1600_extract_bytes 3s 2s +50%
mld_sample_s1_s2_serial 3s 5s -40%
mld_value_barrier_u32 3s 3s +0%
mld_value_barrier_u8 3s 3s +0%
ntt_native_x86_64 3s 3s +0%
pack_sig_c_h 3s 5s -40%
pack_sig_z 3s 2s +50%
pack_sk 3s 4s -25%
poly_caddq_native 3s 6s -50%
poly_chknorm 3s 3s +0%
poly_decompose_native 3s 4s -25%
poly_ntt_c 3s 4s -25%
poly_ntt_native 3s 4s -25%
poly_pointwise_montgomery 3s 5s -40%
poly_reduce 3s 3s +0%
poly_shiftl 3s 3s +0%
polyt1_pack 3s 3s +0%
polyt1_unpack 3s 3s +0%
polyveck_make_hint 3s 3s +0%
polyveck_pack_eta 3s 5s -40%
polyveck_pointwise_poly_montgomery 3s 5s -40%
polyvecl_chknorm 3s 3s +0%
polyvecl_pack_eta 3s 4s -25%
polyvecl_permute_bitrev_to_custom 3s 4s -25%
polyvecl_uniform_gamma1 3s 2s +50%
polyvecl_uniform_gamma1_serial 3s 3s +0%
polyw1_pack 3s 4s -25%
rej_eta_c 3s 3s +0%
shake128_finalize 3s 3s +0%
shake128_init 3s 2s +50%
shake128_squeeze 3s 2s +50%
shake128x4_absorb_once 3s 2s +50%
shake256 3s 2s +50%
sign_signature 3s 5s -40%
sign_signature_pre_hash_shake256 3s 7s -57%
unpack_sig 3s 2s +50%
intt_native_x86_64 2s 3s -33%
keccak_init 2s 1s +100%
keccakf1600_xor_bytes (big endian) 2s 3s -33%
keccakf1600x4_xor_bytes 2s 2s +0%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 3s -33%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_value_barrier_i64 2s 2s +0%
poly_decompose 2s 5s -60%
poly_decompose_c 2s 2s +0%
poly_make_hint 2s 2s +0%
poly_use_hint 2s 2s +0%
polyveck_unpack_t0 2s 4s -50%
polyvecl_pointwise_acc_montgomery 2s 2s +0%
polyz_pack 2s 4s -50%
polyz_unpack 2s 3s -33%
reduce32 2s 3s -33%
shake128_absorb 2s 1s +100%
shake128x4_squeezeblocks 2s 5s -60%
shake256_absorb 2s 2s +0%
shake256_init 2s 1s +100%
shake256x4_absorb_once 2s 2s +0%
shake256x4_squeezeblocks 2s 4s -50%
sign_keypair 2s 4s -50%
sign_open 2s 4s -50%
sign_signature_extmu 2s 4s -50%
keccak_squeeze 1s 3s -67%
mld_ct_sel_int32 1s 4s -75%
poly_invntt_tomont 1s 2s -50%
polyeta_pack 1s 3s -67%
shake256_finalize 1s 3s -67%
shake256_release 1s 2s -50%
shake256_squeeze 1s 3s -67%

Contributor

oqs-bot commented Mar 31, 2026

CBMC Results (ML-DSA-65)

Full Results (181 proofs)
Proof Status Current Previous Change
**TOTAL** 2403s 2509s -4.2%
sign_verify_internal 322s 346s -7%
mld_attempt_signature_generation 262s 285s -8%
polyvecl_pointwise_acc_montgomery_c 176s 191s -8%
poly_pointwise_montgomery_c 147s 166s -11%
rej_uniform_native 138s 149s -7%
polyvec_matrix_expand 121s 126s -4%
mld_invntt_layer 90s 98s -8%
mld_ct_memcmp 74s 80s -7%
polyvec_matrix_expand_serial 66s 72s -8%
mld_ntt_layer 56s 57s -2%
polymat_permute_bitrev_to_custom 31s 31s +0%
mld_compute_t0_t1_tr_from_sk_components 26s 27s -4%
sign_signature_internal 25s 27s -7%
rej_uniform 21s 20s +5%
poly_chknorm_c 19s 24s -21%
fqmul 18s 23s -22%
poly_uniform_4x 16s 16s +0%
poly_uniform_eta_4x 16s 17s -6%
rej_uniform_c 15s 17s -12%
polyt0_unpack 14s 13s +8%
polyveck_decompose 14s 13s +8%
keccakf1600x4_permute_native 13s 13s +0%
mld_ntt_butterfly_block 13s 12s +8%
polyvecl_chknorm 13s 12s +8%
polyvec_matrix_pointwise_montgomery 12s 10s +20%
poly_add 11s 11s +0%
poly_invntt_tomont_c 11s 7s +57%
keccak_absorb_once_x4 10s 11s -9%
polyveck_add 10s 9s +11%
polyveck_power2round 10s 12s -17%
polyveck_sub 10s 13s -23%
keccakf1600_permute 9s 9s +0%
polyveck_invntt_tomont 9s 8s +12%
keccakf1600_permute_native 8s 8s +0%
mld_check_pct 8s 8s +0%
poly_decompose_c 8s 8s +0%
polyveck_ntt 8s 10s -20%
polyveck_pointwise_poly_montgomery_t0 8s 8s +0%
polyveck_shiftl 8s 6s +33%
polyveck_use_hint 8s 11s -27%
sign_pk_from_sk 8s 8s +0%
sign_signature_extmu 8s 3s +167%
keccak_squeezeblocks_x4 7s 7s +0%
mld_compute_pack_z 7s 8s -12%
mld_polyvecl_permute_bitrev_to_custom_native 7s 8s -12%
polyveck_pointwise_poly_montgomery 7s 8s -12%
polyveck_pointwise_poly_montgomery_s2 7s 7s +0%
polyveck_reduce 7s 11s -36%
polyvecl_ntt 7s 8s -12%
sign 7s 6s +17%
unpack_sk 7s 8s -12%
mld_prepare_domain_separation_prefix 6s 3s +100%
mld_sample_s1_s2_serial 6s 7s -14%
poly_challenge 6s 5s +20%
poly_chknorm_native_aarch64 6s 4s +50%
poly_power2round 6s 6s +0%
poly_uniform 6s 3s +100%
poly_uniform_eta 6s 4s +50%
polyeta_unpack 6s 5s +20%
polyveck_chknorm 6s 3s +100%
polyveck_make_hint 6s 4s +50%
polyveck_unpack_eta 6s 5s +20%
polyvecl_unpack_z 6s 5s +20%
sign_verify 6s 4s +50%
decompose 5s 2s +150%
keccak_absorb 5s 11s -55%
keccakf1600x4_permute 5s 1s +400%
mld_h 5s 5s +0%
montgomery_reduce 5s 2s +150%
ntt_native_x86_64 5s 5s +0%
poly_pointwise_montgomery_native 5s 3s +67%
poly_shiftl 5s 2s +150%
poly_uniform_gamma1 5s 3s +67%
polyveck_caddq 5s 8s -38%
polyveck_unpack_t0 5s 3s +67%
polyvecl_pointwise_acc_montgomery_native 5s 6s -17%
shake128_squeeze 5s 1s +400%
shake256_init 5s 4s +25%
sign_keypair_internal 5s 7s -29%
sign_signature_pre_hash_shake256 5s 3s +67%
unpack_sig 5s 4s +25%
keccak_init 4s 2s +100%
keccakf1600_xor_bytes (big endian) 4s 2s +100%
make_hint 4s 4s +0%
mld_ct_cmask_neg_i32 4s 2s +100%
mld_ct_cmask_nonzero_u32 4s 4s +0%
mld_ct_cmask_nonzero_u8 4s 2s +100%
mld_sample_s1_s2 4s 5s -20%
mld_value_barrier_i64 4s 1s +300%
ntt_native_aarch64 4s 2s +100%
pack_sig_z 4s 2s +100%
pointwise_native_aarch64 4s - new
poly_caddq_c 4s 4s +0%
poly_caddq_native 4s 4s +0%
poly_chknorm_native 4s 3s +33%
poly_decompose_native 4s 5s -20%
poly_ntt 4s 2s +100%
poly_ntt_c 4s 6s -33%
poly_ntt_native 4s 3s +33%
poly_pointwise_montgomery 4s 4s +0%
poly_uniform_gamma1_4x 4s 5s -20%
poly_use_hint_native 4s 5s -20%
polyt0_pack 4s 5s -20%
polyveck_pack_eta 4s 2s +100%
polyvecl_pointwise_acc_montgomery 4s 3s +33%
polyz_unpack 4s 2s +100%
polyz_unpack_native 4s 2s +100%
reduce32 4s 3s +33%
rej_eta_c 4s 4s +0%
rej_eta_native 4s 5s -20%
shake128_release 4s 2s +100%
shake128x4_squeezeblocks 4s 2s +100%
shake256_absorb 4s 3s +33%
shake256_release 4s 1s +300%
sign_signature_pre_hash_internal 4s 3s +33%
sign_verify_extmu 4s 3s +33%
sys_check_capability 4s 3s +33%
unpack_hints 4s 4s +0%
use_hint 4s 4s +0%
intt_native_x86_64 3s 4s -25%
keccakf1600_extract_bytes (big endian) 3s 3s +0%
keccakf1600_xor_bytes 3s 3s +0%
keccakf1600x4_xor_bytes 3s 3s +0%
mld_ct_abs_i32 3s 2s +50%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_ct_sel_int32 3s 4s -25%
mld_keccakf1600_extract_bytes 3s 5s -40%
mld_value_barrier_u32 3s 2s +50%
pack_pk 3s 2s +50%
pack_sig_c_h 3s 4s -25%
pack_sk 3s 4s -25%
pointwise_native_x86_64 3s - new
poly_chknorm 3s 3s +0%
poly_decompose 3s 3s +0%
poly_invntt_tomont 3s 2s +50%
poly_invntt_tomont_native 3s 3s +0%
poly_sub 3s 2s +50%
poly_use_hint 3s 2s +50%
poly_use_hint_c 3s 5s -40%
polyt1_pack 3s 3s +0%
polyveck_pack_t0 3s 4s -25%
polyvecl_pack_eta 3s 2s +50%
polyvecl_permute_bitrev_to_custom 3s 3s +0%
polyvecl_uniform_gamma1 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyz_pack 3s 3s +0%
polyz_unpack_c 3s 3s +0%
rej_eta 3s 6s -50%
shake256_finalize 3s 1s +200%
sign_keypair 3s 7s -57%
sign_open 3s 2s +50%
sign_verify_pre_hash_shake256 3s 5s -40%
unpack_pk 3s 5s -40%
caddq 2s 5s -60%
fqscale 2s 2s +0%
keccak_squeeze 2s 5s -60%
keccakf1600x4_extract_bytes 2s 4s -50%
mld_ct_get_optblocker_i64 2s 4s -50%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_value_barrier_u8 2s 1s +100%
poly_caddq 2s 5s -60%
poly_caddq_native_aarch64 2s 4s -50%
poly_make_hint 2s 1s +100%
poly_reduce 2s 3s -33%
polyeta_pack 2s 2s +0%
polyveck_pack_w1 2s 5s -60%
polyvecl_unpack_eta 2s 3s -33%
polyw1_pack 2s 3s -33%
power2round 2s 3s -33%
shake128_absorb 2s 3s -33%
shake128_finalize 2s 2s +0%
shake128_init 2s 2s +0%
shake128x4_absorb_once 2s 2s +0%
shake256 2s 3s -33%
shake256_squeeze 2s 3s -33%
shake256x4_squeezeblocks 2s 4s -50%
sign_signature 2s 7s -71%
sign_verify_pre_hash_internal 2s 4s -50%
keccak_finalize 1s 2s -50%
polyt1_unpack 1s 3s -67%
shake256x4_absorb_once 1s 2s -50%

Contributor

oqs-bot commented Mar 31, 2026

CBMC Results (ML-DSA-87)

Full Results (181 proofs)
Proof Status Current Previous Change
**TOTAL** 2674s 2668s +0.2%
polyvecl_pointwise_acc_montgomery_c 311s 311s +0%
mld_attempt_signature_generation 233s 239s -3%
sign_verify_internal 222s 216s +3%
polyvec_matrix_expand 189s 195s -3%
poly_pointwise_montgomery_c 175s 176s -1%
rej_uniform_native 154s 158s -3%
mld_invntt_layer 99s 104s -5%
polyvec_matrix_expand_serial 84s 85s -1%
mld_ct_memcmp 82s 83s -1%
polyveck_decompose 61s 60s +2%
mld_ntt_layer 57s 56s +2%
polymat_permute_bitrev_to_custom 46s 47s -2%
sign_signature_internal 40s 42s -5%
mld_compute_t0_t1_tr_from_sk_components 25s 25s +0%
poly_chknorm_c 22s 24s -8%
fqmul 21s 21s +0%
rej_uniform 21s 24s -12%
polyeta_unpack 17s 20s -15%
polyt0_unpack 16s 15s +7%
poly_uniform_4x 15s 16s -6%
poly_uniform_eta_4x 15s 17s -12%
rej_uniform_c 15s 16s -6%
keccakf1600x4_permute_native 14s 15s -7%
polyvec_matrix_pointwise_montgomery 14s 11s +27%
mld_ntt_butterfly_block 13s 13s +0%
poly_add 13s 12s +8%
polyz_unpack_c 13s 8s +62%
mld_polyvecl_permute_bitrev_to_custom_native 12s 13s -8%
polyveck_use_hint 12s 9s +33%
keccakf1600_permute 11s 7s +57%
keccak_absorb 10s 7s +43%
keccak_absorb_once_x4 10s 9s +11%
polyveck_add 10s 9s +11%
polyveck_power2round 10s 8s +25%
polyveck_reduce 10s 10s +0%
polyvecl_ntt 10s 8s +25%
keccakf1600_permute_native 9s 7s +29%
poly_invntt_tomont_c 9s 9s +0%
polyveck_caddq 9s 11s -18%
unpack_hints 9s 6s +50%
unpack_sk 9s 6s +50%
mld_compute_pack_z 8s 6s +33%
mld_sample_s1_s2 8s 5s +60%
poly_decompose_c 8s 8s +0%
polyveck_ntt 8s 11s -27%
polyveck_pointwise_poly_montgomery 8s 5s +60%
polyveck_pointwise_poly_montgomery_t0 8s 9s -11%
polyveck_sub 8s 6s +33%
sign_pk_from_sk 8s 9s -11%
keccak_squeezeblocks_x4 7s 7s +0%
mld_check_pct 7s 9s -22%
polyveck_shiftl 7s 7s +0%
polyveck_unpack_eta 7s 4s +75%
mld_sample_s1_s2_serial 6s 6s +0%
poly_caddq_c 6s 6s +0%
polyveck_chknorm 6s 4s +50%
polyveck_pointwise_poly_montgomery_s2 6s 5s +20%
polyvecl_unpack_z 6s 3s +100%
rej_eta_c 6s 4s +50%
shake256_finalize 6s 2s +200%
sign_verify 6s 5s +20%
sign_verify_pre_hash_internal 6s 2s +200%
sign_verify_pre_hash_shake256 6s 6s +0%
decompose 5s 3s +67%
keccakf1600x4_xor_bytes 5s 3s +67%
mld_ct_cmask_nonzero_u32 5s 3s +67%
mld_h 5s 5s +0%
poly_caddq_native_aarch64 5s 2s +150%
poly_challenge 5s 5s +0%
poly_invntt_tomont_native 5s 2s +150%
poly_ntt 5s 2s +150%
poly_pointwise_montgomery_native 5s 4s +25%
poly_power2round 5s 8s -38%
poly_uniform_gamma1_4x 5s 4s +25%
poly_use_hint_c 5s 2s +150%
polyveck_invntt_tomont 5s 8s -38%
polyveck_make_hint 5s 5s +0%
polyveck_pack_eta 5s 4s +25%
polyvecl_chknorm 5s 4s +25%
polyvecl_permute_bitrev_to_custom 5s 3s +67%
polyvecl_pointwise_acc_montgomery 5s 4s +25%
polyvecl_pointwise_acc_montgomery_native 5s 5s +0%
polyw1_pack 5s 2s +150%
shake128_absorb 5s 3s +67%
shake256_init 5s 1s +400%
sign 5s 7s -29%
sign_keypair_internal 5s 5s +0%
sign_open 5s 2s +150%
sign_signature_extmu 5s 6s -17%
sign_signature_pre_hash_shake256 5s 4s +25%
keccak_finalize 4s 3s +33%
keccak_squeeze 4s 2s +100%
mld_ct_abs_i32 4s 2s +100%
mld_ct_cmask_neg_i32 4s 1s +300%
mld_value_barrier_u8 4s 2s +100%
pack_pk 4s 3s +33%
pack_sk 4s 2s +100%
pointwise_native_x86_64 4s - new
poly_chknorm 4s 2s +100%
poly_sub 4s 3s +33%
polyveck_pack_t0 4s 3s +33%
polyvecl_pack_eta 4s 4s +0%
polyvecl_uniform_gamma1 4s 4s +0%
polyvecl_uniform_gamma1_serial 4s 4s +0%
polyvecl_unpack_eta 4s 6s -33%
rej_eta 4s 4s +0%
rej_eta_native 4s 4s +0%
shake128_release 4s 3s +33%
shake256_release 4s 4s +0%
shake256x4_absorb_once 4s 2s +100%
sign_keypair 4s 7s -43%
sign_signature 4s 2s +100%
sign_signature_pre_hash_internal 4s 4s +0%
sign_verify_extmu 4s 2s +100%
unpack_sig 4s 4s +0%
caddq 3s 4s -25%
intt_native_x86_64 3s 3s +0%
keccak_init 3s 2s +50%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_xor_bytes 3s 4s -25%
mld_ct_get_optblocker_u32 3s 2s +50%
mld_ct_get_optblocker_u8 3s 2s +50%
mld_keccakf1600_extract_bytes 3s 2s +50%
mld_prepare_domain_separation_prefix 3s 4s -25%
mld_value_barrier_i64 3s 2s +50%
ntt_native_aarch64 3s 4s -25%
ntt_native_x86_64 3s 3s +0%
pack_sig_z 3s 4s -25%
poly_chknorm_native_aarch64 3s 6s -50%
poly_decompose 3s 2s +50%
poly_decompose_native 3s 5s -40%
poly_reduce 3s 3s +0%
poly_uniform_eta 3s 4s -25%
polyeta_pack 3s 4s -25%
polyt0_pack 3s 3s +0%
polyt1_pack 3s 4s -25%
polyveck_unpack_t0 3s 7s -57%
polyz_pack 3s 3s +0%
polyz_unpack 3s 2s +50%
power2round 3s 4s -25%
reduce32 3s 4s -25%
shake128x4_absorb_once 3s 3s +0%
shake256_absorb 3s 3s +0%
shake256x4_squeezeblocks 3s 4s -25%
sys_check_capability 3s 3s +0%
unpack_pk 3s 4s -25%
use_hint 3s 2s +50%
fqscale 2s 7s -71%
keccakf1600x4_extract_bytes 2s 5s -60%
keccakf1600x4_permute 2s 4s -50%
mld_ct_cmask_nonzero_u8 2s 2s +0%
mld_ct_get_optblocker_i64 2s 4s -50%
mld_value_barrier_u32 2s 2s +0%
montgomery_reduce 2s 2s +0%
pack_sig_c_h 2s 4s -50%
pointwise_native_aarch64 2s - new
poly_caddq 2s 4s -50%
poly_caddq_native 2s 3s -33%
poly_chknorm_native 2s 3s -33%
poly_invntt_tomont 2s 4s -50%
poly_make_hint 2s 3s -33%
poly_ntt_c 2s 4s -50%
poly_ntt_native 2s 4s -50%
poly_pointwise_montgomery 2s 6s -67%
poly_shiftl 2s 5s -60%
poly_uniform 2s 4s -50%
poly_uniform_gamma1 2s 5s -60%
poly_use_hint 2s 5s -60%
poly_use_hint_native 2s 4s -50%
polyt1_unpack 2s 3s -33%
polyveck_pack_w1 2s 4s -50%
polyz_unpack_native 2s 5s -60%
shake128_finalize 2s 2s +0%
shake128_init 2s 1s +100%
shake128_squeeze 2s 4s -50%
shake128x4_squeezeblocks 2s 2s +0%
shake256_squeeze 2s 2s +0%
keccakf1600_xor_bytes (big endian) 1s 3s -67%
make_hint 1s 4s -75%
mld_ct_sel_int32 1s 4s -75%
shake256 1s 2s -50%

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise branch 2 times, most recently from 1a600f1 to 8d662d1 Compare March 31, 2026 20:57
@jakemas jakemas marked this pull request as ready for review March 31, 2026 21:16
Contributor

@mkannwischer mkannwischer left a comment

Thanks @jakemas. Looks great overall.
I left a couple of small comments.

MLD_NAMESPACE(poly_pointwise_montgomery_asm)
void mld_poly_pointwise_montgomery_asm(int32_t *, const int32_t *,
const int32_t *);
void mld_poly_pointwise_montgomery_asm(int32_t *r, const int32_t *a,
Contributor

Same change should be made to aarch64_clean

Contributor Author

Thanks, added!

@mkannwischer mkannwischer self-assigned this Apr 1, 2026
@jakemas jakemas force-pushed the jakemas/hol-light-pointwise branch 3 times, most recently from f7b2481 to f7dccd0 Compare April 1, 2026 19:18
Contributor

@mkannwischer mkannwischer left a comment

Thanks for the changes @jakemas. LGTM.

@hanno-becker, could you also take a look please?

@jakemas jakemas force-pushed the jakemas/hol-light-pointwise branch 3 times, most recently from 674af16 to 9c9f46c Compare April 3, 2026 17:26
Comment on lines +506 to +516
nonoverlapping (word pc, LENGTH mldsa_pointwise_tmc) (r, 1024) /\
nonoverlapping (word pc, LENGTH mldsa_pointwise_tmc) (a, 1024) /\
nonoverlapping (word pc, LENGTH mldsa_pointwise_tmc) (b, 1024) /\
nonoverlapping (word pc, LENGTH mldsa_pointwise_tmc) (consts, 2496) /\
nonoverlapping (r, 1024) (a, 1024) /\ nonoverlapping (r, 1024) (b, 1024) /\
nonoverlapping (r, 1024) (consts, 2496) /\ nonoverlapping (a, 1024) (b, 1024) /\
nonoverlapping (a, 1024) (consts, 2496) /\ nonoverlapping (b, 1024) (consts, 2496) /\
nonoverlapping (stackpointer, 8) (r, 1024) /\
nonoverlapping (stackpointer, 8) (a, 1024) /\
nonoverlapping (stackpointer, 8) (b, 1024) /\
nonoverlapping (stackpointer, 8) (consts, 2496)
Contributor

This isn't pretty; see #1018. Not a blocker.
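
One way to read the conjunction quoted above is as three groups of constraints: code versus each buffer, the four buffers pairwise, and the stack slot versus each buffer. The Python sketch below only illustrates that structure; it is not the refactoring tracked in #1018. The function names are hypothetical, and the model ignores 64-bit address wrap-around, which a HOL Light `nonoverlapping` statement would have to account for.

```python
from itertools import combinations


def nonoverlapping(x, y):
    """True if byte ranges x=(base, length) and y=(base, length) share no byte.

    Simplifying assumption: addresses do not wrap around.
    """
    (bx, nx), (by, ny) = x, y
    return nx == 0 or ny == 0 or bx + nx <= by or by + ny <= bx


def disjoint_from_all(region, others):
    """True if region overlaps none of the ranges in others."""
    return all(nonoverlapping(region, o) for o in others)


def pairwise_nonoverlapping(regions):
    """True if no two ranges in regions overlap."""
    return all(nonoverlapping(x, y) for x, y in combinations(regions, 2))


def pointwise_layout_ok(code, stack, r, a, b, consts):
    """Mirror the three groups of constraints in the quoted precondition."""
    bufs = [r, a, b, consts]
    return (disjoint_from_all(code, bufs)       # code vs. r, a, b, consts
            and pairwise_nonoverlapping(bufs)   # r, a, b, consts pairwise
            and disjoint_from_all(stack, bufs)) # stack slot vs. buffers
```

With the sizes from the quoted lines (1024 bytes each for `r`, `a`, `b`, i.e. 256 `int32_t` coefficients, 2496 bytes for `consts`, and 8 bytes at the stack pointer), the eleven conjuncts collapse into three calls.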

Port the ML-DSA pointwise polynomial multiplication (Montgomery form)
and its HOL Light proofs of correctness from s2n-bignum to mldsa-native,
for both AArch64 (NEON) and x86_64 (AVX2).

The proofs verify the assembly implementations at the object-code level,
showing that output coefficients are congruent to the pointwise product
of the inputs modulo Q=8380417, with bounded output coefficients
(|output| <= Q-1). For AArch64, a constant-time and memory safety proof
is also included.

Ported from s2n-bignum commit ca6ec31a225a.

Signed-off-by: Jake Massimo <jakemas@amazon.com>
@hanno-becker hanno-becker force-pushed the jakemas/hol-light-pointwise branch from 9c9f46c to d2e2991 Compare April 5, 2026 16:09

let mldsa_pointwise = define
`mldsa_pointwise (f:num->int) (g:num->int) i =
(f i * g i * &(inverse_mod 8380417 4294967296)) rem &8380417`;;
Contributor

Someone who reads this probably understands what is happening here, but a brief mention of Montgomery multiplication and the factor being 2^{-32} mod Q wouldn't hurt.
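
To make the point of this comment concrete: 4294967296 is 2^32, so `inverse_mod 8380417 4294967296` is the inverse of the Montgomery radix modulo Q, and the definition multiplies the two coefficients and strips one factor of 2^32. A minimal Python mirror of the definition, offered as a sketch rather than as part of the proof files (the name `R_INV` is an assumption of this example):

```python
Q = 8380417            # ML-DSA modulus
R = 4294967296         # Montgomery radix, 2^32
R_INV = pow(R, -1, Q)  # 2^-32 mod Q, the factor used in the definition


def mldsa_pointwise(f, g, i):
    """Mirror of the HOL Light definition: (f i * g i * 2^-32) rem Q.

    f and g map a coefficient index to an integer. Python's % with a
    positive modulus returns a value in [0, Q), like HOL Light's
    integer `rem`.
    """
    return (f(i) * g(i) * R_INV) % Q
```

If one operand is in Montgomery form (x·2^32 mod Q), the result is the ordinary product modulo Q; if both are, the result is again in Montgomery form.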

- name: mldsa_intt
needs: ["mldsa_specs.ml", "mldsa_utils.ml", "mldsa_zetas.ml"]
- name: mldsa_pointwise
needs: ["mldsa_specs.ml", "mldsa_utils.ml"]
Contributor

Needs mldsa_zetas.ml as well

4 participants