Goldilocks by TomWambsgans · Pull Request #210 · leanEthereum/leanVM

TomWambsgans · 2026-05-04T14:02:29Z

No description provided.

Co-authored-by: Copilot <copilot@github.com>

Bring main's MTU-XMSS structure (tweak table, public_param, T-Sponge with replacement) into the goldilocks branch with all poseidon-related sizes halved:

field-element widths   main (KoalaBear)   goldilocks
--------------------   ----------------   ----------
TWEAK_LEN              2                  1
XMSS_DIGEST_LEN        4                  2
RANDOMNESS_LEN_FE      6                  3
MESSAGE_LEN_FE         8                  4
PUBLIC_PARAM_LEN_FE    4                  2
POSEIDON1_WIDTH        16                 8
DIGEST_LEN_FE          8                  4

Tweak table slots are 2 FE (1 actual tweak FE + 1 zero pad). The packed tweak fits in a single 64-bit Goldilocks element via `(tweak_type << 42) | (sub_position << 32) | index`.

Port main's poseidon precompile features (`half_output`, `hardcoded_offset_left`) from Poseidon16 to Poseidon8, with new committed columns for the flags and `effective_index_left_first/second`. The half-output trace tail values are filled in a post-pass from `memory_padded` (lookup-only: the AIR doesn't constrain them).

Encoding decomposition uses the goldilocks-proven 21 chunks of W=3 bits per FE with a factored 1-bit canonical check `(diff)·(diff − 2^63) == 0`, applied to the first 2 of 4 output FE for exactly V = 42 chunks (no V_GRINDING).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
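The tweak packing and the 3-bit chunk decomposition described above can be illustrated with a small sketch. It is not the PR's code: the function names are hypothetical, the field widths (32-bit `index`, 10-bit `sub_position`, at most 22-bit `tweak_type`) are inferred from the shift amounts, and reading `diff` as "value minus the recomposed 21 chunks" is only one plausible interpretation of the 1-bit check.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1.
const GOLDILOCKS_P: u64 = 0xFFFF_FFFF_0000_0001;

/// Packs a tweak with the layout quoted in the commit message.
/// Assumed widths (not stated in the PR): index < 2^32, sub_position < 2^10.
fn pack_tweak(tweak_type: u64, sub_position: u64, index: u64) -> u64 {
    assert!(index < (1u64 << 32), "index must fit in 32 bits");
    assert!(sub_position < (1u64 << 10), "sub_position must fit in 10 bits");
    assert!(tweak_type < (1u64 << 22), "tweak_type must fit in 22 bits");
    let packed = (tweak_type << 42) | (sub_position << 32) | index;
    // The packed value must also be a canonical field element.
    assert!(packed < GOLDILOCKS_P, "packed tweak exceeds the field modulus");
    packed
}

/// Inverse of `pack_tweak`: returns (tweak_type, sub_position, index).
fn unpack_tweak(packed: u64) -> (u64, u64, u64) {
    (packed >> 42, (packed >> 32) & 0x3FF, packed & 0xFFFF_FFFF)
}

/// Splits a 64-bit value into 21 chunks of 3 bits (the low 63 bits) and
/// returns them with `diff = value - recomposed`, which is 0 or 2^63.
fn decompose_w3(value: u64) -> ([u8; 21], u64) {
    let mut chunks = [0u8; 21];
    let mut recomposed = 0u64;
    for (i, chunk) in chunks.iter_mut().enumerate() {
        *chunk = ((value >> (3 * i)) & 0b111) as u8;
        recomposed |= u64::from(*chunk) << (3 * i);
    }
    (chunks, value - recomposed)
}

/// Integer form of `diff * (diff - 2^63) == 0`: in a prime field the
/// product vanishes exactly when one of the two factors is zero.
fn one_bit_check(diff: u64) -> bool {
    diff == 0 || diff == 1u64 << 63
}
```

Two field elements decomposed this way give 2 × 21 = 42 chunks, which matches the V = 42 quoted above.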

# Conflicts:
#   crates/lean_compiler/snark_lib.py
#   crates/lean_compiler/tests/test_compiler.rs
#   crates/lean_compiler/tests/test_data/program_166.py
#   crates/lean_compiler/zkDSL.md
#   crates/rec_aggregation/zkdsl_implem/hashing.py
#   crates/rec_aggregation/zkdsl_implem/main.py

# Conflicts:
#   crates/lean_prover/src/lib.rs
#   crates/lean_prover/src/test_zkvm.rs

# Conflicts:
#   crates/backend/fiat-shamir/src/challenger.rs
#   crates/backend/fiat-shamir/tests/grinding.rs
#   crates/backend/sumcheck/src/product_computation.rs
#   crates/lean_prover/src/verify_execution.rs
#   crates/rec_aggregation/src/bytecode_claims.rs
#   crates/rec_aggregation/src/type_2_aggregation.rs
#   crates/rec_aggregation/zkdsl_implem/fiat_shamir.py
#   crates/rec_aggregation/zkdsl_implem/main.py
#   crates/sub_protocols/src/quotient_gkr/mod.rs
#   crates/utils/src/wrappers.rs
#   crates/whir/tests/run_whir.rs

Adapt main's column/flag renames (e.g. POSEIDON_*COL_INDEX_INPUT_LEFT -> POSEIDON_*COL_NU_A, EXT_OP_FLAG_MUL -> EXT_OP_FLAG_DOT_PRODUCT, ExtensionOp::PolyEq -> Eq, COL_COMP -> COL_ACC, etc.) to the goldilocks-specific code that uses Poseidon8 and cubic (DIMENSION=3) extension.

Drop the KoalaBear-targeted python verifier and its check_whir_configs test, which don't apply to the goldilocks branch (folding_pow_bits was removed in goldilocks; WHIR_CONFIGS and Fp primitives are KoalaBear-specific).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adopt main's overwrite (permutation-based) sponge hashing on the Goldilocks branch, keeping Goldilocks field types throughout (WIDTH 8, RATE 4, DIGEST 4, poseidon8, cubic extension).

Key reconciliations:
- utils/symetric/whir/fiat-shamir: overwrite sponge (hash_slice_rtl, precompute_zero_suffix_state), poseidon_hash_slice, two-perm merkle_verify.
- Poseidon table: kept the Goldilocks x^7 permutation AIR (sparse partial rounds) but adopted main's I/O interface, halved: 3-way output gating via flag_out2/flag_out4, added permute_half, unified Davies-Meyer output gates. New precompile set: poseidon8_compress_half/_quarter (+_hardcoded_left), poseidon8_permute/_permute_half (+_hardcoded_left).
- Compiler, instruction encoder/display, prover trace post-pass updated to the new flags and names.
- zkDSL verifier (hashing.py, main.py, xmss_aggregate.py) and XMSS signer (wots.rs) switched to the overwrite sponge; encoding decomposition and gl-specific constants (copy_ef/copy_digest, NUM_ENCODING_FE, strides) kept.
- Dropped python-verifier (removed on goldilocks).

Validated: workspace builds, fmt+clippy clean, poseidon AIR proves/verifies, compiler + lean_prover + xmss + sub_protocols tests pass, aggregation bytecode compiles, and end-to-end recursive XMSS aggregation proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
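For readers unfamiliar with overwrite-mode absorption, the idea with WIDTH 8, RATE 4 and DIGEST 4 can be sketched as below. This is an illustration only. `toy_permutation` is a hypothetical stand-in and is not Poseidon8; the choice of the first four lanes as the rate, the lack of padding, and reading the digest from the first four lanes are all assumptions. None of the names correspond to `hash_slice_rtl` or any other function in the repository.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1.
const P: u64 = 0xFFFF_FFFF_0000_0001;
const WIDTH: usize = 8;
const RATE: usize = 4;
const DIGEST: usize = 4;

fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % (P as u128)) as u64
}

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % (P as u128)) as u64
}

/// Toy stand-in for the real permutation (NOT Poseidon8): an x^7 S-box
/// with made-up round constants, followed by a simple linear mix.
fn toy_permutation(state: &mut [u64; WIDTH]) {
    for round in 0..4u64 {
        for (i, x) in state.iter_mut().enumerate() {
            let x2 = mul(*x, *x);
            let x4 = mul(x2, x2);
            let x7 = mul(mul(x4, x2), *x);
            *x = add(x7, round * 8 + i as u64 + 1);
        }
        // Linear layer: add the sum of all lanes to every lane.
        let sum = state.iter().fold(0u64, |acc, &x| add(acc, x));
        for x in state.iter_mut() {
            *x = add(*x, sum);
        }
    }
}

/// Overwrite-mode sponge: each block replaces the rate lanes instead of
/// being added to them, then the permutation is applied.
/// Assumes canonical inputs whose length is a multiple of RATE.
fn overwrite_sponge_hash(input: &[u64]) -> [u64; DIGEST] {
    assert!(input.len() % RATE == 0, "sketch handles whole blocks only");
    assert!(input.iter().all(|&x| x < P), "inputs must be canonical");
    let mut state = [0u64; WIDTH];
    for block in input.chunks(RATE) {
        state[..RATE].copy_from_slice(block); // overwrite, not add
        toy_permutation(&mut state);
    }
    let mut digest = [0u64; DIGEST];
    digest.copy_from_slice(&state[..DIGEST]);
    digest
}
```

The only difference from an additive sponge is the `copy_from_slice` line, so absorbing a block costs no field additions and the capacity lanes alone carry state across blocks.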

Adopt main's rec_aggregation rename (type 1/2 -> single/multi-message) and the new bytecode-claim Fiat-Shamir handling, keeping Goldilocks field types/sizes.

Reconciliations:
- bytecode_claims.rs: adopt main's direct claim ingestion into Fiat-Shamir (build_bytecode_claims_ingested_by_fiatshamir + observe_scalars, dropping hash_bytecode_claims), but keep Goldilocks poseidon8 (get_poseidon8).
- multi_message_aggregation.rs: adopt build_multi_message_input_data name, keep DIGEST_LEN-generic layout comment.
- zkdsl main.py: adopt single/multi-message naming and main's direct-ingestion reduce_bytecode_claims, but keep Goldilocks DIGEST_LEN-generic copy_digest loops (instead of main's hardcoded copy_8/copy_32) and SINGLE_MESSAGE flag placeholders provided by compilation.rs.
- zkdsl hashing.py: slice_hash_continue uses poseidon8_permute_half (not the KoalaBear poseidon16 variant that auto-merged in).

Validated: workspace builds, fmt+clippy clean, cargo testall passes, and end-to-end recursion (n=2) proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…idon1-8

Implemented and measured the Appendix-B sparse partial-round decomposition (the same one the AIR/trace-gen and the KoalaBear-16 permutation use) in the AVX-512 permutation. It is ~13% slower for Goldilocks: this circulant MDS has tiny entries {1,3,4,7,8,9} that strength-reduce to shifts/adds and batch 8 terms into a single reduce128 per output, while the sparse form needs arbitrary-constant 64x64 multiplies (one reduce128 each → 15 vs 8 reductions per partial round). Reverted the implementation, kept a comment so the dead end isn't re-explored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

cubic_mul_generic is the hottest field op in the prover after the poseidon permutation (~15% of an xmss prove, via sumcheck eq-eval and the poseidon AIR constraint eval). On Goldilocks each multiply carries a 128->64-bit reduction (the dominant cost), so 3-term Karatsuba trading 3 of the 9 multiplies for cheap field adds/subs is a net win across all packed backends.

Measured: xmss --n-signatures 1550 --log-inv-rate 1 goes 392-394 -> 400-402 XMSS/s (~2%). Verified against the schoolbook reference (10k scalar + 2k packed random inputs) and end-to-end recursion still proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
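The multiply count behind the 3-term Karatsuba trade-off can be checked with a small sketch. It multiplies two degree-2 polynomials over Goldilocks with 6 base-field multiplies instead of 9 and compares against a schoolbook product. The function names are hypothetical and this is not the PR's `cubic_mul_generic`. The final reduction modulo the extension's defining polynomial is left out because the PR text does not state that polynomial, and the `u128 %` reduction used here is the simplest correct one, not the optimized reduction the prover would use.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1.
const P: u64 = 0xFFFF_FFFF_0000_0001;

// All helpers assume canonical inputs (values < P).
fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % (P as u128)) as u64
}

fn sub(a: u64, b: u64) -> u64 {
    ((a as u128 + P as u128 - b as u128) % (P as u128)) as u64
}

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % (P as u128)) as u64
}

/// Reference product of a0 + a1*x + a2*x^2 and b0 + b1*x + b2*x^2:
/// 9 multiplies, 5 unreduced output coefficients.
fn schoolbook3(a: [u64; 3], b: [u64; 3]) -> [u64; 5] {
    let mut c = [0u64; 5];
    for i in 0..3 {
        for j in 0..3 {
            c[i + j] = add(c[i + j], mul(a[i], b[j]));
        }
    }
    c
}

/// 3-term Karatsuba: 6 multiplies plus extra additions and subtractions.
fn karatsuba3(a: [u64; 3], b: [u64; 3]) -> [u64; 5] {
    let d0 = mul(a[0], b[0]);
    let d1 = mul(a[1], b[1]);
    let d2 = mul(a[2], b[2]);
    let d01 = mul(add(a[0], a[1]), add(b[0], b[1]));
    let d02 = mul(add(a[0], a[2]), add(b[0], b[2]));
    let d12 = mul(add(a[1], a[2]), add(b[1], b[2]));
    [
        d0,
        sub(sub(d01, d0), d1),          // a0*b1 + a1*b0
        add(sub(sub(d02, d0), d2), d1), // a0*b2 + a1*b1 + a2*b0
        sub(sub(d12, d1), d2),          // a1*b2 + a2*b1
        d2,
    ]
}
```

For example, `karatsuba3([1, 2, 3], [4, 5, 6])` yields `[4, 13, 28, 27, 18]`, the same coefficients as the schoolbook product.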

Bring in main's 13 post-merge-base commits (python-verifier cleanup, zkDSL compiler fixes, doc fixes, generic perf opts) while preserving the goldilocks field migration (Goldilocks + cubic extension, Poseidon8, 128-bit security, folding-pow-grinding removed).

Conflict resolutions:
- backend/air (lib.rs, constraint_folder/packed.rs): keep goldilocks's removal of the degree-split low_degree_block/skip_low machinery; take main's #[inline(always)] on the methods goldilocks keeps.
- backend/sumcheck/product_computation.rs: keep goldilocks's retained compute_product_sumcheck_polynomial_base_ext_packed (main deleted it as dead; goldilocks keeps it for a planned Goldilocks optimization).
- whir/merkle.rs: apply main's "remove Matrix trait" refactor (DenseMatrix -> Matrix struct, no M generic) onto goldilocks's Goldilocks/Poseidon8 field logic.
- lean_vm isa/instruction.rs + tables/poseidon/mod.rs: keep goldilocks's POSEIDON8 naming; adopt main's intent of a generic table name ("poseidon8").
- Adopt main's no-loop-carried-mutables rule: rewrite goldilocks's soundness_4/5 and the ALL_PRECOMPILES_PROGRAM counter loop to buffers.
- rec_aggregation/zkdsl_implem/whir.py: take main's buffer-carry structure with goldilocks's copy_ef/DIM and 5-value get_whir_params (no folding grinding); drop main's sumcheck_verify_with_grinding.
- Drop the KoalaBear-specific python verifier (primitives.py, verifier.py, test_verify.py, check_whir_configs.rs) and its test-vector dumping in test_zkvm.rs; restore goldilocks's coherent Goldilocks test program.

cargo fmt / clippy / testall all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adapts main's 'Custom thread pool (#239)' (rayon -> parallel crate) to the goldilocks field branch: kept goldilocks's Poseidon8/poseidon8_compress and out2/out4 trace semantics while adopting the pool-based parallel helpers; converted the goldilocks-only product-sumcheck rayon path to parallel::map_reduce; dropped main's deleted compute_raw_poly_degree_split per goldilocks; removed the dead rayon dep from mt-goldilocks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

# Conflicts:
#   crates/xmss/src/xmss.rs

…, drop lz4/sha3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Brings main's "Clean recursion (#249)" + system-info runtime-detection changes onto the Goldilocks field migration.

Conflict resolution (rec_aggregation zkDSL + whir/dft.rs):
- Took main's renames (circle_values->stir_points, s->eval_weights, logup_c/alphas->logup_gamma/beta, final_coeffcients->final_coefficients, paded->padded) and dead-code removals (fs_hint, fs_print_state, merkle_* batch helpers, set_buf_prefix_right, PP_IN_LEFT, print_ef/vec).
- Preserved goldilocks's Goldilocks migration: DIM=3/DIGEST_LEN=4, poseidon16->poseidon8, repurposed copy/zero helpers, EFFECTIVE_TWO_ADICITY, folding-grinding removal in whir.py, and the Goldilocks merkle-chunk dispatch cases (2/12/24/32), reconciling the two with main's stir_points parameter naming.

Verified: cargo build, zkDSL compiles (2^19 instructions), cargo fmt, cargo clippy --workspace --tests (0 warnings), cargo test --all --release.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ed, see branch soundcalc-goldilocks) Co-authored-by: AKHIL MANGA <116151859+akhilmanga@users.noreply.github.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

TomWambsgans and others added 30 commits April 15, 2026 18:18

reduce degree AIR poseidon

3714bb6

wip

68e4e4c

wip

b9f7c21

test_plonky3_compatibility

bb7be6f

wip

82c624e

wip

d1c525f

degree 7 air (instead of 3) for poseidon

beaf0d6

w

c7448bc

wip

4b78c6e

wip

ae2401d

w

7771454

w

ff61a47

wip

ec26c52

w

4d91224

w

1daffe2

Merge branch 'main' into goldilocks

abc56ce

Co-authored-by: Copilot <copilot@github.com>

w

a635928

Co-authored-by: Copilot <copilot@github.com>

fix

ec4acbd

Merge remote-tracking branch 'origin/goldilocks' into goldilocks

26e06b9

low level optis

1663a9e

w

84f208b

w

80b3a98

2x faster poseidon

89a2dc5

much faster poseidn on avx512

6efc061

Merge commit 'a6f398eb3841acc74e424b788c0c50fd64df26f5' into goldilocks

7baaf62

w

c308fb6

better encoding

0470d7a

clippy

4c1209a

f

086ab06

Merge remote-tracking branch 'origin/HEAD' into goldilocks

961d2d6

TomWambsgans force-pushed the main branch from eacd019 to 9b2f632 Compare May 25, 2026 00:11

TomWambsgans and others added 5 commits May 26, 2026 01:46

Merge remote-tracking branch 'origin/main' into goldilocks

23537b0

# Conflicts:
#   crates/lean_prover/src/lib.rs
#   crates/lean_prover/src/test_zkvm.rs

clippy

dd0f092

TomWambsgans force-pushed the main branch 2 times, most recently from c5a3050 to 9dc5d68 Compare May 28, 2026 12:02

TomWambsgans and others added 8 commits May 29, 2026 01:06

Merge remote-tracking branch 'origin/main' into goldilocks

0feb0a5

recursion program: improve decompose_and_verify_merkle_query

5cf7027

Merge branch 'main' into goldilocks

85c4791

fix SIMD on permutation

fed15c5

latifkasuli mentioned this pull request Jun 3, 2026

lean_vm: add source-level failure stack traces #246

Open

TomWambsgans and others added 12 commits June 4, 2026 15:57

Merge branch 'main' into goldilocks

63fa99d

fmt

dc7f51d

Merge remote-tracking branch 'origin/main' into goldilocks

112f8f2

# Conflicts:
#   crates/xmss/src/xmss.rs

Merge remote-tracking branch 'origin/main' into goldilocks

cc228c4

Merge remote-tracking branch 'origin/main' into goldilocks

4f193f2

Merge origin/main into goldilocks (adapt Poseidon PRF to 63-bit field…

50309bd

…, drop lz4/sha3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

clippy

cca52a7

whir: document why folding PoW grinding was removed (soundcalc-verifi…

798900b

…ed, see branch soundcalc-goldilocks) Co-authored-by: AKHIL MANGA <116151859+akhilmanga@users.noreply.github.com>

Merge branch 'main' into goldilocks

671e6e7

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>


Goldilocks #210
TomWambsgans wants to merge 79 commits into main from goldilocks
