Goldilocks#210
Open
TomWambsgans wants to merge 79 commits into
Co-authored-by: Copilot <copilot@github.com>
Bring main's MTU-XMSS structure (tweak table, public_param, T-Sponge with
replacement) into the goldilocks branch with all poseidon-related sizes
halved:

  field-element widths   main (KoalaBear)   goldilocks
  --------------------   ----------------   ----------
  TWEAK_LEN              2                  1
  XMSS_DIGEST_LEN        4                  2
  RANDOMNESS_LEN_FE      6                  3
  MESSAGE_LEN_FE         8                  4
  PUBLIC_PARAM_LEN_FE    4                  2
  POSEIDON1_WIDTH        16                 8
  DIGEST_LEN_FE          8                  4

Tweak table slots are 2 FE (1 actual tweak FE + 1 zero pad). The packed
tweak fits in a single 64-bit Goldilocks element via
`(tweak_type << 42) | (sub_position << 32) | index`.

Port main's poseidon precompile features (`half_output`,
`hardcoded_offset_left`) from Poseidon16 to Poseidon8, with new committed
columns for the flags and `effective_index_left_first/second`. The
half-output trace tail values are filled in a post-pass from
`memory_padded` (lookup-only; the AIR doesn't constrain them).

Encoding decomposition uses the goldilocks-proven 21 chunks of W=3 bits per
FE with a factored 1-bit canonical check `(diff)·(diff − 2^63) == 0`,
applied to the first 2 of 4 output FE for exactly V = 42 chunks (no
V_GRINDING).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
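For readers following the packing formula and the chunk count, here is a minimal Python sketch. The field widths (32-bit `index`, 10-bit `sub_position`) are assumptions inferred only from the shift amounts in the formula, the little-endian chunk order is also an assumption, and the helper names are hypothetical. The sketch does not model the 1-bit canonical check; it only covers the low 63 bits of an element.

```python
# Goldilocks prime: p = 2^64 - 2^32 + 1.
P = (1 << 64) - (1 << 32) + 1

# Assumed widths, inferred from the shifts (<< 32 and << 42) in the formula.
INDEX_BITS = 32
SUB_POSITION_BITS = 10

W = 3               # bits per chunk, from the commit message
CHUNKS_PER_FE = 21  # 21 * 3 = 63 bits per field element


def pack_tweak(tweak_type, sub_position, index):
    """Pack the three fields as (tweak_type << 42) | (sub_position << 32) | index."""
    assert 0 <= index < (1 << INDEX_BITS)
    assert 0 <= sub_position < (1 << SUB_POSITION_BITS)
    assert tweak_type >= 0
    packed = (tweak_type << 42) | (sub_position << 32) | index
    # The result must be a canonical field element to fit in one FE.
    assert packed < P
    return packed


def unpack_tweak(packed):
    """Inverse of pack_tweak under the same assumed widths."""
    index = packed & ((1 << INDEX_BITS) - 1)
    sub_position = (packed >> INDEX_BITS) & ((1 << SUB_POSITION_BITS) - 1)
    tweak_type = packed >> (INDEX_BITS + SUB_POSITION_BITS)
    return tweak_type, sub_position, index


def decompose_low_bits(value):
    """Split the low 63 bits of value into 21 chunks of 3 bits (little-endian)."""
    mask = (1 << W) - 1
    return [(value >> (W * i)) & mask for i in range(CHUNKS_PER_FE)]


def recompose(chunks):
    """Rebuild the 63-bit value from its little-endian 3-bit chunks."""
    return sum(c << (W * i) for i, c in enumerate(chunks))
```

Two field elements at 21 chunks each give the V = 42 chunks quoted above.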
# Conflicts:
#	crates/lean_compiler/snark_lib.py
#	crates/lean_compiler/tests/test_compiler.rs
#	crates/lean_compiler/tests/test_data/program_166.py
#	crates/lean_compiler/zkDSL.md
#	crates/rec_aggregation/zkdsl_implem/hashing.py
#	crates/rec_aggregation/zkdsl_implem/main.py
# Conflicts:
#	crates/lean_prover/src/lib.rs
#	crates/lean_prover/src/test_zkvm.rs
# Conflicts:
#	crates/backend/fiat-shamir/src/challenger.rs
#	crates/backend/fiat-shamir/tests/grinding.rs
#	crates/backend/sumcheck/src/product_computation.rs
#	crates/lean_prover/src/verify_execution.rs
#	crates/rec_aggregation/src/bytecode_claims.rs
#	crates/rec_aggregation/src/type_2_aggregation.rs
#	crates/rec_aggregation/zkdsl_implem/fiat_shamir.py
#	crates/rec_aggregation/zkdsl_implem/main.py
#	crates/sub_protocols/src/quotient_gkr/mod.rs
#	crates/utils/src/wrappers.rs
#	crates/whir/tests/run_whir.rs
Adapt main's column/flag renames (e.g. POSEIDON_*COL_INDEX_INPUT_LEFT ->
POSEIDON_*COL_NU_A, EXT_OP_FLAG_MUL -> EXT_OP_FLAG_DOT_PRODUCT,
ExtensionOp::PolyEq -> Eq, COL_COMP -> COL_ACC, etc.) to the
goldilocks-specific code that uses Poseidon8 and the cubic (DIMENSION=3)
extension.

Drop the KoalaBear-targeted python verifier and its check_whir_configs
test, which don't apply to the goldilocks branch (folding_pow_bits was
removed in goldilocks; WHIR_CONFIGS and Fp primitives are
KoalaBear-specific).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c5a3050 to
9dc5d68
Adopt main's overwrite (permutation-based) sponge hashing on the Goldilocks
branch, keeping Goldilocks field types throughout (WIDTH 8, RATE 4,
DIGEST 4, poseidon8, cubic extension).

Key reconciliations:
- utils/symetric/whir/fiat-shamir: overwrite sponge (hash_slice_rtl,
  precompute_zero_suffix_state), poseidon_hash_slice, two-perm
  merkle_verify.
- Poseidon table: kept the Goldilocks x^7 permutation AIR (sparse partial
  rounds) but adopted main's I/O interface, halved: 3-way output gating via
  flag_out2/flag_out4, added permute_half, unified Davies-Meyer output
  gates. New precompile set: poseidon8_compress_half/_quarter
  (+_hardcoded_left), poseidon8_permute/_permute_half (+_hardcoded_left).
- Compiler, instruction encoder/display, prover trace post-pass updated to
  the new flags and names.
- zkDSL verifier (hashing.py, main.py, xmss_aggregate.py) and XMSS signer
  (wots.rs) switched to the overwrite sponge; encoding decomposition and
  gl-specific constants (copy_ef/copy_digest, NUM_ENCODING_FE, strides)
  kept.
- Dropped python-verifier (removed on goldilocks).

Validated: workspace builds, fmt+clippy clean, poseidon AIR
proves/verifies, compiler + lean_prover + xmss + sub_protocols tests pass,
aggregation bytecode compiles, and end-to-end recursive XMSS aggregation
proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
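As a rough illustration of what "overwrite" means for a sponge with WIDTH 8, RATE 4 and DIGEST 4, the Python sketch below replaces the rate lanes with each input block instead of adding the block into them. It is not the repository's code: `toy_permutation` is a hypothetical stand-in for poseidon8 with no cryptographic value, padding is not modelled (inputs must be a multiple of RATE), and the right-to-left order suggested by `hash_slice_rtl` and the zero-suffix precomputation are not reproduced.

```python
P = (1 << 64) - (1 << 32) + 1  # Goldilocks prime
WIDTH, RATE, DIGEST = 8, 4, 4  # sizes quoted in the commit message


def toy_permutation(state):
    """Hypothetical stand-in for poseidon8; NOT cryptographically meaningful."""
    for r in range(4):
        # x^7 S-box on every lane, with a small per-lane round constant.
        state = [pow((x + r + i + 1) % P, 7, P) for i, x in enumerate(state)]
        # Simple linear mixing: add the lane sum to every lane.
        total = sum(state) % P
        state = [(x + total) % P for x in state]
    return state


def absorb_block(state, block):
    """Overwrite the RATE lanes with the block, keep the capacity, permute."""
    assert len(state) == WIDTH and len(block) == RATE
    new_state = [x % P for x in block] + list(state[RATE:])
    return toy_permutation(new_state)


def overwrite_sponge_hash(inputs):
    """Hash a list of field elements whose length is a multiple of RATE."""
    assert len(inputs) % RATE == 0  # assumption: no padding rule modelled
    state = [0] * WIDTH
    for start in range(0, len(inputs), RATE):
        state = absorb_block(state, inputs[start:start + RATE])
    return state[:DIGEST]
```

Because the rate lanes are replaced rather than combined, the state after absorbing a block depends only on the previous capacity lanes and the block itself.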
Adopt main's rec_aggregation rename (type 1/2 -> single/multi-message) and
the new bytecode-claim Fiat-Shamir handling, keeping Goldilocks field
types/sizes.

Reconciliations:
- bytecode_claims.rs: adopt main's direct claim ingestion into Fiat-Shamir
  (build_bytecode_claims_ingested_by_fiatshamir + observe_scalars, dropping
  hash_bytecode_claims), but keep Goldilocks poseidon8 (get_poseidon8).
- multi_message_aggregation.rs: adopt build_multi_message_input_data name,
  keep DIGEST_LEN-generic layout comment.
- zkdsl main.py: adopt single/multi-message naming and main's
  direct-ingestion reduce_bytecode_claims, but keep Goldilocks
  DIGEST_LEN-generic copy_digest loops (instead of main's hardcoded
  copy_8/copy_32) and SINGLE_MESSAGE flag placeholders provided by
  compilation.rs.
- zkdsl hashing.py: slice_hash_continue uses poseidon8_permute_half (not
  the KoalaBear poseidon16 variant that auto-merged in).

Validated: workspace builds, fmt+clippy clean, cargo testall passes, and
end-to-end recursion (n=2) proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…idon1-8
Implemented and measured the Appendix-B sparse partial-round decomposition
(the same one the AIR/trace-gen and the KoalaBear-16 permutation use) in the
AVX-512 permutation. It is ~13% slower for Goldilocks: this circulant MDS has
tiny entries {1,3,4,7,8,9} that strength-reduce to shifts/adds and batch 8
terms into a single reduce128 per output, while the sparse form needs
arbitrary-constant 64x64 multiplies (one reduce128 each → 15 vs 8 reductions
per partial round). Reverted the implementation, kept a comment so the dead
end isn't re-explored.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
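The "batch 8 terms into a single reduce128 per output" point can be sketched in Python as follows. This is an illustration under stated assumptions, not the AVX-512 code: `FIRST_ROW` is a hypothetical circulant row that merely uses the entry set {1,3,4,7,8,9} named above (the real matrix is not given here), and Python integers stand in for the 128-bit accumulator.

```python
P = (1 << 64) - (1 << 32) + 1  # Goldilocks prime

# Hypothetical first row of a circulant 8x8 matrix; only the entry set
# {1, 3, 4, 7, 8, 9} comes from the commit message.
FIRST_ROW = [7, 1, 3, 8, 8, 3, 4, 9]


def mul_small(x, c):
    """Multiply by a small constant using only shifts and adds/subs."""
    if c == 1:
        return x
    if c == 3:
        return (x << 1) + x
    if c == 4:
        return x << 2
    if c == 7:
        return (x << 3) - x
    if c == 8:
        return x << 3
    if c == 9:
        return (x << 3) + x
    raise ValueError("constant outside the strength-reduced set")


def circulant_matvec_lazy(state):
    """One modular reduction per output lane (8 reductions in total)."""
    n = len(FIRST_ROW)
    out = []
    for i in range(n):
        acc = 0
        for j in range(n):
            acc += mul_small(state[j], FIRST_ROW[(j - i) % n])
        # Eight terms, each below 9 * 2^64, stay far below 2^128.
        assert acc < (1 << 128)
        out.append(acc % P)  # the single reduction for this output
    return out


def circulant_matvec_reference(state):
    """Same product, reducing after every multiplication."""
    n = len(FIRST_ROW)
    return [
        sum(FIRST_ROW[(j - i) % n] * state[j] % P for j in range(n)) % P
        for i in range(n)
    ]
```

With arbitrary 64-bit constants, each product already needs the full 128-bit width, which is consistent with the commit's note that the sparse form pays one reduction per multiply.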
cubic_mul_generic is the hottest field op in the prover after the poseidon
permutation (~15% of an xmss prove, via sumcheck eq-eval and the poseidon
AIR constraint eval). On Goldilocks each multiply carries a 128->64-bit
reduction (the dominant cost), so 3-term Karatsuba trading 3 of the 9
multiplies for cheap field adds/subs is a net win across all packed
backends.

Measured: xmss --n-signatures 1550 --log-inv-rate 1 goes 392-394 -> 400-402
XMSS/s (~2%). Verified against the schoolbook reference (10k scalar + 2k
packed random inputs) and end-to-end recursion still proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
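The 9-to-6 multiply trade can be checked with a small Python sketch of 3-term Karatsuba on degree-2 polynomials over Goldilocks. It computes only the unreduced product coefficients c0..c4; the final reduction modulo the extension's irreducible cubic is left out because that polynomial is not stated here. The function names are hypothetical and unrelated to `cubic_mul_generic`'s actual signature.

```python
P = (1 << 64) - (1 << 32) + 1  # Goldilocks prime


def schoolbook3(a, b):
    """Reference product of two degree-2 polynomials: 9 base multiplies."""
    c = [0] * 5
    for i in range(3):
        for j in range(3):
            c[i + j] = (c[i + j] + a[i] * b[j]) % P
    return c


def karatsuba3(a, b):
    """Same product with 6 base multiplies plus extra adds/subs."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    # Three "diagonal" products.
    d0 = a0 * b0 % P
    d1 = a1 * b1 % P
    d2 = a2 * b2 % P
    # Three products of pairwise sums.
    d01 = (a0 + a1) * (b0 + b1) % P
    d02 = (a0 + a2) * (b0 + b2) % P
    d12 = (a1 + a2) * (b1 + b2) % P
    # Cross terms recovered by subtraction.
    return [
        d0,
        (d01 - d0 - d1) % P,       # a0*b1 + a1*b0
        (d02 - d0 - d2 + d1) % P,  # a0*b2 + a1*b1 + a2*b0
        (d12 - d1 - d2) % P,       # a1*b2 + a2*b1
        d2,
    ]
```

Each of the six products still carries one reduction, so the saving is three reductions per extension multiply at the cost of a few field additions and subtractions.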
Bring in main's 13 post-merge-base commits (python-verifier cleanup,
zkDSL compiler fixes, doc fixes, generic perf opts) while preserving the
goldilocks field migration (Goldilocks + cubic extension, Poseidon8,
128-bit security, folding-pow-grinding removed).
Conflict resolutions:
- backend/air (lib.rs, constraint_folder/packed.rs): keep goldilocks's
removal of the degree-split low_degree_block/skip_low machinery; take
main's #[inline(always)] on the methods goldilocks keeps.
- backend/sumcheck/product_computation.rs: keep goldilocks's retained
compute_product_sumcheck_polynomial_base_ext_packed (main deleted it as
dead; goldilocks keeps it for a planned Goldilocks optimization).
- whir/merkle.rs: apply main's "remove Matrix trait" refactor (DenseMatrix
-> Matrix struct, no M generic) onto goldilocks's Goldilocks/Poseidon8
field logic.
- lean_vm isa/instruction.rs + tables/poseidon/mod.rs: keep goldilocks's
POSEIDON8 naming; adopt main's intent of a generic table name ("poseidon8").
- Adopt main's no-loop-carried-mutables rule: rewrite goldilocks's
soundness_4/5 and the ALL_PRECOMPILES_PROGRAM counter loop to buffers.
- rec_aggregation/zkdsl_implem/whir.py: take main's buffer-carry structure
with goldilocks's copy_ef/DIM and 5-value get_whir_params (no folding
grinding); drop main's sumcheck_verify_with_grinding.
- Drop the KoalaBear-specific python verifier (primitives.py, verifier.py,
test_verify.py, check_whir_configs.rs) and its test-vector dumping in
test_zkvm.rs; restore goldilocks's coherent Goldilocks test program.
cargo fmt / clippy / testall all pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adapts main's 'Custom thread pool (#239)' (rayon -> parallel crate) to the
goldilocks field branch: kept goldilocks's Poseidon8/poseidon8_compress and
out2/out4 trace semantics while adopting the pool-based parallel helpers;
converted the goldilocks-only product-sumcheck rayon path to
parallel::map_reduce; dropped main's deleted compute_raw_poly_degree_split
per goldilocks; removed the dead rayon dep from mt-goldilocks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts:
#	crates/xmss/src/xmss.rs
…, drop lz4/sha3)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings main's "Clean recursion (#249)" + system-info runtime-detection
changes onto the Goldilocks field migration.

Conflict resolution (rec_aggregation zkDSL + whir/dft.rs):
- Took main's renames (circle_values->stir_points, s->eval_weights,
  logup_c/alphas->logup_gamma/beta, final_coeffcients->final_coefficients,
  paded->padded) and dead-code removals (fs_hint, fs_print_state, merkle_*
  batch helpers, set_buf_prefix_right, PP_IN_LEFT, print_ef/vec).
- Preserved goldilocks's Goldilocks migration: DIM=3/DIGEST_LEN=4,
  poseidon16->poseidon8, repurposed copy/zero helpers,
  EFFECTIVE_TWO_ADICITY, folding-grinding removal in whir.py, and the
  Goldilocks merkle-chunk dispatch cases (2/12/24/32), reconciling the two
  with main's stir_points parameter naming.

Verified: cargo build, zkDSL compiles (2^19 instructions), cargo fmt, cargo
clippy --workspace --tests (0 warnings), cargo test --all --release.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed, see branch soundcalc-goldilocks)

Co-authored-by: AKHIL MANGA <116151859+akhilmanga@users.noreply.github.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
No description provided.