Goldilocks#210

Open
TomWambsgans wants to merge 79 commits into main from goldilocks

@TomWambsgans

Collaborator

No description provided.

TomWambsgans and others added 30 commits April 15, 2026 18:18
Co-authored-by: Copilot <copilot@github.com>
w
Co-authored-by: Copilot <copilot@github.com>
Bring main's MTU-XMSS structure (tweak table, public_param, T-Sponge with
replacement) into the goldilocks branch with all poseidon-related sizes
halved:

  field-element widths    main (KoalaBear)    goldilocks
  --------------------    ----------------    ----------
  TWEAK_LEN               2                   1
  XMSS_DIGEST_LEN         4                   2
  RANDOMNESS_LEN_FE       6                   3
  MESSAGE_LEN_FE          8                   4
  PUBLIC_PARAM_LEN_FE     4                   2
  POSEIDON1_WIDTH         16                  8
  DIGEST_LEN_FE           8                   4

Tweak table slots are 2 FE (1 actual tweak FE + 1 zero pad). The packed
tweak fits in a single 64-bit Goldilocks element via
`(tweak_type << 42) | (sub_position << 32) | index`.
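
A minimal sketch of the packing above, for illustration only. The helper names
`pack_tweak` / `unpack_tweak` are hypothetical, and the field widths (32-bit
index, 10-bit sub_position) are inferred from the shifts, not taken from the
code in this PR:

```python
# Goldilocks prime p = 2^64 - 2^32 + 1.
GOLDILOCKS_P = (1 << 64) - (1 << 32) + 1

# Widths inferred from the shift amounts (assumptions, not from the PR code).
INDEX_BITS = 32         # bits 0..31
SUB_POSITION_BITS = 10  # bits 32..41


def pack_tweak(tweak_type: int, sub_position: int, index: int) -> int:
    """Pack a tweak as (tweak_type << 42) | (sub_position << 32) | index."""
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index does not fit in 32 bits")
    if not 0 <= sub_position < (1 << SUB_POSITION_BITS):
        raise ValueError("sub_position does not fit in 10 bits")
    if tweak_type < 0:
        raise ValueError("tweak_type must be non-negative")
    packed = (tweak_type << 42) | (sub_position << 32) | index
    # The packed value must be a canonical field element to fit in one FE.
    if packed >= GOLDILOCKS_P:
        raise ValueError("packed tweak is not a canonical Goldilocks element")
    return packed


def unpack_tweak(packed: int) -> tuple[int, int, int]:
    """Inverse of pack_tweak: returns (tweak_type, sub_position, index)."""
    index = packed & ((1 << INDEX_BITS) - 1)
    sub_position = (packed >> INDEX_BITS) & ((1 << SUB_POSITION_BITS) - 1)
    tweak_type = packed >> (INDEX_BITS + SUB_POSITION_BITS)
    return tweak_type, sub_position, index
```

Because the three fields occupy disjoint bit ranges, the packing is injective
as long as each field stays within its range.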

Port main's poseidon precompile features (`half_output`,
`hardcoded_offset_left`) from Poseidon16 to Poseidon8, with new committed
columns for the flags and `effective_index_left_first/second`. The
half-output trace tail values are filled in a post-pass from
`memory_padded` (lookup-only — the AIR doesn't constrain them).

Encoding decomposition uses the goldilocks-proven 21 chunks of W=3 bits
per FE with a factored 1-bit canonical check
`(diff)·(diff − 2^63) == 0`, applied to the first 2 of 4 output FE for
exactly V = 42 chunks (no V_GRINDING).
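
One plausible reading of the check above, sketched in plain Python: the 21
chunks cover the low 63 bits of the element, so the difference between the
element and the recomposed chunks is either 0 or 2^63. The function names are
hypothetical and this is not the AIR from the PR:

```python
GOLDILOCKS_P = (1 << 64) - (1 << 32) + 1
W = 3                            # bits per chunk
CHUNKS_PER_FE = 21               # 21 * 3 = 63 bits
TOP = 1 << (W * CHUNKS_PER_FE)   # 2^63


def decompose(fe: int) -> list[int]:
    """Split the low 63 bits of a canonical element into 21 chunks of 3 bits."""
    assert 0 <= fe < GOLDILOCKS_P
    return [(fe >> (W * i)) & ((1 << W) - 1) for i in range(CHUNKS_PER_FE)]


def canonical_check(fe: int, chunks: list[int]) -> bool:
    """Evaluate (diff) * (diff - 2^63) == 0 over the field."""
    if len(chunks) != CHUNKS_PER_FE:
        return False
    if any(not 0 <= c < (1 << W) for c in chunks):
        return False
    recomposed = sum(c << (W * i) for i, c in enumerate(chunks))
    diff = (fe - recomposed) % GOLDILOCKS_P
    return (diff * (diff - TOP)) % GOLDILOCKS_P == 0


def encode(output_fes: list[int]) -> list[int]:
    """Chunks of the first 2 of 4 output elements: V = 2 * 21 = 42."""
    chunks = []
    for fe in output_fes[:2]:
        part = decompose(fe)
        assert canonical_check(fe, part)
        chunks.extend(part)
    return chunks
```

The factored form replaces a separate boolean column for the top bit: the
product vanishes exactly when `diff` is one of the two allowed values.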

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TomWambsgans and others added 5 commits May 26, 2026 01:46
# Conflicts:
#	crates/lean_compiler/snark_lib.py
#	crates/lean_compiler/tests/test_compiler.rs
#	crates/lean_compiler/tests/test_data/program_166.py
#	crates/lean_compiler/zkDSL.md
#	crates/rec_aggregation/zkdsl_implem/hashing.py
#	crates/rec_aggregation/zkdsl_implem/main.py
# Conflicts:
#	crates/lean_prover/src/lib.rs
#	crates/lean_prover/src/test_zkvm.rs
# Conflicts:
#	crates/backend/fiat-shamir/src/challenger.rs
#	crates/backend/fiat-shamir/tests/grinding.rs
#	crates/backend/sumcheck/src/product_computation.rs
#	crates/lean_prover/src/verify_execution.rs
#	crates/rec_aggregation/src/bytecode_claims.rs
#	crates/rec_aggregation/src/type_2_aggregation.rs
#	crates/rec_aggregation/zkdsl_implem/fiat_shamir.py
#	crates/rec_aggregation/zkdsl_implem/main.py
#	crates/sub_protocols/src/quotient_gkr/mod.rs
#	crates/utils/src/wrappers.rs
#	crates/whir/tests/run_whir.rs
Adapt main's column/flag renames (e.g. POSEIDON_*COL_INDEX_INPUT_LEFT ->
POSEIDON_*COL_NU_A, EXT_OP_FLAG_MUL -> EXT_OP_FLAG_DOT_PRODUCT,
ExtensionOp::PolyEq -> Eq, COL_COMP -> COL_ACC, etc.) to the
goldilocks-specific code that uses Poseidon8 and cubic (DIMENSION=3)
extension. Drop the KoalaBear-targeted python verifier and its
check_whir_configs test, which don't apply to the goldilocks branch
(folding_pow_bits was removed in goldilocks; WHIR_CONFIGS and Fp
primitives are KoalaBear-specific).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TomWambsgans force-pushed the main branch 2 times, most recently from c5a3050 to 9dc5d68 (May 28, 2026 12:02)
TomWambsgans and others added 8 commits May 29, 2026 01:06
Adopt main's overwrite (permutation-based) sponge hashing on the Goldilocks
branch, keeping Goldilocks field types throughout (WIDTH 8, RATE 4, DIGEST 4,
poseidon8, cubic extension).

Key reconciliations:
- utils/symetric/whir/fiat-shamir: overwrite sponge (hash_slice_rtl,
  precompute_zero_suffix_state), poseidon_hash_slice, two-perm merkle_verify.
- Poseidon table: kept the Goldilocks x^7 permutation AIR (sparse partial
  rounds) but adopted main's I/O interface, halved: 3-way output gating via
  flag_out2/flag_out4, added permute_half, unified Davies-Meyer output gates.
  New precompile set: poseidon8_compress_half/_quarter (+_hardcoded_left),
  poseidon8_permute/_permute_half (+_hardcoded_left).
- Compiler, instruction encoder/display, prover trace post-pass updated to the
  new flags and names.
- zkDSL verifier (hashing.py, main.py, xmss_aggregate.py) and XMSS signer
  (wots.rs) switched to the overwrite sponge; encoding decomposition and
  gl-specific constants (copy_ef/copy_digest, NUM_ENCODING_FE, strides) kept.
- Dropped python-verifier (removed on goldilocks).
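
For readers unfamiliar with the overwrite mode, a minimal sketch of how it
differs from an additive sponge, using WIDTH 8 / RATE 4 / DIGEST 4 as above.
The permutation is passed in as a parameter (a stand-in, not poseidon8), and
the padding, the absorption order, and which lanes are overwritten are all
assumptions rather than what hash_slice_rtl actually does:

```python
WIDTH, RATE, DIGEST = 8, 4, 4


def overwrite_sponge_hash(data, permute):
    """Absorb by overwriting the rate lanes, then permute (sketch only)."""
    # Padding is out of scope here: require whole blocks.
    assert len(data) % RATE == 0
    state = [0] * WIDTH
    for start in range(0, len(data), RATE):
        # Overwrite mode: the block replaces the rate lanes outright.
        state[:RATE] = data[start:start + RATE]
        state = permute(state)
    return state[:DIGEST]


def additive_sponge_hash(data, permute, modulus):
    """Classic sponge for contrast: the block is added into the rate lanes."""
    assert len(data) % RATE == 0
    state = [0] * WIDTH
    for start in range(0, len(data), RATE):
        for i in range(RATE):
            state[i] = (state[i] + data[start + i]) % modulus
        state = permute(state)
    return state[:DIGEST]
```

With an identity "permutation" the difference is easy to see: the overwrite
variant returns the last block unchanged, while the additive variant returns
the lane-wise sum of all blocks.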

Validated: workspace builds, fmt+clippy clean, poseidon AIR proves/verifies,
compiler + lean_prover + xmss + sub_protocols tests pass, aggregation bytecode
compiles, and end-to-end recursive XMSS aggregation proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adopt main's rec_aggregation rename (type 1/2 -> single/multi-message) and the
new bytecode-claim Fiat-Shamir handling, keeping Goldilocks field types/sizes.

Reconciliations:
- bytecode_claims.rs: adopt main's direct claim ingestion into Fiat-Shamir
  (build_bytecode_claims_ingested_by_fiatshamir + observe_scalars, dropping
  hash_bytecode_claims), but keep Goldilocks poseidon8 (get_poseidon8).
- multi_message_aggregation.rs: adopt build_multi_message_input_data name,
  keep DIGEST_LEN-generic layout comment.
- zkdsl main.py: adopt single/multi-message naming and main's direct-ingestion
  reduce_bytecode_claims, but keep Goldilocks DIGEST_LEN-generic copy_digest
  loops (instead of main's hardcoded copy_8/copy_32) and SINGLE_MESSAGE flag
  placeholders provided by compilation.rs.
- zkdsl hashing.py: slice_hash_continue uses poseidon8_permute_half (not the
  KoalaBear poseidon16 variant that auto-merged in).

Validated: workspace builds, fmt+clippy clean, cargo testall passes, and
end-to-end recursion (n=2) proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…idon1-8

Implemented and measured the Appendix-B sparse partial-round decomposition
(the same one the AIR/trace-gen and the KoalaBear-16 permutation use) in the
AVX-512 permutation. It is ~13% slower for Goldilocks: this circulant MDS has
tiny entries {1,3,4,7,8,9} that strength-reduce to shifts/adds and batch 8
terms into a single reduce128 per output, while the sparse form needs
arbitrary-constant 64x64 multiplies (one reduce128 each → 15 vs 8 reductions
per partial round). Reverted the implementation, kept a comment so the dead
end isn't re-explored.
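
A scalar sketch of why the dense circulant form is cheap: each small constant
becomes shifts and adds, the 8 terms are accumulated unreduced, and one
reduction is applied per output. The first row below is illustrative only (it
uses entries from {1,3,4,7,8,9} but is an assumption, not read from this PR),
and this is plain Python rather than the AVX-512 code:

```python
P = (1 << 64) - (1 << 32) + 1
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

# Illustrative circulant first row (assumption, not taken from the PR).
ROW = [7, 1, 3, 8, 8, 3, 4, 9]


def reduce128(x: int) -> int:
    """Reduce x < 2^128 using 2^64 = 2^32 - 1 and 2^96 = -1 (mod P)."""
    assert 0 <= x < (1 << 128)
    lo = x & MASK64
    hi_lo = (x >> 64) & MASK32
    hi_hi = x >> 96
    return (lo + hi_lo * MASK32 - hi_hi) % P


def mul_small(x: int, c: int) -> int:
    """Multiply by a small constant using only shifts and adds/subs."""
    if c == 1:
        return x
    if c == 3:
        return (x << 1) + x
    if c == 4:
        return x << 2
    if c == 7:
        return (x << 3) - x
    if c == 8:
        return x << 3
    if c == 9:
        return (x << 3) + x
    raise ValueError("constant not in the MDS entry set")


def circulant_mds(state: list[int]) -> list[int]:
    """Dense circulant product: 8 unreduced terms, one reduce128 per output."""
    out = []
    for i in range(8):
        acc = sum(mul_small(state[j], ROW[(j - i) % 8]) for j in range(8))
        out.append(reduce128(acc))
    return out


def circulant_mds_reference(state: list[int]) -> list[int]:
    """Reference: ordinary multiplies with a plain modular reduction."""
    return [sum(state[j] * ROW[(j - i) % 8] for j in range(8)) % P
            for i in range(8)]
```

The accumulator stays below 2^70 for canonical inputs (the row sums to 43), so
a single 128-bit reduction per output is enough.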

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cubic_mul_generic is the hottest field op in the prover after the poseidon
permutation (~15% of an xmss prove, via sumcheck eq-eval and the poseidon AIR
constraint eval). On Goldilocks each multiply carries a 128->64-bit reduction
(the dominant cost), so 3-term Karatsuba trading 3 of the 9 multiplies for
cheap field adds/subs is a net win across all packed backends.

Measured: xmss --n-signatures 1550 --log-inv-rate 1 goes 392-394 -> 400-402
XMSS/s (~2%). Verified against the schoolbook reference (10k scalar + 2k packed
random inputs) and end-to-end recursion still proves+verifies.
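
A sketch of the 3-term Karatsuba product (6 base-field multiplies instead of
9), checked against a schoolbook reference. It computes only the unreduced
degree-4 product; the reduction by the cubic extension's defining polynomial
is omitted because the commit message does not state it. Function names are
hypothetical, not cubic_mul_generic itself:

```python
P = (1 << 64) - (1 << 32) + 1


def schoolbook(a, b):
    """Reference product of two degree-2 polynomials: 9 multiplies."""
    c = [0] * 5
    for i in range(3):
        for j in range(3):
            c[i + j] = (c[i + j] + a[i] * b[j]) % P
    return c


def karatsuba3(a, b):
    """Same product with 6 multiplies plus cheap adds/subs."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    # Three "diagonal" products.
    d0 = a0 * b0 % P
    d1 = a1 * b1 % P
    d2 = a2 * b2 % P
    # Three products of pairwise sums; cross terms fall out by subtraction.
    d01 = (a0 + a1) * (b0 + b1) % P
    d02 = (a0 + a2) * (b0 + b2) % P
    d12 = (a1 + a2) * (b1 + b2) % P
    return [
        d0,
        (d01 - d0 - d1) % P,       # a0*b1 + a1*b0
        (d02 - d0 - d2 + d1) % P,  # a0*b2 + a1*b1 + a2*b0
        (d12 - d1 - d2) % P,       # a1*b2 + a2*b1
        d2,
    ]
```

The trade is worthwhile only when a multiply is much more expensive than an
add, which is the situation the commit message describes for Goldilocks.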

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TomWambsgans and others added 12 commits June 4, 2026 15:57
Bring in main's 13 post-merge-base commits (python-verifier cleanup,
zkDSL compiler fixes, doc fixes, generic perf opts) while preserving the
goldilocks field migration (Goldilocks + cubic extension, Poseidon8,
128-bit security, folding-pow-grinding removed).

Conflict resolutions:
- backend/air (lib.rs, constraint_folder/packed.rs): keep goldilocks's
  removal of the degree-split low_degree_block/skip_low machinery; take
  main's #[inline(always)] on the methods goldilocks keeps.
- backend/sumcheck/product_computation.rs: keep goldilocks's retained
  compute_product_sumcheck_polynomial_base_ext_packed (main deleted it as
  dead; goldilocks keeps it for a planned Goldilocks optimization).
- whir/merkle.rs: apply main's "remove Matrix trait" refactor (DenseMatrix
  -> Matrix struct, no M generic) onto goldilocks's Goldilocks/Poseidon8
  field logic.
- lean_vm isa/instruction.rs + tables/poseidon/mod.rs: keep goldilocks's
  POSEIDON8 naming; adopt main's intent of a generic table name ("poseidon8").
- Adopt main's no-loop-carried-mutables rule: rewrite goldilocks's
  soundness_4/5 and the ALL_PRECOMPILES_PROGRAM counter loop to buffers.
- rec_aggregation/zkdsl_implem/whir.py: take main's buffer-carry structure
  with goldilocks's copy_ef/DIM and 5-value get_whir_params (no folding
  grinding); drop main's sumcheck_verify_with_grinding.
- Drop the KoalaBear-specific python verifier (primitives.py, verifier.py,
  test_verify.py, check_whir_configs.rs) and its test-vector dumping in
  test_zkvm.rs; restore goldilocks's coherent Goldilocks test program.

cargo fmt / clippy / testall all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adapts main's 'Custom thread pool (#239)' (rayon -> parallel crate) to the
goldilocks field branch: kept goldilocks's Poseidon8/poseidon8_compress and
out2/out4 trace semantics while adopting the pool-based parallel helpers;
converted the goldilocks-only product-sumcheck rayon path to parallel::map_reduce;
dropped main's deleted compute_raw_poly_degree_split per goldilocks; removed the
dead rayon dep from mt-goldilocks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts:
#	crates/xmss/src/xmss.rs
…, drop lz4/sha3)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings main's "Clean recursion (#249)" + system-info runtime-detection
changes onto the Goldilocks field migration.

Conflict resolution (rec_aggregation zkDSL + whir/dft.rs):
- Took main's renames (circle_values->stir_points, s->eval_weights,
  logup_c/alphas->logup_gamma/beta, final_coeffcients->final_coefficients,
  paded->padded) and dead-code removals (fs_hint, fs_print_state, merkle_*
  batch helpers, set_buf_prefix_right, PP_IN_LEFT, print_ef/vec).
- Preserved goldilocks's Goldilocks migration: DIM=3/DIGEST_LEN=4,
  poseidon16->poseidon8, repurposed copy/zero helpers, EFFECTIVE_TWO_ADICITY,
  folding-grinding removal in whir.py, and the Goldilocks merkle-chunk
  dispatch cases (2/12/24/32), reconciling the two with main's stir_points
  parameter naming.

Verified: cargo build, zkDSL compiles (2^19 instructions), cargo fmt,
cargo clippy --workspace --tests (0 warnings), cargo test --all --release.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed, see branch soundcalc-goldilocks)

Co-authored-by: AKHIL MANGA <116151859+akhilmanga@users.noreply.github.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>