Skip to content

fix: update spec-test runners for the latest leanSpec fixture schema#429

Merged
pablodeymo merged 4 commits into
mainfrom
update-spec-test-runners
Jun 10, 2026
Merged

fix: update spec-test runners for the latest leanSpec fixture schema#429
pablodeymo merged 4 commits into
mainfrom
update-spec-test-runners

Conversation

@pablodeymo

@pablodeymo pablodeymo commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Motivation

#385 switched CI to download the latest leanSpec fixtures release instead of generating them from a pinned commit. The current release carries fixture-schema changes the test runners don't understand yet, so the Test job on main fails.

Note: leanSpec's release is a single rolling latest tag — it was republished mid-review (2026-06-10 17:31 UTC), which is what the second commit addresses.

Description

Runner/harness-only changes — no consensus behavior is touched, and ethlambda already matches the spec in every affected case:

  • tickToSlot (fork-choice block steps): new flag, default true. The early-arrival tests set it to false to deliver a block ahead of the store clock and assert the clock doesn't move. The forkchoice spectest runner and the Hive test driver now only tick to the block's slot when the flag is set.
  • rejectionReason: leanSpec renamed expectException. The verify_signatures and state_transition fixture types accept both spellings via a serde alias.
  • ssz_test: the SSZ fixture format string was renamed from ssz; the runner accepts both.
  • proofSetting / mocked proofs: leanSpec now fills most vectors with placeholder aggregation proofs (MOCKED-AGGREGATION-PROOF sentinel) instead of paying for recursive SNARK merges, marking the regime in a top-level proofSetting field (0 = mocked, must not be verified; 1 = real and valid; 2 = real and invalid). The forkchoice runner routes mocked vectors through a new on_gossip_aggregated_attestation_without_verification (mirroring the on_block/on_block_without_verification split). Production paths always verify.

Known follow-ups (out of scope):

  • The Hive test driver receives fork-choice steps one at a time with no fixture-level context, so it cannot honor proofSetting yet; Hive runs against mocked-proof fixtures will need a protocol addition.
  • Tracking the rolling latest release means any upstream republish can break main's CI at any time; pinning the fixtures tarball SHA256 would restore reproducibility.

How to Test

make test   # downloads latest release fixtures, all suites green

Locally verified against the 2026-06-10 17:31 UTC fixtures drop: full cargo test --workspace --release passes (108 forkchoice, 118 SSZ, 67 STF, all remaining suites green).

CI now downloads the latest leanSpec fixtures release (#385), and the
current release includes schema changes the test runners did not yet
understand, leaving main's Test job red:

- Block steps carry a tickToSlot flag (default true). The early-arrival
  tests set it to false to deliver a block ahead of the store clock, so
  the forkchoice runner and the Hive test driver now only advance the
  clock to the block's slot when the flag is set.
- Rejection expectations were renamed from expectException to
  rejectionReason; the verify_signatures and state_transition fixture
  types accept both spellings.
- The SSZ fixture format string was renamed from ssz to ssz_test; the
  runner accepts both.

No consensus-code changes: ethlambda already imports early blocks and
rejects the invalid signatures these fixtures cover; only the harness
misread the new encoding.
@github-actions

Copy link
Copy Markdown

🤖 Kimi Code Review

Review Summary

These are test-harness improvements to handle spec fixture format changes and early-arrival block scenarios. The changes are correct, well-documented, and maintain backward compatibility.

Detailed Feedback

  1. crates/common/test-fixtures/src/fork_choice.rs:79-83

    • Good use of #[serde(default = "default_true")] to maintain compatibility with existing fixtures while allowing new early-arrival tests to opt out.
    • Documentation clearly explains the consensus behavior being tested.
  2. crates/common/test-fixtures/src/verify_signatures.rs:47-50

    • Adding #[serde(default)] here aligns with state_transition.rs and prevents deserialization failures when the field is absent. This is correct behavior for optional expectation fields.
  3. crates/common/types/tests/ssz_spectests.rs:8-18

    • The change to support multiple fixture format names is correct. Consider using lazy_static! or phf if this list grows, but for two items the slice is fine.
    • Line 15: The contains check with &test.info.fixture_format.as_str() is correct for &[&str] (requires &&str).
  4. crates/blockchain/tests/forkchoice_spectests.rs:96-102 & crates/net/rpc/src/test_driver.rs:358-363

    • The conditional on_tick logic correctly implements early-arrival testing. When tick_to_slot is false, the block is delivered with the current store time (potentially from a past slot), which is the expected behavior for testing future-slot guards.
    • Note that on_tick is only skipped when explicitly disabled, preserving existing test behavior (Item 1 default is true).
  5. General

    • Serde aliases properly handle the leanSpec fixture naming transition (expectExceptionrejectionReason).
    • Error messages are informative and helpful for debugging fixture format mismatches.

Consensus Safety Check

  • These changes only affect test infrastructure, not production consensus code. The fork-choice store logic (on_tick, on_block_without_verification) remains unchanged; only the test driver's sequencing is modified.
  • Early-arrival test support is important for validating that the future-slot rejection logic (likely in on_block) correctly handles blocks arriving before their designated slot time.

No blocking issues found. The PR is ready for merge.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions

Copy link
Copy Markdown

🤖 Codex Code Review

No findings in the changed code.

The patch is narrow and the behavior change is sensible: tickToSlot defaults to true, so existing fork-choice fixtures keep their prior semantics, while newer early-arrival block cases can now exercise the future-slot path in both the offline runner and the RPC test driver (crates/common/test-fixtures/src/fork_choice.rs:79, crates/blockchain/tests/forkchoice_spectests.rs:96, crates/net/rpc/src/test_driver.rs:358). The rejectionReason aliases are also a safe compatibility improvement for the fixture loaders (crates/common/test-fixtures/src/state_transition.rs:22, crates/common/test-fixtures/src/verify_signatures.rs:50), and accepting both ssz and ssz_test is low risk in the SSZ test harness (crates/common/types/tests/ssz_spectests.rs:8).

I could not run the targeted Rust tests in this environment because cargo needs write access under ~/.cargo/~/.rustup and network access for dependencies, both of which are blocked here. Residual risk is therefore limited to unexecuted fixture coverage rather than anything suspicious in the diff itself.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR updates three spec-test runners and two fixture-type definitions to handle schema changes introduced by the latest leanSpec fixture release, without touching any consensus logic. All failures on main (one forkchoice test, 9 signature spectests, 118 SSZ spectests) are addressed by these harness-only adjustments.

  • tickToSlot flag (fork_choice.rs, forkchoice_spectests.rs, test_driver.rs): new bool field on ForkChoiceStep, defaulting to true to preserve existing behaviour; both the offline runner and the Hive test driver now skip on_tick when the flag is false, allowing early-arrival blocks to be delivered without advancing the store clock.
  • rejectionReason alias (state_transition.rs, verify_signatures.rs): expect_exception now deserializes from either expectException (legacy) or rejectionReason (new leanSpec name) via serde's alias attribute.
  • SSZ format string (ssz_spectests.rs): the runner now accepts both "ssz" (pinned-commit fixtures) and "ssz_test" (current release) as valid format identifiers.

Confidence Score: 5/5

All changes are confined to test runners and fixture deserialization types; no consensus, store, or production code paths are modified.

Every change is a backward-compatible extension: the new tick_to_slot field defaults to true so existing fixtures are unaffected, serde alias is additive for deserialization, and the SSZ format check is widened to a two-element allow-list. The symmetry between forkchoice_spectests.rs and test_driver.rs is maintained, and the author confirmed a full cargo test --workspace --release pass against the latest release fixtures.

No files require special attention.

Important Files Changed

Filename Overview
crates/common/test-fixtures/src/fork_choice.rs Adds tick_to_slot: bool to ForkChoiceStep with default = "default_true", correctly reusing the existing helper; default preserves all current test behaviour.
crates/blockchain/tests/forkchoice_spectests.rs Guards on_tick behind step.tick_to_slot; block import and outcome assertion are unaffected when the flag is false.
crates/net/rpc/src/test_driver.rs Mirrors the spec-test runner change: on_tick is skipped when tick_to_slot is false; no other logic is touched.
crates/common/test-fixtures/src/state_transition.rs Adds alias = "rejectionReason" to the already-default-annotated expect_exception field; clean backward-compatible addition.
crates/common/test-fixtures/src/verify_signatures.rs Adds default and alias = "rejectionReason" to expect_exception; default is technically redundant for Option<T> in serde_json but makes intent explicit.
crates/common/types/tests/ssz_spectests.rs Replaces single-string constant with a two-element slice; format check updated to SUPPORTED_FIXTURE_FORMATS.contains(…) with an improved error message.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[block step received] --> B{tick_to_slot?}
    B -- true default --> C[on_tick to block slot]
    C --> D[on_block_without_verification]
    B -- false early-arrival test --> D
    D --> E{result.is_ok?}
    E -- yes --> F[assert valid == true]
    E -- no --> G[assert valid == false]

    H[fixture field: expectException or rejectionReason] --> I[serde alias: both accepted]

    J[SSZ fixture_format field] --> K{format value?}
    K -- ssz --> L[accepted]
    K -- ssz_test --> L
    K -- other --> M[error: unsupported format]
Loading

Reviews (1): Last reviewed commit: "Update spec-test runners for the latest ..." | Re-trigger Greptile

@github-actions

Copy link
Copy Markdown

🤖 Claude Code Review

Here is the full review:


PR Review: fix: update spec-test runners for the latest leanSpec fixture schema

Scope: Test harness / fixture deserialization only. No consensus logic, state transition, fork choice, or cryptography code is modified. This simplifies the review considerably.


Correctness

tick_to_slot guard (fork_choice.rs, forkchoice_spectests.rs, test_driver.rs)

The design is correct. default = "default_true" preserves the old always-tick behavior for existing fixtures that don't carry the field, while new early-arrival fixtures can set it to false to hold the store clock in place. The default_true helper is already used for the valid field in the same struct, so reusing it is clean.

Both call sites (forkchoice_spectests.rs and test_driver.rs) were updated identically — no drift between the two runners.

serde alias for rejectionReason / expectException

This fixes a silent correctness failure, not just a deserialization error. Without the alias:

  • Fixtures using rejectionReason would silently produce expect_exception = None
  • Tests that expect a rejection would be evaluated as if they expect success
  • If the implementation correctly rejects the block, the test fails — but for the wrong reason (the harness expected Ok, not the fixture's intended Err)

The #[serde(rename = "expectException", alias = "rejectionReason")] combination is the right tool: rename keeps the primary deserialization key as the old name (backward-compat), alias adds the new name. state_transition.rs already had default; adding it to verify_signatures.rs is correct and consistent.

Minor note: #[serde(default)] is technically redundant on Option<String> — serde already defaults missing Option fields to None — but it makes intent explicit and matches state_transition.rs. No problem.

SSZ format string (ssz_spectests.rs)

The &[&str].contains(&str) pattern is idiomatic and correct. With two elements, .contains() is a straightforward linear scan — no concern there. The updated error message with {:?} on the slice is more informative than the old single-value message.

One pre-existing behavior preserved by this PR: the run function returns early on the first unsupported format case within a fixture file. If a fixture file ever mixes supported and unsupported formats (unlikely in practice), earlier tests would still run before the error surfaces. This isn't introduced by this PR and is acceptable for a test harness.


Consistency

The tick_to_slot guard is applied in both forkchoice_spectests.rs (offline runner) and test_driver.rs (Hive driver), and the field is defined once in fork_choice.rs (shared by both). This is the correct factoring.


Minor Observation

The doc comment at verify_signatures.rs:115–116 (inside impl From<TestSignedBlock>) still references expectException by name. That comment is about panic-on-proxy-failure semantics, not the JSON field name, so it remains accurate — but a future reader will see both spellings in the file and may wonder about the discrepancy. A one-word update like "Tests with expect_exception set..." (the Rust field name) would be less confusing.


Summary

The changes are correct, backward-compatible, and well-scoped. The most important fix is the serde alias that prevents silent test-correctness failures on rejection-case fixtures. LGTM.


Automated review by Claude (Anthropic) · sonnet · custom prompt

leanSpec now generates most vectors with placeholder aggregation proofs
("MOCKED-AGGREGATION-PROOF" sentinel bytes) so filling no longer pays
for recursive SNARK merges; a top-level proofSetting field marks the
regime (0 = mocked, must not be verified; 1 = real and valid; 2 = real
and invalid). Vectors that actually exercise proof validity opt back
into real crypto.

Parse proofSetting (defaulting to real-and-valid for older fixtures)
and route gossipAggregatedAttestation steps of mocked vectors through a
new on_gossip_aggregated_attestation_without_verification, mirroring
the existing on_block/on_block_without_verification split. Production
paths are unchanged and always verify.
Comment on lines +8 to +10
/// Newer leanSpec releases name the format `ssz_test`; the pinned-commit
/// fixtures still use `ssz`. Accept both.
const SUPPORTED_FIXTURE_FORMATS: &[&str] = &["ssz", "ssz_test"];

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If newer leanSpec releases use ssz_test, we should only support that

@MegaRedHand MegaRedHand left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +404 to +409
pub fn on_gossip_aggregated_attestation_without_verification(
store: &mut Store,
aggregated: SignedAggregatedAttestation,
) -> Result<(), StoreError> {
on_gossip_aggregated_attestation_core(store, aggregated, false)
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a better way to do this.

  The ssz spelling was renamed to ssz_test in leanSpec's fixture
  consistency pass (leanSpec #864, merged 2026-06-06) and CI always
  downloads the latest release, so the old name can never appear again.
  Addresses review feedback.
@pablodeymo pablodeymo merged commit e9fa1f8 into main Jun 10, 2026
2 checks passed
@pablodeymo pablodeymo deleted the update-spec-test-runners branch June 10, 2026 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants