Skip to content

test: add new SSZ spec-tests runner#282

Open
MegaRedHand wants to merge 4 commits intodevnet4from
ssz-genesis-justification-spectests
Open

test: add new SSZ spec-tests runner#282
MegaRedHand wants to merge 4 commits intodevnet4from
ssz-genesis-justification-spectests

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

This PR adds a test runner for the new SSZ spec-test format. It also adds some new post-state checks to the STF spec-test runner.

@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

This PR adds SSZ consensus spec tests and improves state transition test coverage with label-based block root resolution. The changes are primarily additive and don't modify consensus logic.

Issues Found:

1. Potential panics on malformed test fixtures (crates/common/types/tests/ssz_types.rs)

Multiple unwrap() and expect() calls in From implementations will panic on malformed fixtures instead of returning descriptive errors. Since these parse external test data, they should use TryFrom or return errors.

  • Lines 330, 332-338, 340: SszList::try_from(...).unwrap() in TestState conversion
  • Lines 345, 350: justified_slots.push(b).unwrap() and justifications_validators.push(b).unwrap()
  • Line 408: SszList::try_from(attestations).expect("too many attestations")
  • Line 455: bits.push(b).unwrap()
  • Line 537: AttestationSignatures::try_from(...).expect(...)
  • Line 550: ByteListMiB::try_from(...).expect(...)
  • Line 564: hex::decode(...).expect(...) in HexByteList conversion

Recommendation: Convert these to TryFrom implementations or map errors to datatest_stable::Result to provide context when fixtures exceed SSZ list bounds.

2. Redundant dependency (crates/common/types/Cargo.toml)

  • Line 28: datatest-stable = "0.3.3" is added to types/Cargo.toml, but Cargo.lock shows it's already a transitive dependency. If this is intentional for explicit versioning, consider using workspace = true if other crates use it, otherwise ensure version consistency.

3. Unused import (crates/common/types/tests/ssz_spectests.rs)

  • Line 3: use ethlambda_types::primitives::HashTreeRoot; is imported but later referenced via fully qualified syntax on line 131. Either use the trait method directly (domain_value.hash_tree_root()) or keep the import and remove the HashTreeRoot:: prefix.

Correctness Verification:

The label-based root resolution in stf_spectests.rs (lines 35-44) correctly:

  • Uses 1-indexed labels ("block_1" for array index 0)
  • Registers block roots before state transition, ensuring post-state checks can reference the current block
  • Provides helpful error messages listing available labels in resolve_label (lines 47-57)

The SSZ round-trip tests (lines 94-123) correctly verify:

  • Canonical encoding matches fixture bytes
  • Decoding + re-encoding is identity (round-trip)
  • Hash tree root calculation matches expected Merkle roots

Security: No vulnerabilities introduced. The HashTreeRoot derives added to SignedBlock, BlockSignatures, and attestation types (lines in attestation.rs and block.rs) are standard SSZ container hashing operations required for consensus.

Recommendation: Address the panics in test fixture parsing (Item 1) to improve debuggability when specs change or fixtures are malformed. The consensus logic additions are correct.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Now I have a complete picture of the PR. Here is my review:


PR #282 Review: Add SSZ Spec-Test Runner

Overview

This PR adds a comprehensive SSZ spec-test runner (crates/common/types/tests/ssz_spectests.rs) that validates: JSON deserialization → SSZ encoding → round-trip decode → hash tree root. It also extends the STF spec-test runner with label-based root assertions and collection count checks. The structural approach is sound and the 4-step validation pipeline is thorough.


Critical Issue

datatest-stable added as a production dependency instead of dev-dependency

In crates/common/types/Cargo.toml, the new line:

datatest-stable = "0.3.3"

appears in the [dependencies] section (after rand.workspace = true), not [dev-dependencies]. This compiles a test harness framework into every build that links ethlambda-types. It should be:

[dev-dependencies]
datatest-stable = "0.3.3"

The Cargo.lock confirms it was added to ethlambda-types dependencies, not dev-dependencies. This is the single blocking issue.


Moderate Issues

Non-deterministic test ordering in SszTestVector

SszTestVector uses HashMap<String, SszTestCase> (ssz_types.rs:437), so iteration order varies between runs. Test output like "Running SSZ test: {name}" will appear in arbitrary order, making it harder to correlate failures with fixture positions. A BTreeMap<String, SszTestCase> or a Vec<(String, SszTestCase)> via a custom deserializer would give stable ordering. Since JSON objects are unordered this may be a spec constraint, but it's worth noting.

Panics in fixture conversion produce poor error messages

Several conversions in ssz_types.rs panic without context on malformed fixtures:

  • ssz_types.rs:908: hex::decode(stripped).expect("invalid hex in proof data") — in HexByteList::From<Vec<u8>>, which is called during Into<D> inside run_typed_test. A panic here produces no test name or fixture path. Making HexByteList::from return a Result and threading it through would give a proper datatest_stable::Error.
  • ssz_types.rs:731: .expect("too many attestations") and ssz_types.rs:876: .expect("too many attestation signatures") — same concern. These will panic on oversized fixtures with no file/test context.
  • ssz_types.rs:894: .expect("proof data too large")
  • ssz_types.rs:665, 671: justified_slots.push(b).unwrap() / justifications_validators.push(b).unwrap() — will panic if fixture bitlist exceeds type capacity.

For production test infrastructure these are acceptable in the short term, but they make fixture debugging painful.


Minor Observations

SUPPORTED_FIXTURE_FORMAT check fails the whole file on mismatch

In ssz_spectests.rs:261–265, an unsupported fixtureFormat returns Err(...), failing the entire test file. This is stricter than the unsupported typeName path which gracefully skips with a println!. If the spec later introduces new formats in existing files this will hard-fail rather than warn. Probably intentional but worth a comment explaining the design choice.

Label-based vs. direct checks are not mutually exclusive

In types.rs:154–171, both justifications_roots (direct value) and justifications_roots_labels (label-based) can be set simultaneously in a fixture. Both checks in compare_post_states would run. This works correctly but could produce confusing double-failure messages if there's a bug. A comment noting that they are independent checks (not alternatives) would help future fixture authors.

Block registry includes the root of the failing block

In stf_spectests.rs:39–45, block_registry.insert(label, block.hash_tree_root()) runs before state_transition(...) and before the break. So a failing block's root is still registered. This is correct — label resolution should match block identity, not application success — but the comment could say this explicitly.


What's Done Well

  • The 4-step SSZ validation pipeline (JSON → encode → round-trip → hash tree root) is comprehensive and matches the spec's intent.
  • resolve_label at stf_spectests.rs:59–71 gives a clear error including all available keys — good diagnostic quality.
  • HashTreeRoot derives added to SignedAttestation, SignedAggregatedAttestation, SignedBlock, BlockSignatures, and AggregatedSignatureProof are all appropriate.
  • Clean fixture/domain type separation in ssz_types.rs keeps domain types unaffected.
  • 1-indexed block labels ("block_1" for index 0) are documented in both comment and code, preventing off-by-one confusion.
  • The graceful "Skipping unsupported type" path at ssz_spectests.rs:325–328 is forward-compatible.

Summary

One blocking issue: move datatest-stable to [dev-dependencies]. Everything else is non-blocking — the moderate panics are acceptable test-code trade-offs but should be tracked.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

  1. Medium: ssz_types.rs deserializes SignedBlock from a block field only. The existing signature spectest loader already had to accept message as an alias for the same shape in signature_types.rs. If the SSZ fixtures use the canonical message key, every SignedBlock case will fail to deserialize. Add #[serde(alias = "message")] here as well, or rename the field to match the fixture schema.

  2. Low: ssz_spectests.rs silently skips unknown typeNames. For a conformance harness, that is risky: adding new consensus containers to the fixture set can leave CI green while coverage quietly drops. I’d fail on unknown types, or at least keep an explicit allowlist and assert that nothing else appears.

  3. Low: several new SSZ fixture conversions panic instead of returning a per-test failure, for example ssz_types.rs, ssz_types.rs, and ssz_types.rs. In datatest, a panic aborts the binary and obscures which vector failed. Prefer propagating Result so malformed or oversized fixtures report cleanly.

I did not see a production consensus-logic regression in the HashTreeRoot derives themselves; the main concerns are test harness robustness and fixture compatibility.

I couldn’t run cargo test in this sandbox because Cargo/rustup needs network access and writable global cache directories.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR adds an SSZ spec-test runner for the ethlambda-types crate that validates SSZ encode/decode round-trips and hash tree roots for all consensus container types. It also extends the existing STF spec-test runner with a compare_post_states function that performs detailed, partially-optional field-level verification of post-state values including support for label-based block-root resolution.

Confidence Score: 5/5

Safe to merge — all findings are P2 style/consistency suggestions with no correctness impact on production code.

The only finding is a P2 inconsistency in iteration bounds between two bitlist checks in test infrastructure; it does not affect production logic or cause data loss. All other changes (new derives, new test runners, Cargo.toml update) are clean and well-structured.

crates/blockchain/state_transition/tests/stf_spectests.rs — the justified_slots vs justifications_validators iteration bound inconsistency warrants a quick look before the fixture suite grows.

Important Files Changed

Filename Overview
crates/common/types/tests/ssz_spectests.rs New SSZ spec-test runner that validates encode/decode round-trips and hash tree roots for all consensus container types using datatest-stable.
crates/common/types/tests/ssz_types.rs Fixture types for SSZ spec tests; mirrors test-fixtures crate (which can't be reused due to circular dependency) with SSZ-native boolean bitlist representation for justified_slots.
crates/blockchain/state_transition/tests/stf_spectests.rs Adds compare_post_states with thorough optional field checks; minor inconsistency in iteration bounds between justified_slots and justifications_validators comparisons.
crates/blockchain/state_transition/tests/types.rs New PostState struct with deny_unknown_fields and many optional fields for partial post-state assertion; well-designed for incremental fixture coverage.
crates/common/types/src/attestation.rs Adds HashTreeRoot derive to SignedAttestation and SignedAggregatedAttestation, required by the new SSZ spec-test runner's run_typed_test constraint.
crates/common/types/Cargo.toml Adds datatest-stable 0.3.3 as a dev-dependency and registers the new ssz_spectests integration test binary with harness = false.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[JSON fixture file] --> B[SszTestVector::from_file]
    B --> C{For each test case}
    C --> D{fixture_format == ssz?}
    D -- No --> E[Return Err: unsupported format]
    D -- Yes --> F[run_ssz_test]
    F --> G{Match type_name}
    G -- known type --> H[run_typed_test<F, D>]
    G -- unknown type --> I[Skip with println]
    H --> J[1: JSON → fixture type F → domain type D]
    J --> K[2: D::to_ssz == expected_bytes?]
    K -- No --> L[Return Err: encoding mismatch]
    K -- Yes --> M[3: from_ssz_bytes → re-encode == expected_bytes?]
    M -- No --> N[Return Err: round-trip mismatch]
    M -- Yes --> O[4: hash_tree_root == expected_root?]
    O -- No --> P[Return Err: root mismatch]
    O -- Yes --> Q[Test passes]
Loading

Comments Outside Diff (1)

  1. crates/blockchain/state_transition/tests/stf_spectests.rs, line 227-258 (link)

    P2 Inconsistent iteration bounds between bitlist checks

    justified_slots iterates 0..justified_slots.data.len() (expected length), giving "check only the first N bits" semantics — extra bits in the actual state are silently ignored. justifications_validators iterates 0..post_state.justifications_validators.len() (actual state length), giving exact-equality semantics — if the state has more or fewer bits than justifications_validators.data, the Vec length differs and the comparison always fails even when all fixture-specified bits match.

    Pick one consistent strategy: for partial/prefix checking use the expected data length throughout; for exact-equality checking use the state length throughout (or add a length assertion separately). The current mismatch means the two fields will behave differently under any fixture where the lengths differ.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: crates/blockchain/state_transition/tests/stf_spectests.rs
    Line: 227-258
    
    Comment:
    **Inconsistent iteration bounds between bitlist checks**
    
    `justified_slots` iterates `0..justified_slots.data.len()` (expected length), giving "check only the first N bits" semantics — extra bits in the actual state are silently ignored. `justifications_validators` iterates `0..post_state.justifications_validators.len()` (actual state length), giving exact-equality semantics — if the state has more or fewer bits than `justifications_validators.data`, the `Vec` length differs and the comparison always fails even when all fixture-specified bits match.
    
    Pick one consistent strategy: for partial/prefix checking use the expected data length throughout; for exact-equality checking use the state length throughout (or add a length assertion separately). The current mismatch means the two fields will behave differently under any fixture where the lengths differ.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/blockchain/state_transition/tests/stf_spectests.rs
Line: 227-258

Comment:
**Inconsistent iteration bounds between bitlist checks**

`justified_slots` iterates `0..justified_slots.data.len()` (expected length), giving "check only the first N bits" semantics — extra bits in the actual state are silently ignored. `justifications_validators` iterates `0..post_state.justifications_validators.len()` (actual state length), giving exact-equality semantics — if the state has more or fewer bits than `justifications_validators.data`, the `Vec` length differs and the comparison always fails even when all fixture-specified bits match.

Pick one consistent strategy: for partial/prefix checking use the expected data length throughout; for exact-equality checking use the state length throughout (or add a length assertion separately). The current mismatch means the two fields will behave differently under any fixture where the lengths differ.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "test: add missing checks to STF spec-tes..." | Re-trigger Greptile

Base automatically changed from compat/annotated-validators-quickstart-format to poseidon1-migration April 14, 2026 20:07
Base automatically changed from poseidon1-migration to devnet4-phase4-network April 14, 2026 20:07
Base automatically changed from devnet4-phase4-network to devnet4 April 14, 2026 20:08
@MegaRedHand MegaRedHand force-pushed the ssz-genesis-justification-spectests branch from 299f340 to 978fc84 Compare April 14, 2026 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant