Skip to content

Conversation

@spencer-tb
Copy link
Contributor

@spencer-tb spencer-tb commented Dec 11, 2025

🗒️ Description

Main Fix

This PR fixes the extract_config bug where fork was accessed from genesis instead of PreAllocGroup:
https://github.com/ethereum/execution-specs/actions/runs/20071991321/job/57576383289

Extra Changes

Adds automatic labeling to benchmark test pre-allocation groups to make it easy to identify the main group vs separated test groups. This consideration comes from the 6 pre-allocations groups now contained in benchmark releases, due to the seperate pre-alloc groups in test_blockhash:

@pytest.mark.repricing
@pytest.mark.parametrize(
    "index,chain_length",
    [
        pytest.param(0, 256, id="genesis"),
        pytest.param(1, 256, id="block_1"),
        pytest.param(256, 256, id="block_256"),
        pytest.param(257, 256, id="current_block"),
        pytest.param(None, 256, id="random"),
    ],
)
@pytest.mark.slow("Generates long chain")
@pytest.mark.pre_alloc_group("separate", reason="Generates long chain")
def test_blockhash(
...

When fill --generate-all-formats is run for benchmark tests, pre-alloc groups are now:

  • Labeled in JSON content with "label": "main" or "label": "single_test_<name>"
  • Named with the label in filenames: 0x398fa63f55cfbafc_main.json instead of just 0x398fa63f55cfbafc.json

This makes it trivial to find the main benchmark pre-alloc group (593 tests) versus the 5 separated test_blockhash variants (1 test each).

Before:

  fixtures/blockchain_tests_engine_x/pre_alloc/
  ├── 0x398fa63f55cfbafc.json  (which one is main?)
  ├── 0x2e3463cfd832e146.json
  ├── ...

After:

  fixtures/blockchain_tests_engine_x/pre_alloc/
  ├── 0x398fa63f55cfbafc_main.json  ← Easy to identify!
  ├── 0x2e3463cfd832e146_single_test_test_blockhash.json
  ├── ...

🔗 Related Issues or PRs

#1799

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

@spencer-tb spencer-tb added C-feat Category: an improvement or new feature A-test-benchmark Area: execution_testing.benchmark and tests/benchmark labels Dec 11, 2025
genesis: FixtureHeader
alloc: Alloc
chain_id: int = 1
fork: Fork
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract_config fix.

@codecov
Copy link

codecov bot commented Dec 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.87%. Comparing base (cc82819) to head (fe282fe).

Additional details and impacted files
@@             Coverage Diff              @@
##           forks/osaka    #1901   +/-   ##
============================================
  Coverage        83.87%   83.87%           
============================================
  Files              402      402           
  Lines            25101    25101           
  Branches          2285     2285           
============================================
  Hits             21053    21053           
  Misses            3609     3609           
  Partials           439      439           
Flag Coverage Δ
unittests 83.87% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@SamWilsn SamWilsn changed the base branch from forks/osaka to forks/amsterdam December 17, 2025 15:42
Copy link
Member

@marioevz marioevz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the label should have not been included in the same PR as this fix because it needs more thinking it through.

Plus it is technically a breaking change, because benchmark test runners will have to be updated to be prepared for these file name changes.

# Generate a label based on test count and test IDs
# Only label benchmark tests
label = None
if self.test_ids and "tests/benchmark/" in self.test_ids[0]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems too brittle, we could collect a test object that contains more metadata instead of trying to infer information that we already lost.

if "::" in test_id:
test_name = test_id.split("::")[-1].split("[")[0]
label = f"single_test_{test_name}"
elif len(self.test_ids) > 100:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hard-coded number heuristic is prone to error, as the number of tests grow, we will likely hit this at some point when we have more grouped tests.

Perhaps main should be equal to all tests that do not modify the env?

Also, if we plan to make benchmark fixtures for other forks, we are going to have a conflict, so perhaps we should include the name of the fork in the label too.

@marioevz marioevz mentioned this pull request Dec 22, 2025
8 tasks
@marioevz
Copy link
Member

We can remove the fix from this PR and merge #1947, and only leave this PR to discuss the labeling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-test-benchmark Area: execution_testing.benchmark and tests/benchmark C-feat Category: an improvement or new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants