feat(test-benchmark): add worst-case depth attack benchmarks for Ethereum state tries #1937

CPerezz · 2025-12-17T15:41:41Z

This PR refactors #1936 so that no prior deployment is required, which lets us compare the two approaches and discuss which one is better.

Summary

Add benchmark tests for worst-case depth attacks on Ethereum state and account tries
Implement AttackOrchestrator contract for efficient batched SSTORE attacks via CREATE2 address derivation
Add Verifier contract for post-attack storage verification
Include pre-mined CREATE2 data for storage depths 9-10 and account depths 3-5

Description

This PR introduces benchmark tests that measure the worst-case performance impact when modifying deep storage slots across thousands of contracts. The attack pattern:

Pre-deployed contracts with deep storage tries maximize trie traversal costs
CREATE2-based addressing enables deterministic contract addresses without on-chain deployment during test execution
Optimized batched attacks using AttackOrchestrator execute up to 1,980 attacks per transaction at a 16M gas limit
Verifier contract confirms attack success by checking the deepest storage slot values

Key implementation details:

Gas calculations empirically measured: ~8,050 gas per attack (2,750 overhead + 5,300 forwarded)
Uses Nick's deterministic deployer (0x4e59...956c) for CREATE2 deployments
Supports configurable storage depth (9-10) and account depth (3-5) via pre-mined JSON files
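
The CREATE2 derivation mentioned above can be sketched as follows. This is a minimal illustration, not the PR's code: it follows the EIP-1014 formula `keccak256(0xff ++ deployer ++ salt ++ init_code_hash)[12:]`. Python's `hashlib` offers SHA3-256 rather than Ethereum's Keccak-256 (the padding differs), so the sketch carries a small pure-Python Keccak-256. The helper names `keccak256` and `create2_address` are assumptions, and the all-zero deployer and salt come from the EIP-1014 test vector rather than from the mined assets.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state: bytearray) -> bytearray:
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [
        [int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little") for y in range(5)]
        for x in range(5)
    ]
    lfsr = 1
    for _ in range(24):
        # Theta: mix column parities into every lane.
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # Rho and pi: rotate lanes and move them to new positions.
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # Chi: non-linear step along each row.
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # Iota: add the round constant, generated by an 8-bit LFSR.
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    """Ethereum-style Keccak-256 (rate 136 bytes, padding byte 0x01)."""
    rate = 136
    state = bytearray(200)
    offset = 0
    block_size = 0
    while offset < len(data):
        block_size = min(len(data) - offset, rate)
        for i in range(block_size):
            state[i] ^= data[offset + i]
        offset += block_size
        if block_size == rate:
            state = _keccak_f1600(state)
            block_size = 0
    state[block_size] ^= 0x01
    state[rate - 1] ^= 0x80
    return bytes(_keccak_f1600(state)[:32])


def create2_address(deployer: bytes, salt: bytes, init_code_hash: bytes) -> bytes:
    """Derive a CREATE2 address per EIP-1014 from a precomputed init code hash."""
    if len(deployer) != 20 or len(salt) != 32 or len(init_code_hash) != 32:
        raise ValueError("expected a 20-byte deployer, 32-byte salt and 32-byte hash")
    return keccak256(b"\xff" + deployer + salt + init_code_hash)[12:]


# EIP-1014 example inputs (hypothetical for this benchmark): zero deployer,
# zero salt, and init code consisting of a single 0x00 byte.
example = create2_address(bytes(20), bytes(32), keccak256(b"\x00"))
print("0x" + example.hex())
```

In the benchmark itself the deployer would be Nick's deterministic deployer and the salt and `init_code_hash` would be read from the pre-mined JSON; those values are not reproduced here.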

Deployment setup and instructions available at:
https://gist.github.com/CPerezz/44d521c0f9e6adf7d84187a4f2c11978

Test Plan

Run `uv run execute remote` with `--gas-benchmark-values 60` and verify that 7,104 contracts were attacked successfully
Verify storage values changed from 0x01 to 0x2a (42) via Verifier contract
Run on production-grade client to measure actual performance impact

cc: @LouisTsai-Csie @marioevz

…um#1842) Co-authored-by: spencer <spencer.tb@ethereum.org>

This PR introduces comprehensive benchmarks to test Ethereum clients under worst-case scenarios involving extremely deep state and account tries.

The attack scenario:
- Pre-deployed contracts with deep storage tries (depth=9) maximizing traversal costs
- CREATE2-based deterministic addressing for reproducible benchmarks
- AttackOrchestrator contract that batches up to 2,510 attacks per transaction
- Tests measure state root recomputation impact when modifying deep slots

Key components:
- depth_9.sol, depth_10.sol: Contracts with deep storage tries
- s9_acc3.json: Pre-computed CREATE2 addresses and auxiliary accounts (15k contracts)
- AttackOrchestrator.sol: Optimized attack coordinator (3,650 gas per attack)
- deep_branch_testing.py: EEST test harness for pre-deployed contracts
- README.md: Complete documentation and setup instructions

Performance optimizations:
- Reduced gas forwarding from 50k to 3,650 per attack (8.3x throughput increase)
- MAX_ATTACKS_PER_TX increased from 303 to 2,510
- Precise EVM opcode cost analysis with safety margins
- Read init_code_hash directly from JSON instead of recompiling

Deployment setup and instructions available at: https://gist.github.com/CPerezz/44d521c0f9e6adf7d84187a4f2c11978

This benchmark helps identify performance bottlenecks in state trie handling and validates client implementations under extreme depth conditions.

- Implements new `storage-trie-brancher` scenario that replicates Python `deploy_deep_branches.py` functionality (See: ethereum/execution-specs#1937)
- Enables deployment of contracts designed to create worst-case storage trie depth scenarios (See https://github.com/CPerezz/worst_case_miner/settings)
- Automatically handles Nick's factory deployment if not present on network.

The attack() call was forwarding only 3650 gas, which is insufficient for SSTORE operations on cold storage slots. SSTORE requires:
- 2100 gas for cold slot access
- 2900 gas for a nonzero-to-nonzero write (the attack overwrites 0x01 with 0x2a; a zero-to-nonzero write would cost 20000)
- Plus dispatch overhead (~200 gas)

Updated to forward 5300 gas to ensure SSTORE succeeds.
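
The arithmetic behind the forwarded-gas fix can be checked with a short sketch. The two SSTORE constants are the EIP-2929 values; the 2,900 write cost is the cost of overwriting an existing nonzero value (5,000 minus the 2,100 cold access), which fits the 0x01 to 0x2a change described in the test plan. The ~200 gas dispatch overhead is taken from the commit message, and the function names are hypothetical.

```python
# EIP-2929 gas constants (in effect since the Berlin fork).
COLD_SLOAD_COST = 2_100
SSTORE_RESET_GAS = 2_900  # 5_000 - COLD_SLOAD_COST: overwrite of a nonzero slot

# Assumption: approximate dispatch overhead quoted in the commit message.
DISPATCH_OVERHEAD = 200


def min_gas_for_cold_overwrite() -> int:
    """Lower bound on gas the callee needs for one cold nonzero-to-nonzero SSTORE."""
    return COLD_SLOAD_COST + SSTORE_RESET_GAS + DISPATCH_OVERHEAD


def is_forwarding_sufficient(forwarded_gas: int) -> bool:
    """Check whether the gas forwarded to attack() covers the estimated minimum."""
    return forwarded_gas >= min_gas_for_cold_overwrite()


for forwarded in (3_650, 5_300):
    print(forwarded, is_forwarding_sufficient(forwarded))
```

Under these assumptions the old value of 3,650 falls short of the 5,200 gas estimate, while 5,300 leaves a 100 gas margin.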

Adds a minimal Verifier contract that checks if a target contract's deepest storage slot was updated to the expected attack value. This enables the test to verify attack success without expensive post-state checks on all attacked contracts. The verify() function calls getDeepest() on the target and compares the returned value against the expected attack value.

… gas

Major refactor of the depth benchmark test for execute mode:
- Remove stubs dependency; derive contract addresses directly from init_code_hash + Nick's deployer using CREATE2 formula
- Deploy AttackOrchestrator and Verifier as part of test execution
- Dynamically compute NUM_CONTRACTS based on gas_benchmark_value
- Add verification transaction at end of block to confirm attack success
- Fix gas constants based on empirical measurements:
  - GAS_PER_ATTACK: 8014 -> 8050 (measured ~8042)
  - MAX_ATTACKS_PER_TX: 1990 -> 1980 (safety margin)
  - TX_OVERHEAD: 22900 -> 22600 (more accurate)

The previous gas constants caused all attack transactions to run out of gas, as the 28 gas/attack shortfall compounded over 1990 attacks to ~55k gas deficit.
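
The gas figures in this commit message can be reproduced with a few lines of arithmetic. This is a sketch using only the numbers quoted in the PR (the 16M per-transaction gas limit, the old and new constants, and the measured ~8042 gas per attack); the function names are hypothetical and do not come from the test harness.

```python
TX_GAS_LIMIT = 16_000_000  # per-transaction gas limit quoted in the PR description

# Constants before and after the fix, as listed in the commit message.
OLD = {"gas_per_attack": 8_014, "max_attacks": 1_990, "tx_overhead": 22_900}
NEW = {"gas_per_attack": 8_050, "max_attacks": 1_980, "tx_overhead": 22_600}
MEASURED_GAS_PER_ATTACK = 8_042  # approximate empirical value from the commit message


def tx_gas_needed(attacks: int, gas_per_attack: int, tx_overhead: int) -> int:
    """Total gas for one batched attack transaction."""
    return tx_overhead + attacks * gas_per_attack


def shortfall(attacks: int, budgeted: int, measured: int) -> int:
    """Gas deficit when each attack costs more than was budgeted."""
    return attacks * (measured - budgeted)


def max_attacks_that_fit(gas_limit: int, gas_per_attack: int, tx_overhead: int) -> int:
    """Largest batch size whose total gas stays within the limit."""
    return (gas_limit - tx_overhead) // gas_per_attack


old_deficit = shortfall(OLD["max_attacks"], OLD["gas_per_attack"], MEASURED_GAS_PER_ATTACK)
new_total = tx_gas_needed(NEW["max_attacks"], NEW["gas_per_attack"], NEW["tx_overhead"])
new_ceiling = max_attacks_that_fit(TX_GAS_LIMIT, NEW["gas_per_attack"], NEW["tx_overhead"])

print(old_deficit)  # 55720, the "~55k gas deficit"
print(new_total)    # 15961600, below the 16M limit
print(new_ceiling)  # 1984, so 1980 leaves a small safety margin
```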

- Embed AttackOrchestrator and Verifier bytecode directly in Python
- Add download_mined_asset() to fetch JSON/SOL files from GitHub
- Cache downloaded files locally in .cache/ directory
- Remove local .sol and .json asset files (now downloaded on demand)
- Update test parameters to use (10, 6) available from GitHub
- Add gist reference for contract sources

Contract sources: https://gist.github.com/CPerezz/8686da933fa5c045fbdf7c31e20e6c71
Mined assets: https://github.com/CPerezz/worst_case_miner/tree/master/mined_assets
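
A download-and-cache helper of the kind described in this commit might look like the sketch below. It is not the PR's implementation: only the function name `download_mined_asset` and the `.cache/` directory come from the commit message. The signature, the injectable `fetch` parameter, and the raw-content base URL are assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen

# Assumption: raw-content URL guessed from the mined_assets tree link; not confirmed by the PR.
ASSUMED_BASE_URL = (
    "https://raw.githubusercontent.com/CPerezz/worst_case_miner/master/mined_assets"
)


def http_get(url: str) -> bytes:
    """Fetch a URL and return the response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_mined_asset(
    filename: str,
    cache_dir: Path = Path(".cache"),
    base_url: str = ASSUMED_BASE_URL,
    fetch: Callable[[str], bytes] = http_get,  # hypothetical hook, useful for tests
) -> Path:
    """Return a local path for `filename`, downloading it only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / filename
    if not cached.exists():
        cached.write_bytes(fetch(f"{base_url}/{filename}"))
    return cached
```

A call such as `download_mined_asset("s9_acc3.json")` would hit the network once and reuse the cached copy afterwards. Making `fetch` injectable keeps the caching logic testable without network access.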

CPerezz · 2026-01-02T10:01:50Z

Note that 2fcf12b adds download_mined_asset(), which fetches the JSON/SOL files from GitHub. Instead of uploading all the artifacts here, the scenario downloads them on demand.

codecov · 2026-01-02T10:58:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.33%. Comparing base (d1e7e6b) to head (11c817c).
⚠️ Report is 30 commits behind head on forks/amsterdam.

Additional details and impacted files

@@                 Coverage Diff                 @@
##           forks/amsterdam    #1937      +/-   ##
===================================================
+ Coverage            83.87%   86.33%   +2.46%     
===================================================
  Files                  402      538     +136     
  Lines                25101    34557    +9456     
  Branches              2285     3222     +937     
===================================================
+ Hits                 21053    29835    +8782     
- Misses                3609     4148     +539     
- Partials               439      574     +135

Flag	Coverage Δ
unittests	`86.33% <ø> (+2.46%)`	⬆️

- Remove unused ATTACK_SELECTOR constant
- Extract magic numbers to named constants (gas limits, fees, etc.)
- Add zero contracts validation to prevent edge case bugs
- Fix unused fork parameter (rename to _fork)
- Replace print warning with warnings.warn
- Fix docstring math discrepancy (~2,742 not 2,750)
- Fix line length issues and add proper type annotations

danceratopz and others added 2 commits December 10, 2025 13:23

docs(testing-cli-consume): fix hive simulator names post-weld (ethere…

cc82819

…um#1842) Co-authored-by: spencer <spencer.tb@ethereum.org>

SamWilsn mentioned this pull request Dec 17, 2025

feat: add depth-based worst-case attack benchmarks for execute mode #1936

Closed

CPerezz marked this pull request as ready for review December 23, 2025 09:19

CPerezz mentioned this pull request Dec 23, 2025

feat: add storage-trie-brancher scenario for deep storage trie attacks ethpandaops/spamoor#159

Open

LouisTsai-Csie mentioned this pull request Dec 29, 2025

Gas Lighting Committee #10, Dec 23, 2025 ethpandaops/gas-lighting-tracker#27

Open

CPerezz added 3 commits January 1, 2026 23:12

CPerezz force-pushed the feat/depth-bench-without-deploys branch from 4bae2a0 to 5b17141 Compare January 1, 2026 22:20

CPerezz changed the title ~~Feat/depth bench without deploys~~ feat(test-benchmark): add worst-case depth attack benchmarks for Ethereum state tries Jan 1, 2026

CPerezz added 2 commits January 2, 2026 11:12

style: run ruff format on deep_branch_testing.py

61a0d75

fix: add mypy type annotations for deep_branch_testing.py

2ef6b13
