
chore: adjusting resources for e2e test + adding strict_scheduling #22395

Open
mrzeszutko wants to merge 1 commit into merge-train/spartan from mr/e2e-test-resources

Conversation

Contributor

@mrzeszutko mrzeszutko commented Apr 8, 2026

Summary

  • Increase CPU/memory allocation for E2E tests that spin up multiple nodes, prover nodes, or validators
  • Enable strict (CPU-aware) scheduling for E2E test execution to prevent oversubscription

Problem

All E2E tests currently run with the default CPUS=2, MEM=8g, regardless of how many processes they start. A standard test (1 Anvil + 1 Aztec Node + 1 PXE = 3 processes) fits comfortably in 2 CPUs. But P2P tests spin up 6-10+ full validator nodes, epoch tests run multiple validators plus prover nodes, and fee tests always start a prover node — all squeezed into the same 2 CPUs.

Meanwhile, GNU parallel runs up to 64 jobs concurrently (num_cpus / 2) without awareness of per-test CPU needs. It happily runs 64 tests each claiming 2 CPUs even when some actually need 6-8. This causes CPU starvation, missed sequencer timing windows, and flaky timeout failures.

Changes

1. Per-test resource overrides (`get_test_resources`)

Added a function that maps test file paths to appropriate `CPUS`/`MEM` values based on the number of processes each test actually runs. The values were chosen by analyzing each test's source code to count processes at peak load:

| Category | Tests | CPUS | MEM | Processes at peak | Examples |
|---|---|---|---|---|---|
| Standard (1 node) | ~70 | 2 (default) | 8g | AN + AZ + PXE = 3 | `e2e_token_contract/*`, `e2e_deploy_contract/*` |
| Single node + prover | ~15 | 3 | 12g | AN + AZ + PN + PXE = 4 | `e2e_fees/*`, `e2e_simple`, `e2e_epochs/epochs_multiple` |
| Multi-validator (3-4 nodes) | ~4 | 4 | 16g | AN + 3-4 AZ + PXE = 5-6 | `e2e_epochs/epochs_simple_block_building`, `epochs_multi_proof` |
| P2P medium (4 validators, no prover) | ~10 | 4 | 16g | AN + BS + 4 AZ = 6 | `e2e_p2p/duplicate_proposal_slash`, `rediscovery` |
| Multi-validator + prover | ~12 | 6 | 24g | AN + 4-6 AZ + PN = 7-9 | `e2e_p2p/gossip_network`, `epochs_mbps*`, `reqresp/*` |
| Extremely heavy | 2 | 8 | 32g | 10-13 processes | `e2e_p2p/preferred_gossip_network`, `add_rollup` |
| Prover full fake | 1 | 3 | 12g | AN + AZ + PN + PXE = 4 | `e2e_prover/full` (non-CI_FULL) |
| Prover full real | 1 | 16 | 96g (unchanged) | | `e2e_prover/full` (CI_FULL) |

Process abbreviations: AN = Anvil, AZ = Aztec Node (sequencer + archiver + world-state + P2P + validator), PN = Prover Node, PXE = Private eXecution Environment, BS = Bootstrap Node.

Memory formula: MEM = CPUS * 4g (same ratio as the existing default of CPUS=2/MEM=8g), which provides headroom for each process's heap, native code, and world-state trees.
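As a rough sketch of the shape of such a mapping (the patterns and values below mirror a few rows of the table; the actual function in `bootstrap.sh` covers many more cases, and the test path used in the example is invented):

```shell
# Illustrative sketch only -- the real get_test_resources covers every
# category in the table above; these case patterns are a subset.
get_test_resources() {
  local test_path="$1"
  local cpus=2                                     # default: CPUS=2 / MEM=8g
  case "$test_path" in
    *e2e_p2p/preferred_gossip_network*|*add_rollup*) cpus=8 ;;  # extremely heavy
    *e2e_p2p/gossip_network*|*reqresp/*)             cpus=6 ;;  # validators + prover
    *e2e_p2p/*|*epochs_simple_block_building*)       cpus=4 ;;  # multi-validator
    *e2e_fees/*|*e2e_simple*)                        cpus=3 ;;  # single node + prover
  esac
  # MEM = CPUS * 4g, the same ratio as the CPUS=2/MEM=8g default.
  echo "CPUS=$cpus MEM=$((cpus * 4))g"
}

get_test_resources "src/e2e_fees/private_payments.test.ts"   # prints CPUS=3 MEM=12g
```

Note that the more specific `e2e_p2p/...` patterns must precede the catch-all `*e2e_p2p/*`, since `case` takes the first matching branch.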

2. Strict scheduling for standalone E2E test runs

Changed the `test` (and `test_and_collect_avm_inputs`) function in `yarn-project/end-to-end/bootstrap.sh` to use `STRICT_SCHEDULING=1`. This only affects standalone execution of E2E tests — i.e., when running `./bootstrap.sh test` directly (e.g., grind runs, local dev, or any workflow that invokes the `test` function). It does not affect normal CI runs, where the Makefile calls `test_cmds` and feeds commands to the test engine, which uses plain `parallelize` without strict scheduling.

The strict scheduler:

  • Tracks available CPU cores with a semaphore
  • Only starts a test when enough cores are free to satisfy its `CPUS` requirement
  • Pins each test to specific cores via `CPU_LIST` / `taskset`

This prevents oversubscription in standalone runs: with 128 cores and tests requesting 2-8 CPUs each, the scheduler naturally limits concurrency to ~30-40 tests instead of the previous 64, with each test getting the cores it actually needs.

This is the same scheduler already used by benchmarks (the `bench` function).
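The core bookkeeping can be sketched as follows (an illustrative simplification, not the actual implementation: `claim_cores`/`release_cores` are invented names, and the blocking semaphore wait that holds a test until enough cores are free is elided):

```bash
# Pool of unassigned core ids (a real run would seed this from nproc).
free_cores=(0 1 2 3 4 5 6 7)

# claim_cores N: take the first N free core ids out of the pool and set
# CPU_LIST to a comma-separated list suitable for `taskset -c`.
claim_cores() {
  local n="$1"
  local take=("${free_cores[@]:0:n}")
  CPU_LIST=$(IFS=,; echo "${take[*]}")
  free_cores=("${free_cores[@]:n}")
}

# release_cores LIST: return a claimed CPU_LIST to the pool when the
# test finishes.
release_cores() {
  local IFS=,
  free_cores+=($1)
}

claim_cores 3                                  # a 3-CPU test claims cores...
echo "taskset -c $CPU_LIST yarn test ..."      # ...and is pinned to them
release_cores "$CPU_LIST"                      # cores go back when it exits
```

With this bookkeeping, a 128-core box running tests that each claim 2-8 cores caps out at the ~30-40 concurrent tests mentioned above, rather than a fixed 64 jobs.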

@mrzeszutko mrzeszutko added the ci-full Run all master checks. label Apr 8, 2026
