fix(stdlib): prevent block hash deletion for blocks still referenced by tips #22450
AztecBot wants to merge 2 commits into
…by tips
handleChainFinalized was calling deleteBlockHashesBefore(event.block.number), which could delete hashes still referenced by other tips (proposedCheckpoint, proven, etc). This caused getBlockId to throw 'Block hash not found', putting the p2p block stream into an unrecoverable error loop. Now computes a safe deletion bound as the minimum of all active tip block numbers, ensuring no referenced block hash is deleted.
Automatically closing this stale claudebox draft PR (no updates for 5+ days). Re-open if still needed.
We hit this exact bug on our mainnet validator running v4.2.1. The node stopped attesting silently - the container kept running, archiver and HTTP API were healthy, but the sequencer entered an internal error loop at ~2 errors/second (Block hash not found for block number 46069). No crash, no obvious signal. Recovery: stop container, delete the p2p LMDB stores (p2p, p2p-archive, p2p-peers, p2p-attestation), restart. Node came back immediately. Any ETA on a release with this fix included?
Hi @guglez looking into this
relevant fix now in #23505
Hey @spalladino Is this log line enough?
Summary
Fixes a crash loop in the p2p block stream caused by `handleChainFinalized` in `L2TipsStoreBase` deleting block hashes that are still referenced by other tips.
Root Cause
`handleChainFinalized` called `deleteBlockHashesBefore(event.block.number)`, which could delete hashes for blocks still referenced by other tips (e.g. `proposedCheckpoint`, `proven`). When `getL2Tips()` subsequently tried to look up the hash for those tips via `getBlockId()`, it threw `Block hash not found for block number N`, putting the p2p block stream into an unrecoverable error loop that prevented block sync and caused the e2e_bot test to time out.
Fix
Compute a safe deletion bound as the minimum of all active tip block numbers before deleting. This ensures no block hash is deleted while it's still referenced by any tip.
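The bound computation can be illustrated with a minimal sketch. This is not the actual `L2TipsStoreBase` code: the class `TipsStoreSketch`, its in-memory maps, and the methods `setBlockHash`, `setTip`, and `safeDeletionBound` are hypothetical, and it assumes `deleteBlockHashesBefore(n)` removes hashes for blocks strictly below `n`.

```typescript
// Hypothetical, simplified model of a tips store (not the real L2TipsStoreBase).
type TipName = "proposedCheckpoint" | "proven" | "finalized";

class TipsStoreSketch {
  // Block number -> block hash, for blocks a tip may still reference.
  private hashes = new Map<number, string>();
  // Tip name -> block number the tip currently points at.
  private tips = new Map<TipName, number>();

  setBlockHash(blockNumber: number, hash: string): void {
    this.hashes.set(blockNumber, hash);
  }

  setTip(tip: TipName, blockNumber: number): void {
    this.tips.set(tip, blockNumber);
  }

  // Resolves a tip to its block id; throws if the hash has been pruned.
  getBlockId(tip: TipName): { number: number; hash: string } {
    const blockNumber = this.tips.get(tip);
    if (blockNumber === undefined) throw new Error(`Tip ${tip} not set`);
    const hash = this.hashes.get(blockNumber);
    if (hash === undefined) {
      throw new Error(`Block hash not found for block number ${blockNumber}`);
    }
    return { number: blockNumber, hash };
  }

  // Safe bound: the minimum of the finalized block and every active tip.
  safeDeletionBound(finalizedBlockNumber: number): number {
    return Math.min(finalizedBlockNumber, ...Array.from(this.tips.values()));
  }

  // Assumed semantics: delete hashes for blocks strictly below `bound`.
  deleteBlockHashesBefore(bound: number): void {
    for (const blockNumber of Array.from(this.hashes.keys())) {
      if (blockNumber < bound) this.hashes.delete(blockNumber);
    }
  }

  handleChainFinalized(finalizedBlockNumber: number): void {
    this.setTip("finalized", finalizedBlockNumber);
    // Prune up to the safe bound rather than up to the finalized block itself.
    this.deleteBlockHashesBefore(this.safeDeletionBound(finalizedBlockNumber));
  }
}

// Usage: the proven tip lags behind the finalized block.
const store = new TipsStoreSketch();
for (const n of [8, 9, 10, 11, 12]) store.setBlockHash(n, `0xhash${n}`);
store.setTip("proven", 9);
store.setTip("proposedCheckpoint", 11);
store.handleChainFinalized(12);
const provenId = store.getBlockId("proven"); // still resolves after pruning
```

In this sketch, pruning directly at block 12 would remove the hash for block 9 and make the `proven` lookup throw; pruning at the bound removes only block 8.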
Test plan
ClaudeBox log: https://claudebox.work/s/4aa87310c1ecd68c?run=1