go/oasis-node/cmd/storage: Add create and import checkpoint cmd#6454
martintomazic wants to merge 5 commits into master
Works! :) The only impractical parts are finding the runtime rounds that correspond to a given consensus height, and the fact that bootstrap "eats" one height as described. Finally, one should be very careful with the creation/import heights and rounds, so that all relevant light history is available for the runtime checkpoints being imported.
Creating checkpoints from the penultimate snapshot is dominated by the Sapphire checkpoint creation. With 6 chunker threads, the current projection is 5-7 hours (will update). Import is a matter of minutes.
Added unit and e2e tests, fixed an empty-state corner case, and improved code quality. Two minor things left to discuss:
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #6454 +/- ##
==========================================
- Coverage 64.73% 64.56% -0.18%
==========================================
Files 699 700 +1
Lines 68246 68581 +335
==========================================
+ Hits 44179 44279 +100
- Misses 19060 19183 +123
- Partials 5007 5119 +112
peternose left a comment:
When I import a consensus checkpoint, I get a few lines of the following error. Afterwards, blocks execute normally.
{"caller":"grpc.go:194","err":"failed to get consensus status: failed to fetch current block: cometbft: block query failed: height 28800866 must be less than or equal to the current blockchain height 0","level":"error","method":"/oasis-core.NodeController/GetStatus","module":"grpc/internal","msg":"request failed","req_seq":15,"ts":"2026-02-24T13:00:28.934344662Z"}
Nice catch. Yes, this is also how CometBFT checkpoint import works, but I found a fixup regardless :). What I find more annoying is that you technically cannot import a checkpoint for the latest height, so adding extra validation and documenting this in the command would probably be better than an unexpected error.
Ready for a second review. As you spotted I am "abusing" However, this command is technically not For this reason I have created my helpers (stateless), so that they can also be easily refactored, possibly moved to Let's align on the user facing API and sanity checking the inputs:
Will have a look. Merge after we release 26.0.
		return fmt.Errorf("failed to stop target compute worker: %w", err)
	}

	// Reset the target node's state completely. Ideally we would use NoAutoStart,
Maybe we should start the network for 10 blocks, stop the target node, wait for 20 blocks more, stop the source node, do snapshots, start the target node, and import them afterwards.
	rtState, err := srcCtrl.Roothash.GetRuntimeState(ctx, &roothash.RuntimeRequest{
		RuntimeID: KeyValueRuntimeID,
		Height:    candidateHeight,
	})
	if err != nil {
		return fmt.Errorf("failed to get runtime state for height %d: %w", candidateHeight, err)
	}

	// Pick runtime state's LastBlockHeight as the consensus checkpoint height else
	// runtime light history indexer might miss authoritative light block for the
	// corresponding runtime round.
	cpRound := rtState.LastBlock.Header.Round
	cpHeight := rtState.LastBlockHeight
I believe we have the same issue with our checkpoint sync.
Consensus might create a checkpoint, and we would trigger checkpoint creation for the corresponding round for the configured runtimes. The problem is that if this round was not created at the given height, but earlier, the indexer would skip it:
oasis-core/go/runtime/history/indexer.go
Line 357 in 59b8f32
Which would cause the corresponding runtime checkpoint sync to fail due to a missing authoritative light header. Out of scope, will open an issue.
)

// CheckpointCreateImport is the checkpoint create/import e2e scenario.
var CheckpointCreateImport scenario.Scenario = newCheckpointCreateImportImpl()
There was one (flaky?) outcome in the CI that I am trying to reproduce locally (might have seen it once locally before).
https://buildkite.com/oasisprotocol/oasis-core-ci/builds/16664#019d43f3-6e51-4ddc-b8a4-a0ab5da22719
{"caller":"worker.go:1155","err":"storage/database: failed to Apply: mkvs: node not found in node db","level":"error","module":"worker/storage/committee","msg":"can't apply write log","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:20:47.085392644Z"}
The imported checkpoint was corrupted, as the storage committee worker was unable to apply the next storage diff. It might also be a committee worker issue:
Storage committee logs from the CI
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{state-root}","round":11,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:31.02065292Z"}
{"caller":"worker.go:1175","level":"debug","module":"worker/storage/committee","msg":"finished syncing round","round":11,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:31.02067514Z"}
{"caller":"worker.go:439","level":"debug","module":"worker/storage/committee","msg":"storage round finalized","round":11,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:31.020769116Z"}
{"caller":"worker.go:1204","last_finalized":11,"last_synced":11,"level":"debug","module":"worker/storage/committee","msg":"incoming block","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:33.737682523Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:33.73770249Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:33.737737249Z"}
{"caller":"worker.go:1175","level":"debug","module":"worker/storage/committee","msg":"finished syncing round","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:33.737816314Z"}
{"caller":"worker.go:439","level":"debug","module":"worker/storage/committee","msg":"storage round finalized","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:33.737930232Z"}
{"caller":"worker.go:1204","last_finalized":12,"last_synced":12,"level":"debug","module":"worker/storage/committee","msg":"incoming block","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:35.059190353Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:35.059224542Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{state-root}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:35.059398047Z"}
{"caller":"worker.go:1175","level":"debug","module":"worker/storage/committee","msg":"finished syncing round","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:35.05943708Z"}
{"caller":"worker.go:439","level":"debug","module":"worker/storage/committee","msg":"storage round finalized","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:35.059690354Z"}
{"caller":"worker.go:1204","last_finalized":13,"last_synced":13,"level":"debug","module":"worker/storage/committee","msg":"incoming block","round":14,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:36.462700172Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":14,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:36.462737431Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{state-root}","round":14,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:36.462816046Z"}
{"caller":"worker.go:1175","level":"debug","module":"worker/storage/committee","msg":"finished syncing round","round":14,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:36.462842958Z"}
{"caller":"worker.go:439","level":"debug","module":"worker/storage/committee","msg":"storage round finalized","round":14,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:36.463146676Z"}
{"caller":"worker.go:1204","last_finalized":14,"last_synced":14,"level":"debug","module":"worker/storage/committee","msg":"incoming block","round":15,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:37.789492905Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":15,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:37.789509283Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{state-root}","round":15,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:37.789540837Z"}
{"caller":"worker.go:1175","level":"debug","module":"worker/storage/committee","msg":"finished syncing round","round":15,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:37.789551362Z"}
{"caller":"worker.go:439","level":"debug","module":"worker/storage/committee","msg":"storage round finalized","round":15,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:37.78963392Z"}
{"caller":"worker.go:789","err":"context canceled","level":"error","module":"worker/storage/committee","msg":"checkpointer stopped","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:38.169306642Z"}
{"caller":"worker.go:1318","level":"info","module":"worker/storage/committee","msg":"stopped","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:38.169313829Z"}
{"caller":"worker.go:768","level":"info","module":"worker/storage/committee","msg":"starting","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.835974673Z"}
{"caller":"worker.go:948","genesis_round":0,"last_synced":11,"level":"info","module":"worker/storage/committee","msg":"worker initialized","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.836554655Z"}
{"caller":"worker.go:1021","level":"info","module":"worker/storage/committee","msg":"initialized","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.83663171Z"}
{"caller":"worker.go:1204","last_finalized":11,"last_synced":11,"level":"debug","module":"worker/storage/committee","msg":"incoming block","round":16,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.836767167Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.83698815Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837095354Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837161045Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":14,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837259592Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":15,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837338432Z"}
{"awaiting_retry":"outstanding_mask{state-root}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":16,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837388739Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837444513Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=14 type=io-root hash=7fa1c8d40fdd82e2c0af8ffd3009890a4d5cc1109e1b36789b7fd283a95bf07e>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=14 type=io-root hash=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837508641Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=14 type=state-root hash=9314bf6ac6112131839cae80d8452dfaabcb0e408413b402e6573eb53eed3333>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.837837373Z"}
{"caller":"worker.go:1175","level":"debug","module":"worker/storage/committee","msg":"finished syncing round","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.838913637Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.838980235Z"}
{"caller":"worker.go:1155","err":"storage/database: failed to Apply: mkvs: node not found in node db","level":"error","module":"worker/storage/committee","msg":"can't apply write log","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.8390787Z"}
{"caller":"worker.go:439","level":"debug","module":"worker/storage/committee","msg":"storage round finalized","round":12,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839273082Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839260743Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=15 type=io-root hash=a15d4949610bf4c365b0a75368b6e79bf751c57067fd456ed34f1a5693e00ee6>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=15 type=io-root hash=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839331111Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=15 type=state-root hash=b9f885bdbbeb1504c9b0e46d186ed90af8fed7f88b0c8772058a483ee0efd9b0>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=14 type=state-root hash=9314bf6ac6112131839cae80d8452dfaabcb0e408413b402e6573eb53eed3333>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839631516Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839724966Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839799029Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.839847976Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.840478217Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.840622243Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.841242907Z"}
{"caller":"worker.go:1155","err":"storage/database: failed to Apply: mkvs: node not found in node db","level":"error","module":"worker/storage/committee","msg":"can't apply write log","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.841528927Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.841592563Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.841656563Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.842566191Z"}
{"caller":"worker.go:1155","err":"storage/database: failed to Apply: mkvs: node not found in node db","level":"error","module":"worker/storage/committee","msg":"can't apply write log","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:06:55.842681114Z"}
{"caller":"worker.go:1261","in_flight_rounds":4,"level":"debug","module":"worker/storage/committee","msg":"heartbeat","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:07:03.261447066Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:07:03.261550558Z"}
{"caller":"worker.go:401","level":"debug","module":"worker/storage/committee","msg":"calling GetDiff","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:07:03.26165467Z"}
{"awaiting_retry":"outstanding_mask{}","caller":"worker.go:1074","level":"debug","module":"worker/storage/committee","msg":"preparing round sync","outstanding_mask":"outstanding_mask{}","round":13,"runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:07:03.263252444Z"}
{"caller":"worker.go:1155","err":"storage/database: failed to Apply: mkvs: node not found in node db","level":"error","module":"worker/storage/committee","msg":"can't apply write log","new_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=13 type=state-root hash=4848fde3109b8a49e559a3a10040e25af7c5d79f8eeeecbd55d30738f3baec55>","old_root":"<Root ns=8000000000000000000000000000000000000000000000000000000000000000 version=12 type=state-root hash=963bc9c0eaf31ad37a8c9a25146d047fdef7d468b89f19a301ab28e76f84ee65>","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:07:03.26339852Z"}
{"caller":"worker.go:1261","in_flight_rounds":4,"level":"debug","module":"worker/storage/committee","msg":"heartbeat","runtime_id":"8000000000000000000000000000000000000000000000000000000000000000","ts":"2026-03-31T13:07:12.7026433Z"}
^^ The worker synced rounds 0-15, then the state was reset. After that, round 11 is imported; the round 12 root does not change, hence there is no need to fetch it, but we still see the round finalized. Finally, round 13 fails.
All other local/CI runs have no such issues whatsoever, including testing this command on the real mainnet data.
Submitting more runtime transactions once the checkpoint was created, so that restarted node has to catch-up with more (and new) state, did not trigger this outcome either.
Ideally, the test should be hardened by making sure the target node also syncs up to the tip of the runtime chain, and not just consensus.
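To make the pattern in the storage committee logs above easier to spot, the JSON lines can be summarized with a small script. The following is a sketch that relies only on the `level`, `msg`, and `round` fields visible in the logs; `summarize` and the `summary` type are names made up for this example.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds the subset of fields used by this sketch.
type logEntry struct {
	Level string  `json:"level"`
	Msg   string  `json:"msg"`
	Round *uint64 `json:"round"`
}

// summary collects finalized rounds and the number of failed write log applies.
type summary struct {
	Finalized     []uint64
	ApplyFailures int
}

// summarize scans newline-delimited JSON log entries and records every
// "storage round finalized" round and every "can't apply write log" error.
func summarize(logs string) (summary, error) {
	var s summary
	sc := bufio.NewScanner(strings.NewReader(logs))
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long log lines
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var e logEntry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			return s, fmt.Errorf("bad log line: %w", err)
		}
		switch {
		case e.Msg == "storage round finalized" && e.Round != nil:
			s.Finalized = append(s.Finalized, *e.Round)
		case e.Level == "error" && e.Msg == "can't apply write log":
			s.ApplyFailures++
		}
	}
	return s, sc.Err()
}

func main() {
	// Abbreviated entries modeled on the logs above.
	logs := `{"level":"debug","msg":"storage round finalized","round":12}
{"level":"error","msg":"can't apply write log"}`
	s, err := summarize(logs)
	if err != nil {
		panic(err)
	}
	fmt.Printf("finalized rounds: %v, apply failures: %d\n", s.Finalized, s.ApplyFailures)
}
```

Run against the full log, a summary like this would show the finalized rounds restarting at 12 after the reset while the apply failures accumulate.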
Previously, the compute node might not have synced the latest round, so checkpoint creation might fail. This is fixed by querying the compute node explicitly.
Closes #6423