
Add Stage 1 checkpoint reuse boundary #1076

Draft
anth-volk wants to merge 1 commit into main from agent/stage-1/pr-4-rerun-reuse-checkpoints

anth-volk (Collaborator) commented May 20, 2026

Fixes #1074

Summary

  • Add checkpoint store, persisted reuse manifest, and rerun planner abstractions for Stage 1 dataset build substeps.
  • Require a semantic identity manifest match before physical checkpoint files can be reused, while preserving the existing checkpoint file layout.
  • Key persisted reuse identities by execution-unit identity_key while keeping substep_id as the public status grouping, so multiple command/script units inside one Stage 1 substep do not overwrite each other.
  • Record checkpoint/reuse decisions and reuse reasoning in substep results, status events, and output contract metadata.
  • Add Stage 1 AI-facing docs for reuse identity granularity, conditional running, checkpoint metadata, and documentation-sync expectations.
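
The reuse rule described in the summary can be illustrated with a minimal sketch. Only `identity_key` and `substep_id` come from the PR description; `ExecutionUnit`, `ReuseManifest`, `plan_reuse`, the `fingerprint` field, and the example key strings are hypothetical names chosen for illustration and are not the API added in `rerun.py`. The sketch assumes a manifest that maps each execution unit's `identity_key` to a recorded semantic fingerprint, and a planner that reuses a checkpoint file only when the manifest entry matches.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExecutionUnit:
    substep_id: str    # public status grouping (named in the PR description)
    identity_key: str  # per-unit reuse key (named in the PR description)
    fingerprint: str   # hypothetical: hash of the unit's semantic inputs


@dataclass
class ReuseManifest:
    # Hypothetical persisted manifest: identity_key -> recorded fingerprint.
    entries: dict[str, str] = field(default_factory=dict)

    def record(self, unit: ExecutionUnit) -> None:
        self.entries[unit.identity_key] = unit.fingerprint


def plan_reuse(
    unit: ExecutionUnit, manifest: ReuseManifest, checkpoint_exists: bool
) -> tuple[str, str]:
    """Return (decision, reason); a file on disk alone never justifies reuse."""
    recorded = manifest.entries.get(unit.identity_key)
    if recorded is None:
        return ("run", "no manifest entry for identity_key")
    if recorded != unit.fingerprint:
        return ("run", "semantic identity changed")
    if not checkpoint_exists:
        return ("run", "manifest matches but checkpoint file is missing")
    return ("reuse", "manifest matches and checkpoint file exists")


# Two units inside the same substep keep separate manifest entries.
manifest = ReuseManifest()
unit_a = ExecutionUnit("substep_x", "substep_x:script_a", "abc123")
unit_b = ExecutionUnit("substep_x", "substep_x:script_b", "def456")
manifest.record(unit_a)
manifest.record(unit_b)

decision_a = plan_reuse(unit_a, manifest, checkpoint_exists=True)
changed_b = ExecutionUnit("substep_x", "substep_x:script_b", "changed")
decision_b = plan_reuse(changed_b, manifest, checkpoint_exists=True)
```

Keying the manifest by `substep_id` instead would let `unit_b` overwrite `unit_a`'s entry, which is the collision the third summary bullet is meant to avoid. The returned reason string stands in for the reuse reasoning that the PR records in substep results and status events.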

Validation

  • ruff check policyengine_us_data/build_datasets/rerun.py modal_app/data_build.py tests/unit/test_build_dataset_rerun.py tests/unit/test_build_dataset_checkpoints.py tests/unit/test_modal_data_build.py
  • ruff format --check policyengine_us_data/build_datasets/rerun.py modal_app/data_build.py tests/unit/test_build_dataset_rerun.py tests/unit/test_build_dataset_checkpoints.py tests/unit/test_modal_data_build.py
  • uv run --no-sync pytest tests/unit/test_build_dataset_checkpoints.py tests/unit/test_build_dataset_rerun.py tests/unit/test_dataset_build_stage_contract.py tests/unit/test_modal_data_build.py tests/unit/test_pipeline_doc_guards.py tests/unit/test_pipeline_docs_extractor.py
  • uv run --no-sync --with pyyaml python scripts/run_quality_guards.py
  • uv run --no-sync --with pyyaml --with pytest pytest tests/unit/test_pipeline_docs_extractor.py tests/unit/test_pipeline_doc_guards.py

anth-volk force-pushed the agent/stage-1/pr-3-command-substep-status branch 5 times, most recently from 454cf30 to 056c46f, May 21, 2026 19:33
Base automatically changed from agent/stage-1/pr-3-command-substep-status to main May 21, 2026 21:24
anth-volk force-pushed the agent/stage-1/pr-4-rerun-reuse-checkpoints branch 4 times, most recently from 1fc2c99 to 30c1eb8, May 22, 2026 14:58
anth-volk force-pushed the agent/stage-1/pr-4-rerun-reuse-checkpoints branch from 30c1eb8 to 95a15b9, May 22, 2026 16:19

Development

Successfully merging this pull request may close these issues.

Stage 1 rerun reuse and checkpoint adapter boundary

1 participant