forked from kmccleary3301/nested_learning
Open
Labels
- enhancement: New feature or request
- execution-board: Execution board ticket set for paper alignment
- quality-gate: Has explicit acceptance criteria and test gates
- runpod: RunPod infra and training execution tasks
Description
Purpose
Make RunPod execution reproducible, interruption-safe, and auditable for this repo.
Mandatory Reading (blocking)
First comment must summarize:
- reports/NL_IMPLEMENTATION_ORACLE.md section 6.3.3 and 6.3.4
- docs/FSDP_SCALING_GUIDE.md
- docs/release_checklist.md
- docs/env_matrix.md
Also include links reviewed from RunPod docs in the first comment.
Required Code Anchors
- scripts/compute/
- training entrypoints (train.py, train_fsdp.py, train_deepspeed.py)
- docs under docs/
Scope
- Add concrete RunPod playbook:
  - pod create settings
  - persistent storage conventions
  - SSH + transfer commands
  - checkpoint frequency guidance for spot/on-demand
  - forced stop/resume drill
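
As a starting point for the checkpoint frequency guidance, here is a minimal sketch that uses Young's approximation, a general rule of thumb rather than anything this repo or RunPod prescribes. The function names and the example numbers are hypothetical. The save cost and the mean time between interruptions are assumptions that would need to be measured on the actual pod type.

```python
import math


def suggest_checkpoint_interval(save_seconds: float,
                                mean_seconds_between_interruptions: float) -> float:
    """Young's approximation for the wall-clock time between checkpoints.

    interval ~= sqrt(2 * checkpoint save cost * mean time between interruptions)
    Both inputs are measured or estimated by the operator (assumption).
    """
    if save_seconds <= 0 or mean_seconds_between_interruptions <= 0:
        raise ValueError("both inputs must be positive")
    return math.sqrt(2.0 * save_seconds * mean_seconds_between_interruptions)


def interval_in_steps(interval_seconds: float, seconds_per_step: float) -> int:
    """Convert a wall-clock interval into whole training steps (at least 1)."""
    if seconds_per_step <= 0:
        raise ValueError("seconds_per_step must be positive")
    return max(1, int(interval_seconds // seconds_per_step))


if __name__ == "__main__":
    # Hypothetical numbers: a 30 s save, one interruption every 2 h on spot,
    # and 1.5 s per training step.
    spot_interval = suggest_checkpoint_interval(30.0, 2 * 3600.0)
    print(f"spot: checkpoint every {spot_interval:.0f} s, "
          f"about {interval_in_steps(spot_interval, 1.5)} steps")
```

On-demand pods would use a much larger mean time between interruptions, so the same formula gives a longer interval. The playbook can cap that interval if losing that much work is unacceptable.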
Deliverables
- docs/runpod_execution.md
- helper scripts for setup/sync/checkpoint validation
- runbook used by #11 (Phase 5: Run RunPod pilot ablation matrix (5k/25k) for paper-fidelity paths) and #12 (Phase 5: Execute mid/target faithful runs and publish final alignment report)
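
One possible shape for the checkpoint validation helper is a checksum manifest written next to each checkpoint and re-verified after every sync or resume. This is a sketch under assumptions: the manifest file name, the function names, and the one-directory-per-checkpoint layout are all hypothetical and are not taken from the repo.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "MANIFEST.json"  # hypothetical file name


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoint shards do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(checkpoint_dir: Path) -> Path:
    """Record size and SHA-256 for every file in the checkpoint directory."""
    skip = {MANIFEST_NAME, MANIFEST_NAME + ".tmp"}
    entries = {
        p.relative_to(checkpoint_dir).as_posix(): {
            "bytes": p.stat().st_size,
            "sha256": sha256_of(p),
        }
        for p in sorted(checkpoint_dir.rglob("*"))
        if p.is_file() and p.name not in skip
    }
    manifest_path = checkpoint_dir / MANIFEST_NAME
    tmp_path = checkpoint_dir / (MANIFEST_NAME + ".tmp")
    tmp_path.write_text(json.dumps(entries, indent=2, sort_keys=True))
    # Rename last so an interrupted write never leaves a partial manifest.
    tmp_path.replace(manifest_path)
    return manifest_path


def validate_checkpoint(checkpoint_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the manifest matches."""
    manifest_path = checkpoint_dir / MANIFEST_NAME
    if not manifest_path.is_file():
        return [f"missing manifest: {MANIFEST_NAME}"]
    problems = []
    for rel, expected in json.loads(manifest_path.read_text()).items():
        target = checkpoint_dir / rel
        if not target.is_file():
            problems.append(f"missing file: {rel}")
        elif target.stat().st_size != expected["bytes"]:
            problems.append(f"size mismatch: {rel}")
        elif sha256_of(target) != expected["sha256"]:
            problems.append(f"checksum mismatch: {rel}")
    return problems
```

Writing the manifest only after the checkpoint files are fully saved means a directory without a manifest can be treated as an incomplete checkpoint and skipped on resume.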
Acceptance Criteria
- Fresh pod can run smoke end-to-end using docs only.
- Resume drill validated with evidence.
- First issue comment contains mandatory reading summary.
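
The "evidence" for the resume drill could be a small machine-readable record attached to the issue. The sketch below is one hypothetical format. The field names and the pass rule (the run resumes from exactly the last saved step) are assumptions, not requirements stated in this ticket.

```python
import json
from datetime import datetime, timezone


def build_resume_evidence(last_saved_step: int,
                          last_seen_step_before_stop: int,
                          resumed_from_step: int) -> dict:
    """Summarize a forced stop/resume drill as a JSON-serializable record."""
    return {
        "last_saved_step": last_saved_step,
        "last_seen_step_before_stop": last_seen_step_before_stop,
        "resumed_from_step": resumed_from_step,
        # Work redone after the resume, in training steps.
        "steps_lost": last_seen_step_before_stop - last_saved_step,
        # Assumed pass rule: resume picks up exactly the last saved checkpoint.
        "passed": (resumed_from_step == last_saved_step
                   and last_seen_step_before_stop >= last_saved_step),
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }


if __name__ == "__main__":
    # Hypothetical drill: saved at step 5000, pod stopped at step 5120,
    # training resumed from step 5000.
    print(json.dumps(build_resume_evidence(5000, 5120, 5000), indent=2))
```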