feat: add GRPO validation infrastructure and LoRA checkpoint support #55

Merged
abrichr merged 3 commits into main from feat/grpo-validation-infra on Mar 17, 2026

abrichr commented Mar 17, 2026

Summary

  • Add evaluate_url and lora_checkpoint fields to GRPOConfig
  • Pass evaluate_url through rollout collector to WAALiveConfig
  • Load existing LoRA via PeftModel.from_pretrained() when lora_checkpoint is set (enables GRPO on top of SFT LoRA)
  • Update verl_backend.py error message with actionable pointers to train_verl_e2e.py and configs
  • Add 5-phase validation script (scripts/validate_grpo_waa.py): connectivity → single rollout → model inference → single training step → multi-step training
  • Add CLI entry point (scripts/run_grpo.py) for running GRPO training without writing Python
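The config plumbing described in the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the names `GRPOConfig`, `WAALiveConfig`, `evaluate_url`, and `lora_checkpoint` come from the summary. The `server_url` field, the defaults, the example URLs and path, and the `build_live_config` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WAALiveConfig:
    """Hypothetical stand-in for the adapter config; the real fields may differ."""
    server_url: str
    evaluate_url: Optional[str] = None


@dataclass
class GRPOConfig:
    """Hypothetical subset of the training config touched by this PR."""
    server_url: str = "http://localhost:5000"   # assumed field and default
    evaluate_url: Optional[str] = None          # separate evaluate endpoint, if any
    lora_checkpoint: Optional[str] = None       # path to an existing SFT LoRA adapter


def build_live_config(cfg: GRPOConfig) -> WAALiveConfig:
    """Forward evaluate_url unchanged (hypothetical helper, not from the PR)."""
    return WAALiveConfig(server_url=cfg.server_url, evaluate_url=cfg.evaluate_url)


# Example values are placeholders.
cfg = GRPOConfig(
    evaluate_url="http://localhost:5001",
    lora_checkpoint="checkpoints/sft_lora",
)
live = build_live_config(cfg)
```

Keeping both new fields optional with a `None` default means existing configs that set neither would behave as before, which is one plausible reason for the existing tests to keep passing.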

Test plan

  • All 56 existing GRPO tests pass
  • Run validate_grpo_waa.py --mock --phase 4 (mock adapter, no VM)
  • Run validate_grpo_waa.py --server-url http://VM:5000 --phase 3 against real WAA VM
  • Run phases 4-5 on GPU instance with Qwen2.5-VL-3B
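The phased invocations in the test plan suggest a runner along these lines. This is a sketch under stated assumptions: the flag names `--mock`, `--phase`, and `--server-url` and the five phase names come from this PR, but the semantics shown here (run phases 1 through N and stop at the first failure) and the helpers `parse_args` and `run_phases` are guesses, not the behavior of `scripts/validate_grpo_waa.py`.

```python
import argparse
from typing import Callable, List, Optional, Sequence, Tuple

# Phase names as listed in the PR summary.
PHASE_NAMES = [
    "connectivity",
    "single rollout",
    "model inference",
    "single training step",
    "multi-step training",
]


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Phased GRPO validation (sketch)")
    parser.add_argument("--phase", type=int, default=5, choices=range(1, 6),
                        help="highest phase to run (assumed meaning)")
    parser.add_argument("--mock", action="store_true",
                        help="use a mock adapter instead of a real VM")
    parser.add_argument("--server-url", default=None,
                        help="base URL of the WAA server")
    return parser.parse_args(argv)


def run_phases(checks: Sequence[Tuple[str, Callable[[], bool]]],
               max_phase: int) -> List[str]:
    """Run checks in order up to max_phase; stop at the first failing check."""
    passed: List[str] = []
    for name, check in checks[:max_phase]:
        if not check():
            break
        passed.append(name)
    return passed


# Placeholder checks that always succeed; real checks would hit the server or model.
args = parse_args(["--mock", "--phase", "4"])
checks = [(name, lambda: True) for name in PHASE_NAMES]
passed = run_phases(checks, args.phase)
```

Ordering the phases from cheapest to most expensive means a connectivity problem surfaces before any GPU time is spent on training steps.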

🤖 Generated with Claude Code

abrichr and others added 3 commits March 16, 2026 22:08
…or GRPO training

- Add evaluate_url field to GRPOConfig for separate evaluate endpoint
- Add lora_checkpoint field to resume GRPO from existing SFT LoRA adapter
- Pass evaluate_url through rollout collector to WAALiveConfig
- Load existing LoRA via PeftModel.from_pretrained() when lora_checkpoint set
- Update verl_backend.py error message with actionable instructions
- Add 5-phase validation script (connectivity → rollout → inference → train → multi-step)
- Add CLI entry point (scripts/run_grpo.py) for running GRPO without writing Python
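Resuming from an SFT adapter, as described in the `lora_checkpoint` items above, might look like the sketch below. `PeftModel.from_pretrained`, `get_peft_model`, and `LoraConfig` are real `peft` APIs, but the `attach_lora` helper, the `is_trainable=True` argument, and the fallback hyperparameters are assumptions for illustration. The PR states only that `PeftModel.from_pretrained()` is called when `lora_checkpoint` is set.

```python
from typing import Any, Optional


def attach_lora(base_model: Any, lora_checkpoint: Optional[str] = None) -> Any:
    """Resume from an existing adapter if a checkpoint is given, else start fresh.

    Hypothetical helper; not the function used in this PR.
    """
    # Imported lazily so this module can be imported without peft installed.
    from peft import LoraConfig, PeftModel, get_peft_model

    if lora_checkpoint:
        # Assumption: is_trainable=True so the loaded adapter weights are not
        # frozen and GRPO can keep updating them.
        return PeftModel.from_pretrained(
            base_model, lora_checkpoint, is_trainable=True
        )

    # Assumed hyperparameters for a fresh adapter; tune for the actual model.
    fresh = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
    return get_peft_model(base_model, fresh)
```

By default `PeftModel.from_pretrained` loads an adapter for inference with its weights frozen, so a training loop that continues from an SFT adapter generally needs to request a trainable adapter explicitly.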

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@abrichr abrichr merged commit 1b8ae78 into main Mar 17, 2026
4 checks passed