feat: add GRPO validation infrastructure and LoRA checkpoint support by abrichr · Pull Request #55 · OpenAdaptAI/openadapt-ml

abrichr · 2026-03-17T02:13:33Z

Summary

Add evaluate_url and lora_checkpoint fields to GRPOConfig
Pass evaluate_url through rollout collector to WAALiveConfig
Load existing LoRA via PeftModel.from_pretrained() when lora_checkpoint is set (enables GRPO on top of SFT LoRA)
Update verl_backend.py error message with actionable pointers to train_verl_e2e.py and configs
Add 5-phase validation script (scripts/validate_grpo_waa.py): connectivity → single rollout → model inference → single training step → multi-step training
Add CLI entry point (scripts/run_grpo.py) for running GRPO training without writing Python
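
A minimal sketch of how the two new fields might be consumed. Only `evaluate_url`, `lora_checkpoint`, and the `PeftModel.from_pretrained()` call come from this PR. `GRPOConfigSketch`, `server_url`, `resolve_evaluate_url`, `attach_lora`, and `make_fresh_adapter` are hypothetical names for illustration, and the fallback to `server_url` and the `is_trainable=True` argument are assumptions, not confirmed behavior of the repository.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class GRPOConfigSketch:
    """Hypothetical stand-in for GRPOConfig (not the real class)."""

    # Assumed pre-existing field; the real field name may differ.
    server_url: str = "http://localhost:5000"
    # New in this PR: separate endpoint used for evaluation.
    evaluate_url: Optional[str] = None
    # New in this PR: path to an existing SFT LoRA adapter.
    lora_checkpoint: Optional[str] = None


def resolve_evaluate_url(cfg: GRPOConfigSketch) -> str:
    # Assumption: fall back to the main server URL when evaluate_url is unset.
    return cfg.evaluate_url or cfg.server_url


def attach_lora(
    base_model: Any,
    cfg: GRPOConfigSketch,
    make_fresh_adapter: Callable[[Any], Any],
) -> Any:
    """Load an existing adapter if configured, else build a fresh one."""
    if cfg.lora_checkpoint:
        # Imported lazily so the sketch runs without peft installed
        # when no checkpoint is configured.
        from peft import PeftModel

        # is_trainable=True keeps the loaded adapter weights trainable,
        # which continuing training on top of SFT would need (assumption).
        return PeftModel.from_pretrained(
            base_model, cfg.lora_checkpoint, is_trainable=True
        )
    # Hypothetical hook: the caller decides how a new adapter is created.
    return make_fresh_adapter(base_model)
```

With `lora_checkpoint` left as `None`, `attach_lora` never touches `peft` and simply delegates to the supplied factory, so the default path can be exercised without a GPU or a checkpoint on disk.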

Test plan

All 56 existing GRPO tests pass
Run validate_grpo_waa.py --mock --phase 4 (mock adapter, no VM)
Run validate_grpo_waa.py --server-url http://VM:5000 --phase 3 against real WAA VM
Run phases 4-5 on GPU instance with Qwen2.5-VL-3B
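
The phase ordering in the test plan can be sketched as a gated runner. The five phase names come from this PR; `run_phases`, the `checks` mapping, the stop-at-first-failure rule, and the reading of a phase number as "run everything up to and including this phase" are assumptions for illustration and may not match `validate_grpo_waa.py`.

```python
from typing import Callable, Dict, List, Tuple

# Phase order as listed in the PR summary.
PHASES = [
    "connectivity",
    "single rollout",
    "model inference",
    "single training step",
    "multi-step training",
]


def run_phases(
    checks: Dict[int, Callable[[], bool]], max_phase: int
) -> List[Tuple[int, str, bool]]:
    """Run phases 1..max_phase in order, stopping at the first failure."""
    results: List[Tuple[int, str, bool]] = []
    for number, name in enumerate(PHASES, start=1):
        if number > max_phase:
            break
        ok = bool(checks[number]())
        results.append((number, name, ok))
        if not ok:
            # Later phases depend on earlier ones, so do not continue.
            break
    return results
```

Gating this way keeps cheap checks (connectivity, a single rollout) ahead of the expensive GPU-bound training phases, so a misconfigured VM is caught before any model is loaded.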

🤖 Generated with Claude Code

…or GRPO training

- Add evaluate_url field to GRPOConfig for separate evaluate endpoint
- Add lora_checkpoint field to resume GRPO from existing SFT LoRA adapter
- Pass evaluate_url through rollout collector to WAALiveConfig
- Load existing LoRA via PeftModel.from_pretrained() when lora_checkpoint set
- Update verl_backend.py error message with actionable instructions
- Add 5-phase validation script (connectivity → rollout → inference → train → multi-step)
- Add CLI entry point (scripts/run_grpo.py) for running GRPO without writing Python

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

abrichr and others added 3 commits March 16, 2026 22:08

style: fix ruff formatting in config and validation script

53f80ee

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into feat/grpo-validation-infra

16128a0

abrichr merged commit 1b8ae78 into main Mar 17, 2026
4 checks passed

abrichr merged 3 commits into main from feat/grpo-validation-infra