feat: add dual training backend support (standalone + verl-agent) #51

Merged
abrichr merged 2 commits into main from feat/dual-training-backend
Mar 4, 2026

abrichr (Member) commented Mar 3, 2026

Summary

  • Add backend field to GRPOConfig ("standalone" or "verl")
  • Create verl_backend.py with build_vagen_config() and train_with_verl() integration point
  • Update __init__.py with new exports and dual-backend documentation
  • Add standalone backend note to trainer.py docstring
  • No existing function signatures or behavior modified
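
As a rough illustration of the first bullet, a config with a `backend` selector could look like the sketch below. `GRPOConfigSketch`, its other fields, the `"standalone"` default, and the validation in `__post_init__` are all assumptions made for this example; the PR only states that a `backend` field accepting `"standalone"` or `"verl"` was added.

```python
from dataclasses import dataclass

# The two values named in the PR summary.
VALID_BACKENDS = ("standalone", "verl")


@dataclass
class GRPOConfigSketch:
    """Illustrative stand-in, not the project's actual GRPOConfig."""

    # Hypothetical pre-existing fields, included only to give the class shape.
    learning_rate: float = 1e-6
    group_size: int = 4
    # Backend selector. Defaulting to "standalone" is an assumption that
    # would keep existing callers on the old code path.
    backend: str = "standalone"

    def __post_init__(self) -> None:
        # Fail early on typos rather than deep inside a training run.
        if self.backend not in VALID_BACKENDS:
            raise ValueError(
                f"backend must be one of {VALID_BACKENDS}, got {self.backend!r}"
            )


default_cfg = GRPOConfigSketch()
verl_cfg = GRPOConfigSketch(backend="verl")
```

Validating in `__post_init__` is one design option; the real class may validate elsewhere or not at all.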

Context

After a comprehensive framework review (see decision doc), we chose verl-agent/VAGEN as the recommended training backend for multi-turn VLM desktop automation. Rather than deprecating the standalone trainer, we keep both backends side by side so they can be compared.

Backend 1 (standalone): Existing trainer.py — single-GPU, episode-level rewards, no external dependencies
Backend 2 (verl): verl-agent/VAGEN — multi-GPU, GiGPO per-step credit assignment, distributed training
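
One way the two backends could coexist behind a single entry point is a small dispatch table keyed on the backend name. This is a hedged sketch only: `run_training`, `train_standalone_stub`, and `train_verl_stub` are hypothetical names, and the stubs return strings instead of training anything.

```python
from typing import Callable, Dict


def train_standalone_stub(task: str) -> str:
    # Stand-in for the existing single-GPU trainer (episode-level rewards).
    return f"standalone:{task}"


def train_verl_stub(task: str) -> str:
    # Stand-in for the verl-agent/VAGEN path (multi-GPU, GiGPO credit assignment).
    return f"verl:{task}"


# Map each backend name to its training callable.
TRAINERS: Dict[str, Callable[[str], str]] = {
    "standalone": train_standalone_stub,
    "verl": train_verl_stub,
}


def run_training(backend: str, task: str) -> str:
    """Look up the trainer for `backend` and run it on `task`."""
    try:
        trainer = TRAINERS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return trainer(task)


result_standalone = run_training("standalone", "demo-task")
result_verl = run_training("verl", "demo-task")
```

Keeping the mapping in one place means a side-by-side comparison only needs to change the `backend` string, not the calling code.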

Test plan

  • uv run pytest tests/test_grpo.py -v — 51/51 pass
  • No existing behavior modified
  • End-to-end comparison on a live WAA task (future work)

🤖 Generated with Claude Code

abrichr and others added 2 commits March 2, 2026 22:00
Add `backend` field to GRPOConfig ("standalone" or "verl") to support
switching between training backends:

- standalone: existing trainer.py (single-GPU, episode-level rewards)
- verl: verl-agent/VAGEN integration (multi-GPU, GiGPO per-step credit)

New verl_backend.py provides build_vagen_config() to map GRPOConfig
to VAGEN-compatible config, and train_with_verl() as the integration
point (placeholder until full end-to-end is wired up).

No existing function signatures or behavior modified.
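
The mapping described above might be sketched as follows. Everything here is assumed for illustration: the `_sketch` function names, the stand-in config class and its fields, and the nested dictionary keys, which are not VAGEN's real schema. Raising `NotImplementedError` is likewise just one common way to write a placeholder.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class GRPOConfigSketch:
    """Illustrative stand-in, not the project's actual GRPOConfig."""

    model_name: str = "example-vlm"   # hypothetical field
    learning_rate: float = 1e-6       # hypothetical field
    num_rollouts: int = 8             # hypothetical field
    backend: str = "verl"


def build_vagen_config_sketch(config: GRPOConfigSketch) -> Dict[str, Any]:
    """Translate the flat config into a nested dict (keys are illustrative)."""
    return {
        "model": {"path": config.model_name},
        "optim": {"lr": config.learning_rate},
        "rollout": {"n": config.num_rollouts},
        "algorithm": {"adv_estimator": "gigpo"},
    }


def train_with_verl_sketch(config: GRPOConfigSketch) -> None:
    """Placeholder integration point: validate, build the config, then stop."""
    if config.backend != "verl":
        raise ValueError("train_with_verl_sketch requires backend='verl'")
    vagen_config = build_vagen_config_sketch(config)
    # A real implementation would hand vagen_config to the external trainer here.
    raise NotImplementedError(
        f"verl training not wired up yet (config sections: {sorted(vagen_config)})"
    )


mapped = build_vagen_config_sketch(GRPOConfigSketch(learning_rate=5e-7))
```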

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
abrichr merged commit 4419b21 into main on Mar 4, 2026
4 checks passed
abrichr deleted the feat/dual-training-backend branch on March 4, 2026 02:21