Pull requests: huggingface/trl

Pull requests list

Add doc-builder style check to pre-commit and CI
#5630 opened Apr 23, 2026 by albertvillanova Member
Use PreTrainedTokenizerBase for tokenizer type hints
#5629 opened Apr 22, 2026 by qgallouedec Member
Add Cohere training chat template (#5471)
#5627 opened Apr 22, 2026 by dschulmeist
4 tasks done
Remove forward_masked_logits
#5626 opened Apr 22, 2026 by qgallouedec Member
Fix entropy calculation in SFT
#5620 opened Apr 22, 2026 by qgallouedec Member
Renaming of internal variables: async_reward_X to async_X
#5616 opened Apr 21, 2026 by qgallouedec Member
Upload testing suite for DistillationTrainer
#5615 opened Apr 21, 2026 by cmpatino Collaborator
3 of 8 tasks
Add LoRA support for AsyncGRPO
#5610 opened Apr 21, 2026 by jonahsamost
2 of 8 tasks
experimental: Self-Distillation Zero
#5609 opened Apr 20, 2026 by LeonEricsson Collaborator
1 of 8 tasks
support prefetch/prefetch_depth for async GRPO for ~5% speedups
#5602 opened Apr 20, 2026 by winglian Contributor
1 of 8 tasks
Fix nested vocab_size for DistillationTrainer and GOLDTrainer
#5592 opened Apr 19, 2026 by Beichen-Ma
2 of 8 tasks
feat: add TargetPO trainer
#5591 opened Apr 18, 2026 by JeanKaddour Draft
4 of 8 tasks
Add tiny Qwen3-4B-Instruct-2507
#5586 opened Apr 17, 2026 by qgallouedec Member
Chunked cross-entropy loss for SFT (up to –50% VRAM)
#5575 opened Apr 17, 2026 by qgallouedec Member
Add training chat template for Qwen3-2507
#5574 opened Apr 16, 2026 by SwayamInSync Contributor
refactor: self distillation trainers (sdpo/sdft/...)
#5573 opened Apr 16, 2026 by LeonEricsson Collaborator
2 of 8 tasks
Improve BrowserGym examples for latest OpenEnv version
#5568 opened Apr 16, 2026 by sergiopaniego Member
8 tasks
Set _tokenizer attribute in experimental trainers
#5566 opened Apr 16, 2026 by albertvillanova Member
Revert VLM support in parse_response
#5561 opened Apr 15, 2026 by qgallouedec Member Draft
Accept processor in get_training_chat_template
#5560 opened Apr 15, 2026 by qgallouedec Member