huggingface / trl Public

Pull requests: huggingface/trl


141 Open 3,032 Closed


Pull requests list

Align KTO with DPO: Move completion assembly from _prepare_dataset to data collator

#5632 opened Apr 23, 2026 by albertvillanova Member

Align and update doc-builder commit hash in CI GitHub Actions

#5631 opened Apr 23, 2026 by albertvillanova Member

Add doc-builder style check to pre-commit and CI

#5630 opened Apr 23, 2026 by albertvillanova Member

Use PreTrainedTokenizerBase for tokenizer type hints

#5629 opened Apr 22, 2026 by qgallouedec Member

Add Cohere training chat template (#5471)

#5627 opened Apr 22, 2026 by dschulmeist

4 tasks done

Remove forward_masked_logits

#5626 opened Apr 22, 2026 by qgallouedec Member

feat: Add generation_kwargs support to LogCompletionsCallback and Wea…

#5625 opened Apr 22, 2026 by LhaseParth2610

4 of 8 tasks

Fix entropy calculation in SFT

#5620 opened Apr 22, 2026 by qgallouedec Member

Renaming of internal variables: async_reward_X to async_X

#5616 opened Apr 21, 2026 by qgallouedec Member

Upload testing suite for DistillationTrainer

#5615 opened Apr 21, 2026 by cmpatino Collaborator

3 of 8 tasks

Add LoRA support for AsyncGRPO

#5610 opened Apr 21, 2026 by jonahsamost

2 of 8 tasks

experimental: Self-Distillation Zero

#5609 opened Apr 20, 2026 by LeonEricsson Collaborator

1 of 8 tasks

support prefetch/prefetch_depth for async GRPO for ~5% speedups

#5602 opened Apr 20, 2026 by winglian Contributor

1 of 8 tasks

fix(distillation): reverse-KL server path NaN on variable completion length

#5594 opened Apr 19, 2026 by k1064190

3 of 8 tasks

Fix nested vocab_size for DistillationTrainer and GOLDTrainer

#5592 opened Apr 19, 2026 by Beichen-Ma

2 of 8 tasks

feat: add TargetPO trainer

#5591 opened Apr 18, 2026 by JeanKaddour • Draft

4 of 8 tasks

Add tiny Qwen3-4B-Instruct-2507

#5586 opened Apr 17, 2026 by qgallouedec Member

Chunked cross-entropy loss for SFT (up to –50% VRAM)

#5575 opened Apr 17, 2026 by qgallouedec Member

Add training chat template for Qwen3-2507

#5574 opened Apr 16, 2026 by SwayamInSync Contributor

refactor: self distillation trainers (sdpo/sdft/...)

#5573 opened Apr 16, 2026 by LeonEricsson Collaborator

2 of 8 tasks

Fix empty-target self-distillation loss to stay connected to model graph

#5572 opened Apr 16, 2026 by walawalagoose

3 of 8 tasks

Improve BrowserGym examples for latest OpenEnv version

#5568 opened Apr 16, 2026 by sergiopaniego Member

8 tasks

Set _tokenizer attribute in experimental trainers

#5566 opened Apr 16, 2026 by albertvillanova Member

Revert VLM support in parse_response

#5561 opened Apr 15, 2026 by qgallouedec Member • Draft

Accept processor in get_training_chat_template

#5560 opened Apr 15, 2026 by qgallouedec Member
