Pull requests: huggingface/trl

Pull requests list

AsyncGRPOTrainer: add PEFT/LoRA support
#5896 opened May 31, 2026 by rycerzes Contributor
5 of 8 tasks
AsyncGRPOTrainer: add ProcessorMixin handling
#5895 opened May 31, 2026 by rycerzes Contributor
5 of 8 tasks
AsyncGRPOTrainer: add sampling parameters (top_p, top_k, min_p, repetition_penalty)
#5894 opened May 31, 2026 by rycerzes Contributor
5 of 8 tasks
AsyncGRPOTrainer: add model_init_kwargs support
#5893 opened May 31, 2026 by rycerzes Contributor
5 of 8 tasks
async grpo native weight sync with vllm>=0.22.0
#5892 opened May 30, 2026 by AmineDiro Member
Fix GRPO use_liger_kernel under DeepSpeed ZeRO-3
#5891 opened May 30, 2026 by kashif Collaborator
Cross-tokenizer alignment via byte offsets in GOLD trainer
#5885 opened May 29, 2026 by kashif Collaborator
4 of 8 tasks
[2/2] refactor: decoupled self distillation trainers; cleanup
#5883 opened May 29, 2026 by LeonEricsson Collaborator
8 tasks
Simplify reference model handling in GRPO/RLOO
#5877 opened May 29, 2026 by albertvillanova Member
Simplify reference model handling in DPO
#5876 opened May 29, 2026 by albertvillanova Member
Fix loss_type="chunked_nll" under DeepSpeed ZeRO-3
#5873 opened May 27, 2026 by qgallouedec Member
Removed generate_rollout_completions
#5870 opened May 27, 2026 by sergiopaniego Member
8 tasks
[1/2] refactor: decoupled self distillation trainers (sdpo, sdft, ...)
#5862 opened May 27, 2026 by LeonEricsson Collaborator
4 of 12 tasks
Handle empty conversational fields in dataset format checks
#5860 opened May 27, 2026 by emery-Xu
3 of 8 tasks
Fix Qwen3.5 vLLM weight name remapping
#5858 opened May 27, 2026 by haimianxing
5 of 6 tasks
Padding-free training in AsyncGRPO
#5854 opened May 26, 2026 by qgallouedec Member
[WIP] Chunked DPO loss (MVP)
#5853 opened May 26, 2026 by qgallouedec Member