huggingface / trl Public

generated from fastai/nbdev_template

Pull requests: huggingface/trl

83 Open 2,397 Closed

Pull requests list

Extend CLI to orpo trainer

#4757 opened Dec 27, 2025 by murilo-cunha

3 of 5 tasks

fix: handle None eval_dataset in example code

#4756 opened Dec 27, 2025 by ciaoyizhen

1 of 4 tasks

perf: avoid output_hidden_states when only last_hidden_state is used

#4755 opened Dec 27, 2025 by ciaoyizhen

2 of 5 tasks

vllm parameter passthrough for stop sequences

#4754 opened Dec 26, 2025 by kdubovikov

Fix GRPO scale_rewards type specification to fix __post_init__ validation

#4752 opened Dec 26, 2025 by apalmas-saifh

1 of 5 tasks

Clarify Accelerate usage in SFTTrainer documentation

#4744 opened Dec 23, 2025 by Likhita-17

1 task done

fix minillm trainer

#4743 opened Dec 23, 2025 by t1101675

5 tasks

[GRPOTrainer]: Agent Training Supports Async Tool Calls

#4742 opened Dec 23, 2025 by pramodith

5 tasks done

[WIP] feat: Bidirectional masked importance sampling ratio (MIS) for IcePop

#4732 opened Dec 20, 2025 by casinca • Draft

5 tasks

Fix MiniLLM Training

#4731 opened Dec 20, 2025 by t1101675

Up to 50% less VRAM during forward with forward_masked_logits function

#4729 opened Dec 20, 2025 by qgallouedec

Improve PEFT integration

#4723 opened Dec 19, 2025 by qgallouedec

Refactor vLLM generation [2/N]: Decouple rollout_func and vLLM functionalities

#4712 opened Dec 17, 2025 by albertvillanova • Draft

Refactor vLLM generation [1/N]: Extract vLLM generation

#4700 opened Dec 16, 2025 by albertvillanova

fix: invalidate ZeRO-3 param coordinator trace in add_hooks

#4693 opened Dec 15, 2025 by roycho96

1 of 5 tasks

feat: DeepSeek V3.2 Off-policy sequence masking

#4689 opened Dec 13, 2025 by casinca

4 of 5 tasks

GKDTrainer: Fix return_outputs in Liger kernel path and update tests

#4688 opened Dec 13, 2025 by roycho96

2 of 5 tasks

Update import structure

#4665 opened Dec 11, 2025 by qgallouedec

[WIP] GRPO-inspired Online DPO refactor

#4659 opened Dec 10, 2025 by d-tiapkin • Draft

2 of 7 tasks

feature: Add RTPO Trainer

#4652 opened Dec 9, 2025 by SolarWindRider

6 tasks done

CPOTrainer - Incorrect handling of different length chosen/rejected p…

#4639 opened Dec 8, 2025 by davmels

Update docs landing with latest details

#4624 opened Dec 4, 2025 by sergiopaniego

6 tasks

Add cross-tokenizer distillation support for GKD and MiniLLM trainers

#4561 opened Nov 22, 2025 by sambhavnoobcoder

Add PSPO trust region method as alternative to clipping in GRPOTrainer

#4548 opened Nov 19, 2025 by MCDwyer

2 of 5 tasks

Add compute_metrics parameter for GRPOTrainer

#4534 opened Nov 17, 2025 by colinzhaoxp
