💬 Add chat to vLLM client and server, update trainer calls #4450

qgallouedec merged 7 commits into main
Conversation
```python
        ]
        return {"prompt_ids": prompt_ids, "completion_ids": completion_ids, "logprobs": logprobs}


class ChatRequest(BaseModel):
```
Exactly the same as `generate`, except:
- images are within the messages (so we drop the `images` argument)
- a `chat_template_kwargs` argument is added
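For illustration only, here is a minimal sketch of what such a request model could look like. Only `messages` and `chat_template_kwargs` are taken from the discussion above; the sampling fields and their defaults are assumptions mirroring a typical generate endpoint, not the PR's actual schema:

```python
from typing import Any, Optional

from pydantic import BaseModel


class ChatRequest(BaseModel):
    # Chat-format conversations; images, if any, travel inside the message
    # contents (e.g. OpenAI-style content parts), so there is no separate
    # `images` field as the generate request has.
    messages: list[list[dict[str, Any]]]
    # Sampling parameters mirroring the generate endpoint (names and defaults
    # here are assumptions).
    n: int = 1
    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: int = 16
    # Extra kwargs forwarded to the tokenizer's `apply_chat_template`,
    # e.g. {"add_generation_prompt": True}.
    chat_template_kwargs: Optional[dict[str, Any]] = None
```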
```diff
-        # FIXME: this endpoint doesn't exist in vllm_client
         output = self.vllm_client.chat(
-            prompts=ordered_set_of_prompts,
+            messages=ordered_set_of_prompts,
```
I use "messages" instead of "prompt" to align with vLLM
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
albertvillanova left a comment
Thanks: solid implementation! Just some comments and minor suggestions below.
lewtun left a comment
LGTM with a suggestion to double-check models like Llama are not getting a double BOS token.
```python
        for seq in completion_ids:
            assert all(isinstance(tok, int) for tok in seq)

    def test_chat(self):
```
It would be good to check that the issues with double BOS tokens getting inserted have been fully resolved (e.g. for a Llama model): vllm-project/vllm#9519
@edbeeching ran into this during https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute and it has a subtle but negative impact on the generations.
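One way to check this locally (a sketch, not part of the PR; the model name is just an example of a BOS-prepending checkpoint):

```python
from transformers import AutoTokenizer

# Example Llama-family checkpoint whose chat template prepends BOS itself.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
messages = [{"role": "user", "content": "Hello!"}]

# Render the template to text, then tokenize WITHOUT special tokens: the
# template already inserted BOS, so tokenization must not add a second one
# (the failure mode from vllm-project/vllm#9519).
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
ids = tokenizer(text, add_special_tokens=False)["input_ids"]

bos = tokenizer.bos_token_id
assert ids[:2] != [bos, bos], "double BOS detected"
```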
Co-authored-by: Albert Villanova del Moral <[email protected]>
```python
        for seq in completion_ids:
            assert all(isinstance(tok, int) for tok in seq)

    def test_chat(self):
```
@qgallouedec I'm sorry, but I'm not able to run this test.
Could you please give me a hint about the environment requirements so I can run it?
Thanks! 🤗
We might have to mock the response?
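A rough sketch of what that mocking could look like. It assumes the client talks to the server over `requests`; which call sites to patch, and whether `__init__` needs more patching for its health check or communicator setup, would need checking against the actual `VLLMClient`:

```python
from unittest.mock import MagicMock, patch

from trl.extras.vllm_client import VLLMClient

# Canned payload shaped like the chat endpoint's response (see the diff above).
fake_response = MagicMock()
fake_response.status_code = 200
fake_response.json.return_value = {
    "prompt_ids": [[1, 2, 3]],
    "completion_ids": [[4, 5, 6]],
    "logprobs": [[-0.1, -0.2, -0.3]],
}

# Patch the HTTP layer so neither the constructor's server check nor chat()
# needs a live server. The exact call sites patched here are assumptions.
with patch("requests.get", return_value=fake_response), \
     patch("requests.Session.post", return_value=fake_response):
    client = VLLMClient()
    output = client.chat(messages=[[{"role": "user", "content": "Hi"}]])

for seq in output["completion_ids"]:
    assert all(isinstance(tok, int) for tok in seq)
```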
Slow tests pass locally, and GRPO training works as well.