
AI & ML interests

None defined yet.

Recent Activity

sergiopaniego posted an update 4 days ago
New drop! 💥 The VLM Object Understanding Comparison Space now runs with Qwen3-VL-4B and moondream3.

You can compare how models reason about images 🧠

Bonus: thanks to @ariG23498, you now get auto-suggested prompts to explore faster.

Let's gooo

sergiopaniego/vlm_object_understanding
sergiopaniego posted an update 7 days ago
@Qwen released their new small and dense VLMs (Qwen3-VL).

They're incredibly capable and among my all-time favourite VLMs.

🤗 We've prepared some resources to help you get started.

> Fine-tune Qwen3-VL-4B with SFT or GRPO (free Colab notebooks):
> SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb
> GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb

> Compare object detection vs. Moondream3:
sergiopaniego/vlm_object_understanding

> Fine-tune from the CLI using TRL:
https://github.com/kashif/Qwen3-VL/blob/trl-sft/qwen-vl-finetune/README.md#trl-based-training-single-gpu
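
If you just want to poke at the model before fine-tuning, here's a minimal inference sketch with transformers. It assumes a recent transformers release with Qwen3-VL support; the image URL and prompt are placeholders.

```python
# Minimal Qwen3-VL inference sketch (assumes a recent transformers
# release with Qwen3-VL support; image URL and prompt are placeholders).
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-4B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/street.jpg"},  # placeholder
        {"type": "text", "text": "List the objects you see in this image."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```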
sergiopaniego posted an update 12 days ago
Super nice intro to fine-tuning with TRL, just dropped by @google (runs free on Colab)!

They use SFT + QLoRA to fine-tune the tiny Gemma 3 270M model for emoji generation.

Here's what the fine-tuned model generates for the prompt: “I'm learning to tweet” → 🐦🗣💻

Colab: https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Demos/Emoji-Gemma-on-Web/resources/Fine_tune_Gemma_3_270M_for_emoji_generation.ipynb
Try it out: google/emoji-gemma
Learn more: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/
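
For a feel of what the notebook does under the hood, here's a rough SFT + QLoRA sketch with TRL and PEFT. The dataset id below is a placeholder assumption, not the notebook's, and the hyperparameters are illustrative.

```python
# Rough SFT + QLoRA sketch in the spirit of the notebook; the dataset
# id is a placeholder assumption, not the notebook's.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-270m-it",
    quantization_config=bnb_config,
    device_map="auto",
)

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="emoji-gemma"),
    train_dataset=load_dataset("your-org/text-to-emoji", split="train"),  # placeholder
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
)
trainer.train()
```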
sergiopaniego posted an update 15 days ago
Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support, and in this new recipe we show how to leverage it for efficient online training. Run it on Colab ⚡, then scale to multi-GPU/multi-node!

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training
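
As a taste of what the recipe covers, here's a minimal GRPO setup with TRL's built-in vLLM generation. The model, dataset, and toy reward below are stand-ins, not the recipe's exact choices.

```python
# Minimal GRPO + vLLM sketch; model, dataset, and the toy reward are
# stand-ins, not the recipe's exact choices.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Toy reward: prefer completions close to 50 characters.
    return [-abs(50 - len(c)) for c in completions]

training_args = GRPOConfig(
    output_dir="qwen-grpo-vllm",
    use_vllm=True,         # generate rollouts with vLLM instead of model.generate
    vllm_mode="colocate",  # run vLLM inside the trainer process ("server" also exists)
)
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=load_dataset("trl-lib/tldr", split="train"),
)
trainer.train()
```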
sergiopaniego posted an update 16 days ago
A few days ago, Thinking Machines Lab released “LoRA Without Regret”, showing that LoRA can match full fine-tuning performance when configured right.

Naturally, we decided to reproduce the results with TRL and release a guide!

https://huggingface.co/docs/trl/main/en/lora_without_regret
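
The gist of the guide in a few lines: apply LoRA to every linear layer with a generous rank and a higher learning rate than you'd use for full fine-tuning. A sketch, with illustrative hyperparameters rather than the guide's exact values:

```python
# Sketch of the recipe's core idea: LoRA on every linear layer with a
# generous rank. Hyperparameters are illustrative, not the guide's.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

peft_config = LoraConfig(
    r=256,                        # high rank so capacity doesn't bottleneck
    lora_alpha=16,
    target_modules="all-linear",  # include MLP layers, not just attention
)
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    args=SFTConfig(output_dir="lora-without-regret", learning_rate=1e-4),
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),
    peft_config=peft_config,
)
trainer.train()
```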
sergiopaniego posted an update 21 days ago
sergiopaniego posted an update 26 days ago
You need to try this tool! 🫡

My colleague @Molbap built an interactive HF Space to explore the modular support of open models in transformers over time.

👀 You'll spot things like 🦙 llama defining many models, or which ones could be modular next.

Try it: Molbap/transformers-modular-refactor
sergiopaniego posted an update 26 days ago
How fast can you spin up a state-of-the-art OCR model on Hugging Face Inference Endpoints with vLLM?

Let's break it down step by step.

1️⃣ Create your endpoint
Go to Hugging Face Endpoints → + NEW
Select Deploy from Hub → rednote-hilab/dots.ocr → Configure 🛠️

2️⃣ Configure hardware & container
Pick hardware: AWS / GPU / L4 ⚡
Set container: vLLM 🐇
Click Create ✅

3️⃣ Update endpoint settings
Container: set Container URI to vllm/vllm-openai:nightly → Update
Advanced: add the flag --trust-remote-code → Update ⚠️

4️⃣ Run inference
Download the script 📝: ariG23498/useful-scripts
Set your HF_TOKEN and update base_url in the script.
Run it. ✅

Your OCR model is now live via HF Inference Endpoints!
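
Step 4 boils down to a plain OpenAI-compatible chat request, since the vLLM container exposes that API. A sketch (the base_url is a placeholder for the URL shown on your endpoint page):

```python
# Sketch of step 4: vLLM serves an OpenAI-compatible API, so any OpenAI
# client works. The base_url is a placeholder for your endpoint's URL.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint>.endpoints.huggingface.cloud/v1",  # placeholder
    api_key=os.environ["HF_TOKEN"],
)
response = client.chat.completions.create(
    model="rednote-hilab/dots.ocr",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},  # placeholder
            {"type": "text", "text": "Extract all the text in this image."},
        ],
    }],
)
print(response.choices[0].message.content)
```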
sergiopaniego posted an update 27 days ago
💥 Tons of new material just landed in the smol-course! 🧑‍💻

> evaluation
> alignment
> VLMs
> quizzes
> assignments!
> certificates! 👩‍🎓

go learn! 👉 https://huggingface.co/learn/smol-course/unit0/1
sergiopaniego posted an update 30 days ago
This summer TRL leveled up for multimodal alignment 🌞

✅ New VLM alignment methods (MPO, GRPO, GSPO)
✅ Extended RLOO & Online DPO for VLMs
✅ Native SFT support
✅ Ready-to-use training scripts

🔗 https://huggingface.co/blog/trl-vlm-alignment
sergiopaniego posted an update about 1 month ago
sergiopaniego posted an update about 1 month ago
Training long-context LLMs is getting easier!

TRL now supports Context Parallelism (CP), letting you scale sequences across multiple GPUs, even multi-node setups, seamlessly 💆
Combine TRL and accelerate, and you can run it effortlessly!

With 8 GPUs, CP enables 300k+ token sequences while keeping throughput reasonable.
Works for both full fine-tuning and LoRA, unlocking contexts that used to hit OOM 📈

Check out the full guide here 👉 https://huggingface.co/docs/trl/main/en/distributing_training#context-parallelism

If you want to learn more about Context Parallelism, check out the Ultrascale Playbook 👉 nanotron/ultrascale-playbook
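
Programmatically, recent accelerate releases expose this through a ParallelismConfig. A rough sketch below; the import path and cp_size argument reflect my reading of recent accelerate versions rather than the linked guide, so treat them as assumptions and defer to the docs.

```python
# Rough sketch of wiring up context parallelism; the ParallelismConfig
# import and the cp_size kwarg are assumptions based on recent
# accelerate releases. Defer to the TRL guide above for exact setup.
from accelerate import Accelerator
from accelerate.parallelism_config import ParallelismConfig

# Shard each sequence across 8 GPUs: every rank attends over ~1/8 of
# the tokens, which is what unlocks 300k+ token contexts.
pc = ParallelismConfig(cp_size=8)
accelerator = Accelerator(parallelism_config=pc)
```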
sergiopaniego posted an update about 1 month ago
Thinking about learning the keys to post-training LLMs? 🧐

We just updated and released the smol course: the fastest track to mastering fine-tuning for large language models. Free, hands-on, up-to-date, and it comes with a certificate! 🫰

What you'll get:
📖 Instruction tuning & preference alignment
🧑‍💻 Hands-on projects with TRL & Transformers
🏆 Challenges & community projects
🎓 Certificate of completion

go: hf.co/learn/smol-course
sergiopaniego posted an update about 1 month ago
gpt-oss was possible thanks to new engineering efforts in 🤗 transformers. We just dropped a blog covering them:

- Kernels from the Hub
- MXFP4 Quantization
- Tensor & Expert Parallelism
- Dynamic Sliding Window & Cache
- Continuous Batching & Paged Attention

Grab a coffee & dive in! ☕️

https://huggingface.co/blog/faster-transformers
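
To see those pieces in action, loading the model is enough. A minimal sketch, assuming a transformers release with gpt-oss support and enough GPU memory; the prompt is a placeholder.

```python
# Minimal gpt-oss sketch with transformers (assumes a release with
# gpt-oss support and enough GPU memory; prompt is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain MXFP4 in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```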
sergiopaniego posted an update about 1 month ago
TRL Jobs is the quickest way to start training with TRL on Hugging Face infra, straight from the CLI 🏭

No setup hassle:
✅ Ships with 19 optimized configs, built-in trackio tracking, and support for models up to 32B.

Try it out 👉 https://github.com/huggingface/trl-jobs