
AI & ML interests

None defined yet.

Recent Activity

sergiopaniego posted an update 4 days ago
New drop! 💥 The VLM Object Understanding Comparison Space now runs with Qwen3-VL-4B and moondream3.

You can compare how models reason about images 🧠

Bonus: thanks to @ariG23498, you now get auto-suggested prompts to explore faster.

Let's gooo

sergiopaniego/vlm_object_understanding
sergiopaniego posted an update 7 days ago
@Qwen released their new small and dense VLMs (Qwen3-VL).

They're incredibly capable and among my all-time favourite VLMs.

🤗 We've prepared some resources to help you get started.

> Fine-tune Qwen3-VL-4B with SFT or GRPO (free Colab notebooks):
> SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb
> GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb

> Compare object detection vs. Moondream3:
sergiopaniego/vlm_object_understanding

> Fine-tune from the CLI using TRL:
https://github.com/kashif/Qwen3-VL/blob/trl-sft/qwen-vl-finetune/README.md#trl-based-training-single-gpu
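
If you just want to poke at the model before fine-tuning, here's a minimal inference sketch with transformers. It assumes a recent transformers release with Qwen3-VL support; the image URL and prompt are placeholders.

```python
# Minimal Qwen3-VL inference sketch (assumes a recent transformers
# release with Qwen3-VL support; image URL and prompt are placeholders).
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-4B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/street.jpg"},  # placeholder
        {"type": "text", "text": "List the objects you see in this image."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```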
sergiopaniego posted an update 12 days ago
Super nice intro to fine-tuning with TRL, just dropped by @google (runs free on Colab)!

They use SFT + QLoRA to fine-tune the tiny Gemma 3 270M model for emoji generation.

Here's what the fine-tuned model generates for the prompt: “I'm learning to tweet” → 🐦🗣💻

Colab: https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Demos/Emoji-Gemma-on-Web/resources/Fine_tune_Gemma_3_270M_for_emoji_generation.ipynb
Try it out: google/emoji-gemma
Learn more: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/
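
For a feel of what the notebook does under the hood, here's a rough SFT + QLoRA sketch with TRL and PEFT. The dataset id below is a placeholder assumption, not the notebook's, and the hyperparameters are illustrative.

```python
# Rough SFT + QLoRA sketch in the spirit of the notebook; the dataset
# id is a placeholder assumption, not the notebook's.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-270m-it",
    quantization_config=bnb_config,
    device_map="auto",
)

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="emoji-gemma"),
    train_dataset=load_dataset("your-org/text-to-emoji", split="train"),  # placeholder
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
)
trainer.train()
```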
sergiopaniego posted an update 15 days ago
Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support, and in this new recipe we show how to leverage it for efficient online training. Run it on Colab ⚡, then scale to multi-GPU/multi-node!

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training
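
As a taste of what the recipe covers, here's a minimal GRPO setup with TRL's built-in vLLM generation. The model, dataset, and toy reward below are stand-ins, not the recipe's exact choices.

```python
# Minimal GRPO + vLLM sketch; model, dataset, and the toy reward are
# stand-ins, not the recipe's exact choices.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Toy reward: prefer completions close to 50 characters.
    return [-abs(50 - len(c)) for c in completions]

training_args = GRPOConfig(
    output_dir="qwen-grpo-vllm",
    use_vllm=True,         # generate rollouts with vLLM instead of model.generate
    vllm_mode="colocate",  # run vLLM inside the trainer process ("server" also exists)
)
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=load_dataset("trl-lib/tldr", split="train"),
)
trainer.train()
```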
sergiopaniego posted an update 16 days ago
A few days ago, Thinking Machines Lab released “LoRA Without Regret”, showing that LoRA can match full fine-tuning performance when configured right.

Naturally, we decided to reproduce the results with TRL and release a guide!

https://huggingface.co/docs/trl/main/en/lora_without_regret
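
The gist of the guide in a few lines: apply LoRA to every linear layer with a generous rank and a higher learning rate than you'd use for full fine-tuning. A sketch, with illustrative hyperparameters rather than the guide's exact values:

```python
# Sketch of the recipe's core idea: LoRA on every linear layer with a
# generous rank. Hyperparameters are illustrative, not the guide's.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

peft_config = LoraConfig(
    r=256,                        # high rank so capacity doesn't bottleneck
    lora_alpha=16,
    target_modules="all-linear",  # include MLP layers, not just attention
)
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    args=SFTConfig(output_dir="lora-without-regret", learning_rate=1e-4),
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),
    peft_config=peft_config,
)
trainer.train()
```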
sergiopaniego posted an update 21 days ago
sergiopaniego posted an update 26 days ago
You need to try this tool! 🫡

My colleague @Molbap built an interactive HF Space to explore the modular support of open models in transformers over time.

👀 You'll spot things like 🦙 llama defining many models, or which ones could be modular next.

Try it: Molbap/transformers-modular-refactor
sergiopaniego posted an update 26 days ago
How fast can you spin up a state-of-the-art OCR model on Hugging Face Inference Endpoints with vLLM?

Let's break it down step by step.

1️⃣ Create your endpoint
Go to Hugging Face Endpoints → + NEW
Select Deploy from Hub → rednote-hilab/dots.ocr → Configure 🛠️

2️⃣ Configure hardware & container
Pick hardware: AWS / GPU / L4 ⚡
Set container: vLLM 🐇
Click Create ✅

3️⃣ Update endpoint settings
Container: set Container URI to vllm/vllm-openai:nightly → Update
Advanced: add the flag --trust-remote-code → Update ⚠️

4️⃣ Run inference
Download the script 📝: ariG23498/useful-scripts
Set your HF_TOKEN and update base_url in the script.
Run it. ✅

Your OCR model is now live via HF Inference Endpoints!
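
Step 4 boils down to a plain OpenAI-compatible chat request, since the vLLM container exposes that API. A sketch (the base_url is a placeholder for the URL shown on your endpoint page):

```python
# Sketch of step 4: vLLM serves an OpenAI-compatible API, so any OpenAI
# client works. The base_url is a placeholder for your endpoint's URL.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint>.endpoints.huggingface.cloud/v1",  # placeholder
    api_key=os.environ["HF_TOKEN"],
)
response = client.chat.completions.create(
    model="rednote-hilab/dots.ocr",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},  # placeholder
            {"type": "text", "text": "Extract all the text in this image."},
        ],
    }],
)
print(response.choices[0].message.content)
```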
sergiopaniego posted an update 27 days ago
💥 Tons of new material just landed in the smol-course! 🧑‍💻

> evaluation
> alignment
> VLMs
> quizzes
> assignments!
> certificates! 👩‍🎓

go learn! 👉 https://huggingface.co/learn/smol-course/unit0/1
sergiopaniego posted an update 30 days ago
This summer TRL leveled up for multimodal alignment 🌞

✅ New VLM alignment methods (MPO, GRPO, GSPO)
✅ Extended RLOO & Online DPO for VLMs
✅ Native SFT support
✅ Ready-to-use training scripts

🔗 https://huggingface.co/blog/trl-vlm-alignment
sergiopaniego posted an update about 1 month ago
sergiopaniego posted an update about 1 month ago
Training long-context LLMs is getting easier!

TRL now supports Context Parallelism (CP), letting you scale sequences across multiple GPUs, even multi-node setups, seamlessly 💆
Combine TRL and accelerate, and you can run it effortlessly!

With 8 GPUs, CP enables 300k+ token sequences while keeping throughput reasonable.
Works for both full fine-tuning and LoRA, unlocking contexts that used to hit OOM 📈

Check out the full guide here 👉 https://huggingface.co/docs/trl/main/en/distributing_training#context-parallelism

If you want to learn more about Context Parallelism, check out the Ultrascale Playbook 👉 nanotron/ultrascale-playbook
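
Programmatically, recent accelerate releases expose this through a ParallelismConfig. A rough sketch below; the import path and cp_size argument reflect my reading of recent accelerate versions rather than the linked guide, so treat them as assumptions and defer to the docs.

```python
# Rough sketch of wiring up context parallelism; the ParallelismConfig
# import and the cp_size kwarg are assumptions based on recent
# accelerate releases. Defer to the TRL guide above for exact setup.
from accelerate import Accelerator
from accelerate.parallelism_config import ParallelismConfig

# Shard each sequence across 8 GPUs: every rank attends over ~1/8 of
# the tokens, which is what unlocks 300k+ token contexts.
pc = ParallelismConfig(cp_size=8)
accelerator = Accelerator(parallelism_config=pc)
```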
sergiopaniego posted an update about 1 month ago
Thinking about learning the keys to post-training LLMs? 🧐

We just updated and released the smol course: the fastest track to mastering fine-tuning for large language models. Free, hands-on, up-to-date, and it comes with a certificate! 🫰

What you'll get:
📖 Instruction tuning & preference alignment
🧑‍💻 Hands-on projects with TRL & Transformers
🏆 Challenges & community projects
🎓 Certificate of completion

go: hf.co/learn/smol-course
sergiopaniego posted an update about 1 month ago
gpt-oss was possible thanks to new engineering efforts in 🤗 transformers. We just dropped a blog covering them:

- Kernels from the Hub
- MXFP4 Quantization
- Tensor & Expert Parallelism
- Dynamic Sliding Window & Cache
- Continuous Batching & Paged Attention

Grab a coffee & dive in! ☕️

https://huggingface.co/blog/faster-transformers
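
To see those pieces in action, loading the model is enough. A minimal sketch, assuming a transformers release with gpt-oss support and enough GPU memory; the prompt is a placeholder.

```python
# Minimal gpt-oss sketch with transformers (assumes a release with
# gpt-oss support and enough GPU memory; prompt is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain MXFP4 in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```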
sergiopaniego posted an update about 1 month ago
TRL Jobs is the quickest way to start training with TRL on Hugging Face infra, straight from the CLI 🏭

No setup hassle:
✅ Ships with 19 optimized configs, built-in trackio tracking, and support for models up to 32B.

Try it out 👉 https://github.com/huggingface/trl-jobs