AI & ML interests

None defined yet.

Recent Activity

qgallouedecΒ  updated a dataset 2 days ago
trl-lib/documentation-images
sergiopaniegoΒ  updated a dataset 5 days ago
trl-lib/documentation-images
ShirinYamaniΒ  updated a dataset 16 days ago
trl-lib/documentation-images
View all activity

sergiopaniegoΒ 
posted an update 4 days ago
view post
Post
3765
You can now supercharge your TRL training pipelines with kernels

πŸ‘· kernels is new library to load optimized compute kernels directly from the Hub

Combined with TRL, it makes you developer experience smoother & faster.

Check out the new guide to learn more! πŸ•Ί

Learn ➑️ https://huggingface.co/docs/trl/main/en/kernels_hub
sergiopaniegoΒ 
posted an update 12 days ago
view post
Post
405
It's now posible to do end-2-end ML without leaving the @huggingface Hub, by combining TRL + HF jobs + Trackio!!

🐑We just released a full guide explaining the process.

Go check it out!

πŸ“– Guide: https://huggingface.co/docs/trl/main/en/jobs_training

πŸ’‘ Reminder: HF Jobs is only available for Pro, Team, or Enterprise plans. Yet another reason to upgrade
sergiopaniegoΒ 
posted an update 27 days ago
sergiopaniegoΒ 
posted an update 28 days ago
view post
Post
405
New Zero-Shot Object Detectors in transformers! πŸ₯½

We’ve added LLMDet and MM GroundingDINO, plus a demo Space to compare them with others πŸ–ΌοΈ

Play with it: ariG23498/zero-shot-od
sergiopaniegoΒ 
posted an update 28 days ago
sergiopaniegoΒ 
posted an update about 1 month ago
view post
Post
456
Latest TRL release brings major upgrades for multimodal alignment!

We dive into 3 new techniques to improve VLM post-training in our new blog:

πŸŒ‹ GRPO
🎞️ GSPO
πŸ™ MPO
βž• vLLM integration for online training w/ transformers backend\

🐑 Blog: https://huggingface.co/blog/trl-vlm-alignment
sergiopaniegoΒ 
posted an update about 1 month ago
sergiopaniegoΒ 
posted an update about 1 month ago
view post
Post
3417
Want to learn how to align a Vision Language Model (VLM) for reasoning using GRPO and TRL? πŸŒ‹

πŸ§‘β€πŸ³ We've got you covered!!

NEW multimodal post training recipe to align a VLM using TRL in @HuggingFace 's Cookbook.

Go to the recipe πŸ‘‰https://huggingface.co/learn/cookbook/fine_tuning_vlm_grpo_trl

Powered by the latest TRL v0.20 release, this recipe shows how to teach Qwen2.5-VL-3B-Instruct to reason over images πŸŒ‹
sergiopaniegoΒ 
posted an update about 1 month ago
view post
Post
4515
Just included example scripts for aligning models using GSPO (including VLM example) πŸ™†β€β™‚οΈπŸ™†β€β™‚οΈ

GSPO is the latest RL alignment algo by @Alibaba_Qwen and it's already supported in the latest TRL v0.20 release.

Super-easy-to-get-started example scripts below, GO run them!πŸ‘©β€πŸ’»πŸ‘©β€πŸ’»

πŸ§‘β€πŸŽ¨ Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
πŸ¦„ VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
🧩 More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
πŸ§™β€β™‚οΈ GSPO paper: Group Sequence Policy Optimization (2507.18071)
sergiopaniegoΒ 
posted an update about 1 month ago
view post
Post
346
Did you miss this? πŸ‘“

πŸ§™β€β™‚οΈvLLM + transformers integration just got upgraded with direct VLM support.

Select a VLM + model_impl=transformers and play via vLLM!
sergiopaniegoΒ 
posted an update about 1 month ago
view post
Post
2687
We just released TRL v0.20 with major multimodal upgrades!

πŸ‘οΈ VLM support for GRPO (highly requested by the community!)
🎞️ New GSPO trainer (from @Qwen , released last week, VLM-ready)
πŸ™ New MPO trainer (multimodal by design, as in the paper)

πŸ“ Full release notes here: https://github.com/huggingface/trl/releases/tag/v0.20.0