Hiring 💼

Quentin Gallouédec PRO

qgallouedec

huggingface

AI & ML interests

None yet

Recent Activity

upvoted an article 2 days ago

Transformers.js v4 Preview: Now Available on NPM!

upvoted an article 2 days ago

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

commented on an article 2 days ago

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

upvoted 2 articles 2 days ago

Article

Transformers.js v4 Preview: Now Available on NPM!

3 days ago • 48

Article

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

Feb 11, 2025 • 104

upvoted 3 papers 5 days ago

Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published 7 days ago • 30

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 31

Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

Paper • 2602.05261 • Published 7 days ago • 48

upvoted an article 12 days ago

Article

Preference Tuning LLMs with Direct Preference Optimization Methods

+3

Jan 18, 2024 • 78

upvoted a collection 14 days ago

AlphaGenome

Collection of AlphaGenome models. • 5 items • Updated 15 days ago • 32

upvoted an article 16 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

+2

Dec 1, 2025 • 296

upvoted a paper 22 days ago

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published 30 days ago • 150

upvoted a paper 23 days ago

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 90

upvoted a paper 25 days ago

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 18

upvoted a paper 26 days ago

Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 19

upvoted an article 27 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

Dec 15, 2025 • 107

upvoted a paper 28 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 224

upvoted a paper about 1 month ago

Hermes 4 Technical Report

Paper • 2508.18255 • Published Aug 25, 2025 • 44

upvoted 3 papers about 2 months ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 153

INTELLECT-3: Technical Report

Paper • 2512.16144 • Published Dec 18, 2025 • 20

WPO: Enhancing RLHF with Weighted Preference Optimization

Paper • 2406.11827 • Published Jun 17, 2024 • 17

upvoted 2 articles about 2 months ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

+4

Dec 18, 2025 • 119

Article

Shadow AI - Where are the CIOs?

Dec 19, 2025 • 31