view article Article How I Built 7 Custom Gradio Components in Just 12 Days! By elismasilva • 2 days ago • 5
view article Article Vision Language Model Alignment in TRL ⚡️ By sergiopaniego and 4 others • 7 days ago • 44
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published 23 days ago • 60
view article Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • 16 days ago • 152
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intellegence • 2 items • Updated Jul 12 • 115
view article Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment By NormalUhr • Feb 11 • 58
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 57
view article Article Illustrating Reinforcement Learning from Human Feedback (RLHF) By natolambert and 3 others • Dec 9, 2022 • 316
view article Article Multi-Label Classification Model From Scratch: Step-by-Step Tutorial By Valerii-Knowledgator • Jan 8, 2024 • 46
view article Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! By andito and 2 others • Jan 23 • 182
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5 • 281
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 40 items • Updated Jun 23 • 119
PEFT papers Collection A collection of methods that have been implemented in the 🤗 PEFT library • 12 items • Updated Jan 30, 2024 • 29
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Paper • 2409.02095 • Published Sep 3, 2024 • 37