The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published 29 days ago • 122
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published about 1 month ago • 70
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 459
Qwen2.5 VL 3B Brainrot LoRA 💬 Demo Qwen2.5-vl-3b-Instruct finetune with brainrot dataset