Submitted by akhaliq · 50 · Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs · 14 authors · 11
Submitted by ArthurDouillard · 25 · Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch · 14 authors · 5
Submitted by lindsay-qu · 19 · MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding · 9 authors · 2
Submitted by davanstrien · 17 · WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training · 2 authors · 4
Submitted by WeiChow · 17 · PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding · 6 authors · 3
Submitted by Yuyang-z · 16 · SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer · 13 authors · 2
Submitted by oaishi · 6 · CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation · 7 authors · 2