Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation Paper • 2506.09991 • Published 8 days ago • 55
nvidia/Nemotron-Research-Reasoning-Qwen-1.5B Text Generation • Updated 14 days ago • 8.02k • 156
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published 20 days ago • 125
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published 22 days ago • 42
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published 27 days ago • 78
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly Paper • 2505.10610 • Published May 15 • 53