EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes Paper • 2507.11407 • Published 18 days ago • 51
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models By nvidia and 3 others • 14 days ago • 47
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation Paper • 2504.06225 • Published Apr 8 • 2
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 25 days ago • 602
ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention Paper • 2507.01004 • Published Jul 1 • 10
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published about 1 month ago • 57
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2 • 51
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30 • 84
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation Paper • 2507.02608 • Published 30 days ago • 21
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Paper • 2505.24298 • Published May 30 • 27
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis Paper • 2506.06276 • Published Jun 6 • 22
Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better Paper • 2506.09040 • Published Jun 10 • 35
pLSTM: parallelizable Linear Source Transition Mark networks Paper • 2506.11997 • Published Jun 13 • 10
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 208
CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models Paper • 2506.07463 • Published Jun 9 • 10
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12, 2024 • 68
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy Paper • 2506.13284 • Published Jun 16 • 24
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 260