Submitted by MiniMax-AI 206 MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention · 127 authors 5
Submitted by schrodingers-tiger 63 Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning · 27 authors 4
Submitted by Ayanami0730 48 DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents · 5 authors 3
Submitted by shulin16 39 Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning · 10 authors 2
Submitted by shuaishuaicdp 39 Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency · 6 authors 2
Submitted by zhendch 34 Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression · 8 authors 2
Submitted by rp-yu 29 Discrete Diffusion in Large Language and Multimodal Models: A Survey · 3 authors 2
Submitted by jingyq1 25 AR-RAG: Autoregressive Retrieval Augmentation for Image Generation · 4 authors 2
Submitted by zihanliu 18 AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy · 7 authors 4
Submitted by WTNswaggy 17 PersonaFeedback: A Large-scale Human-annotated Benchmark For Personalization · 6 authors 2
Submitted by IgnoraZ 14 From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding · 4 authors 2
Submitted by LPY 10 BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models · 9 authors 2
Submitted by iwiwi 6 ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering · 6 authors 2
Submitted by pranavAL2109 4 Supernova Event Dataset: Interpreting Large Language Model's Personality through Critical Event Analysis · 2 authors 2
Submitted by mkshing 2 DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion · 2 authors 1
Submitted by Franck-Dernoncourt 2 Forecasting Time Series with LLMs via Patch-Based Prompting and Decomposition · 10 authors 2
Submitted by Franck-Dernoncourt 2 MS4UI: A Dataset for Multi-modal Summarization of User Interface Instructional Videos · 8 authors 2
Submitted by zainmujahid 2 Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts · 4 authors 2
Submitted by Taegyeonglee 2 QGuard: Question-based Zero-shot Guard for Multi-modal LLM Safety · 5 authors 2
Submitted by liujch1998 2 Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index · 5 authors 2
Submitted by Owenngt 2 SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance · 8 authors 2
Submitted by PChemGuy - AI-Facilitated Analysis of Abstracts and Conclusions: Flagging Unsubstantiated Claims and Ambiguous Pronouns · 1 author 2
Submitted by ChristianAzinn - Personalizable Long-Context Symbolic Music Infilling with MIDI-RWKV · 2 authors 2