Submitted by fuvty 82 Cache-to-Cache: Direct Semantic Communication Between Large Language Models Tsinghua-NICS-EFC 34 5
Submitted by forde450 66 Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer inclusionAI 75 3
Submitted by taesiri 47 Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Alpha-VLLM 769 2
Submitted by dcml0714 34 SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models · 10 authors 2
Submitted by taesiri 32 MATRIX: Mask Track Alignment for Interaction-aware Video Generation · 8 authors 25 3
Submitted by zoeyuchao 31 RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training RLinf 501 2
Submitted by whyu 22 Artificial Hippocampus Networks for Efficient Long-Context Modeling ByteDance Seed 67 2
Submitted by amphora 22 Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought KO-REAson 2
Submitted by imsheriff 21 The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP University of California, Los Angeles 2
Submitted by huggingaaaaa 20 Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Tsinghua University 4 2
Submitted by kazemnejad 19 The Markovian Thinker Mila – Quebec Artificial Intelligence Institute 140 2
Submitted by tangzhy 19 CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling · 12 authors 2
Submitted by ZetangForward 18 Revisiting Long-context Modeling from Context Denoising Perspective Soochow University 2 3
Submitted by FSCCS 16 OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot Westlake University 30 2
Submitted by XinXuNLPer 13 When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation McAuley-Lab 3 2
Submitted by MingyuLiu 12 StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation Zhejiang University 3
Submitted by Chenfei-Liao 11 Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods · 13 authors 2
Submitted by taesiri 11 TTRV: Test-Time Reinforcement Learning for Vision Language Models · 10 authors 7 2
Submitted by JimmyMa99 9 Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs · 14 authors 14 2
Submitted by talzoomanzoo 6 Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces Yonsei University 2
Submitted by XuWuLingYu 5 WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation Peking University 2
Submitted by myownskyW7 5 G^2RPO: Granular GRPO for Precise Reward in Flow Models IXCLab@Shanghai AI Lab 27 2
Submitted by taesiri 4 AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning · 17 authors 7 2
Submitted by taesiri 3 U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking · 10 authors 61 2
Submitted by EddyLuo 3 Code Agent can be an End-to-end System Hacker: Benchmarking Real-world Threats of Computer-use Agent Momoka 2
Submitted by RajveeSheth 2 Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models Lingo Research Group 1 2
Submitted by Dragongon 2 FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering · 5 authors 2
Submitted by yasNing 2 DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents Didi Chuxing 2
Submitted by sam-motamed 1 TRAVL: A Recipe for Making Video-Language Models Better Judges of Physics Implausibility Institute for Computer Science, Artificial intelligence and Technology 2
Submitted by Dragongon 1 PuzzlePlex: Benchmarking Foundation Models on Reasoning and Planning with Puzzles · 9 authors 2
Submitted by Yanran21 1 D^3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection · 8 authors 5 2