Submitted by jt-zhang 110 SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Tsinghua University 67 4
Submitted by QbethQ 64 StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs · 7 authors 2
Submitted by DogNeverSleep 52 OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing · 12 authors 2
Submitted by MasterVito 47 Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR Tsinghua University 11 2
Submitted by DogNeverSleep 46 RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark · 26 authors 19 2
Submitted by Yuyang-z 40 SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer NVIDIA 4.56k 2
Submitted by Nicolas-BZRD 37 When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance When Does Reasoning Matter ? 3
Submitted by taesiri 29 GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts Zhejiang University 31 1
Submitted by haoranhe 28 Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards · 7 authors 20 1
Submitted by sienna223 28 EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling Beijing Academy of Artificial Intelligence 84 10
Submitted by zjuxhl 27 EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering Zhejiang University 75 2
Submitted by taesiri 23 Rolling Forcing: Autoregressive Long Video Diffusion in Real Time ARC Lab, Tencent PCG 144 3
Submitted by wenhu 23 VideoScore2: Think before You Score in Generative Video Evaluation TIGER-Lab 8 2
Submitted by wenhu 20 Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning TIGER-Lab 2
Submitted by Chuanyang-Jin 18 The Era of Real-World Human Interaction: RL from User Conversations AI at Meta 3
Submitted by XINLI1997 18 WirelessMathLM: Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement Learning · 7 authors 2
Submitted by weizechen 17 From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones · 10 authors 2
Submitted by Steven-Shaobo 16 Socratic-Zero: Bootstrapping Reasoning via Data-Free Agent Co-evolution alibaba-inc 13 1
Submitted by LiamLian0727 15 Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks Zhongguancun Academy 16 3
Submitted by wcy1122 14 MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech The Chinese University of Hong Kong 183 2
Submitted by Dingning 14 BRIDGE - Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation Shanghai AI Lab 108 1
Submitted by jaeikkim 14 MMPB: It's Time for Multi-Modal Personalization AI, Big Data, and System Laboratory 3 2
Submitted by zhangboguodong 13 Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning Renmin University of China 15 2
Submitted by changdae 13 Understanding Language Prior of LVLMs by Contrasting Chain-of-Embedding University of Wisconsin-Madison 6 2
Submitted by bys0318 12 SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression Z.ai 2
Submitted by xcjthu 12 InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation OpenBMB 2
Submitted by KunlunZhu 11 Where LLM Agents Fail and How They can Learn From Failures University of Illinois at Urbana-Champaign 23 2
Submitted by MatthieuZ 11 Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective HUAWEI Noah's Ark Lab 2
Submitted by limuloo1999 11 Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time · 4 authors 1
Submitted by li-qing 10 Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation Beijing Institute for General Artificial Intelligence 17 2
Submitted by sundrops 9 GRPO-MA: Multi-Answer Generation in GRPO for Stable and Efficient Chain-of-Thought Training · 5 authors 2
Submitted by haonan3 9 From Harm to Help: Turning Reasoning In-Context Demos into Assets for Reasoning LMs · 11 authors 2
Submitted by samuelyeh 9 LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals · 3 authors 2
Submitted by JY-Young 7 Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step Fudan University 17 1
Submitted by samuelyeh 7 Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment · 2 authors 2
Submitted by han-cai 6 DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space NVIDIA 2
Submitted by guolinke 6 Hyperspherical Latents Improve Continuous-Token Autoregressive Generation · 2 authors 62 2
Submitted by yczhuang 6 AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play · 10 authors 2
Submitted by XINLI1997 6 Local Success Does Not Compose: Benchmarking Large Language Models for Compositional Formal Verification · 5 authors 2
Submitted by SugerWu 6 MultiCrafter: High-Fidelity Multi-Subject Generation via Spatially Disentangled Attention and Identity-Aware Reinforcement Learning · 7 authors 2
Submitted by fushh7 5 LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning TongyiLab 2
Submitted by jmyang 5 Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization · 5 authors 1
Submitted by zli999 4 PixelCraft: A Multi-Agent System for High-Fidelity Visual Reasoning on Structured Images Microsoft Research 6 2
Submitted by Cauthyyy 4 Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models Adobe 1
Submitted by VsonicV 4 Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning · 7 authors 170 2
Submitted by weizhoudb 4 PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation Shanghai Jiao Tong University 5 2
Submitted by charleslwang 4 MathBode: Frequency-Domain Fingerprints of LLM Mathematical Reasoning Cognitive Metrology Lab 0 2
Submitted by HwanChang0106 4 ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents Chung-Ang University 0 2
Submitted by zhongwenxu 3 Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning Tencent 2
Submitted by taesiri 3 IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video? · 20 authors 3 1
Submitted by ZihaoZhu 3 AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models The Chinese University of Hong Kong, Shenzhen 2
Submitted by lin-tan 3 TENET: Leveraging Tests Beyond Validation for Code Generation Purdue ASSET Research Group | AI-Software Synergy 2
Submitted by HelenMao 3 UniMIC: Token-Based Multimodal Interactive Coding for Human-AI Collaboration Multimedia Intelligent Processing Group in Communication University of China 3
Submitted by SongzeLi 2 Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale OpenGVLab 4 1
Submitted by xjh19972 2 ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation · 5 authors 8 2
Submitted by robinzixuan 2 RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility Northwestern University 4 2
Submitted by liboaccn 2 REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Model · 8 authors 2
Submitted by compulsi0n 2 Combinatorial Creativity: A New Frontier in Generalization Abilities Spiral Works 2
Submitted by versae 1 BOE-XSUM: Extreme Summarization in Clear Language of Spanish Legal Decrees and Notifications BERTIN Project 2
Submitted by jtlicardo 1 BPMN Assistant: An LLM-Based Approach to Business Process Modeling · 3 authors 72 2
Submitted by s-jse 1 Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models Stanford Open Virtual Assistant Lab (OVAL) 1
Submitted by omidgh 1 ADAM: A Diverse Archive of Mankind for Evaluating and Enhancing LLMs in Biographical Reasoning · 5 authors 2
Submitted by Franck-Dernoncourt 1 The Photographer Eye: Teaching Multimodal Large Language Models to See and Critique like Photographers · 8 authors 1
Submitted by pranamanam - TR2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning for Discrete Diffusion Programmable Biology Group 2
Submitted by vaidehi99 - Generalized Correctness Models: Learning Calibrated and Model-Agnostic Correctness Predictors from Historical Patterns · 5 authors 2
Submitted by alemiaschi - Charting a Decade of Computational Linguistics in Italy: The CLiC-it Corpus · 8 authors 0 1
Submitted by dipta007 - Advancing Reference-free Evaluation of Video Captions with Factual Analysis · 3 authors 1