Submitted by zhihou · 46 upvotes · Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy · 11 authors · 2 comments
Submitted by akhaliq · 37 upvotes · Open Deep Search: Democratizing Search with Open-source Reasoning Agents · 12 authors · 3 comments
Submitted by KennyUTC · 32 upvotes · LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning? · 9 authors · 2 comments
Submitted by phillipinseoul · 22 upvotes · Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models · 4 authors · 3 comments
Submitted by msj9817 · 15 upvotes · GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers · 6 authors · 2 comments
Submitted by Awiny · 14 upvotes · BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation · 9 authors · 3 comments
Submitted by Concyclics · 10 upvotes · LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation · 7 authors · 2 comments
Submitted by yilunzhao · 9 upvotes · MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search · 4 authors · 2 comments
Submitted by aejion · 9 upvotes · AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset · 6 authors · 2 comments
Submitted by hahahawu · 7 upvotes · Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging · 10 authors · 3 comments
Submitted by Ningyu · 6 upvotes · ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems · 7 authors · 2 comments
Submitted by r0nn13 · 5 upvotes · Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image · 2 authors · 2 comments
Submitted by ya-mehdi · 5 upvotes · Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs · 8 authors · 2 comments
Submitted by Awiny · 4 upvotes · Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models · 5 authors · 2 comments
Submitted by akhaliq · 3 upvotes · Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals · 7 authors · 2 comments
Submitted by johanobandoc · 2 upvotes · Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training · 10 authors · 1 comment
Submitted by Jarvis1111 · 2 upvotes · UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis · 3 authors · 2 comments
Submitted by SteveZeyuZhang · 1 upvote · PathoHR: Breast Cancer Survival Prediction on High-Resolution Pathological Images · 10 authors · 2 comments
Submitted by aadarsh-ram · 1 upvote · RONA: Pragmatically Diverse Image Captioning with Coherence Relations · 3 authors · 2 comments