Submitted by kenchan0226 82 Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning · 19 authors 2
Submitted by q-rz 60 Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance · 5 authors 1
Submitted by wchengad 37 OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation · 9 authors 1
Submitted by bertjiazheng 30 SpatialLM: Training Large Language Models for Structured Indoor Modeling · 8 authors 1
Submitted by sc-bd 20 Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning · 71 authors 2
Submitted by GitBag 16 Pre-trained Large Language Models Learn Hidden Markov Models In-context · 5 authors 2
Submitted by cszy98 15 Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers · 7 authors 1
Submitted by hongyuw 13 BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation · 4 authors 1
Submitted by Hoter 12 GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition · 12 authors 1
Submitted by noystl 11 Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation · 5 authors 2
Submitted by RogerLos 10 Through the Valley: Path to Effective Long CoT Training for Small Language Models · 4 authors 1
Submitted by parshinsh 10 The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity · 6 authors 1
Submitted by ducdauge 10 Bootstrapping World Models from Dynamics Models in Multimodal Foundation Models · 5 authors 2
Submitted by ZacLiu 8 CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models · 9 authors 1
Submitted by craigwu 7 GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior · 6 authors 1
Submitted by songff 7 Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding · 7 authors 1
Submitted by Sichengmo 6 Dreamland: Controllable World Creation with Simulator and Generative Models · 6 authors 1
Submitted by MichaelR207 6 SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs · 6 authors 2
Submitted by JieRuan 6 ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists · 17 authors 2
Submitted by KaiserWhoLearns 5 What Is Seen Cannot Be Unseen: The Disruptive Effect of Knowledge Conflict on Large Language Models · 3 authors 1
Submitted by sabrieyuboglu 4 Cartridges: Lightweight and general-purpose long context representations via self-study · 11 authors 2
Submitted by xw-eric 4 Agents of Change: Self-Evolving LLM Agents for Strategic Planning · 6 authors 2
Submitted by Honghua 3 τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment · 5 authors 1
Submitted by shuoxing 3 SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems · 12 authors 1
Submitted by RoadQAQ 3 Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions · 11 authors 1
Submitted by lesleychou 3 NetPress: Dynamically Generated LLM Benchmarks for Network Applications · 7 authors 3
Submitted by BestWishYsh 2 PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement · 7 authors 1
Submitted by ItamarZ 2 Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs · 3 authors 1
Submitted by LibraTree 2 GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization · 7 authors 1
Submitted by ZZXF 2 MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories · 6 authors 2
Submitted by marinero4972 1 CyberV: Cybernetics for Test-time Scaling in Video Understanding · 7 authors 1
Submitted by michaelchenkj 1 Improving large language models with concept-aware fine-tuning · 4 authors 1
Submitted by mchraba 1 Evaluating LLMs Robustness in Less Resourced Languages with Proxy Models · 3 authors 1
Submitted by chargoddard 1 Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit · 2 authors 1
Submitted by xw-eric 1 Hidden in Plain Sight: Probing Implicit Reasoning in Multimodal Language Models · 7 authors 1
Submitted by lizhuang144 1 EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions · 9 authors 2
Submitted by aksgupta97 - Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering · 3 authors 1
Submitted by 594zyc - Proactive Assistant Dialogue Generation from Streaming Egocentric Videos · 8 authors 2