Submitted by zlatamaria 111 Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA · 9 authors 4
Submitted by abhi1nandy2 32 Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs · 3 authors 1
Submitted by russwang 31 MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning · 13 authors 1
Submitted by Shunian 29 FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion · 8 authors 2
Submitted by chenguolin 28 PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers · 7 authors 2
Submitted by lss727 19 Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning · 9 authors 1
Submitted by thomagram 18 STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis · 10 authors 1
Submitted by dcml0714 14 Audio-Aware Large Language Models as Judges for Speaking Styles · 11 authors 3
Submitted by scott-yjyang 13 Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning · 11 authors 2
Submitted by zhwang01 8 CodeContests+: High-Quality Test Case Generation for Competitive Programming · 5 authors 1
Submitted by EmetTheGolum 8 Peer-Ranked Precision: Creating a Foundational Dataset for Fine-Tuning Vision Models from DataSeeds' Annotated Imagery · 4 authors 1
Submitted by cg1177 6 Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision · 8 authors 1
Submitted by Hoyard 5 3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model · 7 authors 1
Submitted by salman-abdullah 5 MIRIAD: Augmenting LLMs with millions of medical query-response pairs · 10 authors 1
Submitted by MauroC 5 Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data · 6 authors 2
Submitted by guineapig 5 HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization · 3 authors 1
Submitted by JohnCage 4 Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward · 8 authors 1
Submitted by benshi34 3 When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration · 6 authors 1
Submitted by sy1998 3 When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding · 10 authors 1
Submitted by neildlf 3 GuideX: Guided Synthetic Data Generation for Zero-Shot Information Extraction · 4 authors 2
Submitted by DhavalPatel 1 AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance · 8 authors 2