MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports Paper • 2505.11733 • Published 23 days ago • 6
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning Paper • 2505.12081 • Published 22 days ago • 17
Faster Video Diffusion with Trainable Sparse Attention Paper • 2505.13389 • Published 20 days ago • 35
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published 23 days ago • 57
InstanceGen: Image Generation with Instance-level Instructions Paper • 2505.05678 • Published May 8 • 7
MatTools: Benchmarking Large Language Models for Materials Science Tools Paper • 2505.10852 • Published 24 days ago • 6
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking Paper • 2505.08581 • Published 26 days ago • 9
QuXAI: Explainers for Hybrid Quantum Machine Learning Models Paper • 2505.10167 • Published 25 days ago • 7
AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection Paper • 2505.09926 • Published 25 days ago • 6
Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt Paper • 2505.09264 • Published 26 days ago • 5
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning Paper • 2505.10320 • Published 24 days ago • 22
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning Paper • 2505.08617 • Published 26 days ago • 41
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis Paper • 2505.09358 • Published 25 days ago • 24
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published 25 days ago • 93
Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines Paper • 2504.21475 • Published Apr 30 • 7
Fast Text-to-Audio Generation with Adversarial Post-Training Paper • 2505.08175 • Published 27 days ago • 22
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder Paper • 2505.07916 • Published 27 days ago • 124