Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation • Paper • 2502.14846 • Published 1 day ago • 9
MLGym: A New Framework and Benchmark for Advancing AI Research Agents • Paper • 2502.14499 • Published 2 days ago • 129
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO • Paper • 2502.14669 • Published 1 day ago • 5
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning • Paper • 2502.14768 • Published 1 day ago • 28
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC • Paper • 2502.14282 • Published 2 days ago • 12
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models • Paper • 2502.14834 • Published 1 day ago • 20
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper • 2502.14786 • Published 1 day ago • 84
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published 1 day ago • 84
Diverse Inference and Verification for Advanced Reasoning • Paper • 2502.09955 • Published 8 days ago • 16
STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning • Paper • 2502.10177 • Published 8 days ago • 5
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model • Paper • 2502.10248 • Published 8 days ago • 49
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models • Paper • 2502.09696 • Published 9 days ago • 36
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment • Paper • 2502.10391 • Published 8 days ago • 29
Retrieval-augmented Large Language Models for Financial Time Series Forecasting • Paper • 2502.05878 • Published 13 days ago • 38
A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods • Paper • 2502.01618 • Published 19 days ago • 9
Large Language Model Guided Self-Debugging Code Generation • Paper • 2502.02928 • Published 17 days ago • 11