ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents • Paper • 2507.22827 • Published 3 days ago • 73
BANG: Dividing 3D Assets via Generative Exploded Dynamics • Paper • 2507.21493 • Published 4 days ago • 54
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance • Paper • 2507.22448 • Published 3 days ago • 54
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation • Paper • 2507.22886 • Published 3 days ago • 8
Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision • Paper • 2507.20976 • Published 5 days ago • 10
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning • Paper • 2507.22607 • Published 3 days ago • 36