RealQA • Collection • Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model • 3 items • Updated Jun 3
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning Paper • 2505.14231 • Published May 20 • 51
Effective Probabilistic Time Series Forecasting with Fourier Adaptive Noise-Separated Diffusion Paper • 2505.11306 • Published May 16 • 1
Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning Paper • 2406.18254 • Published Jun 26, 2024 • 1
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning Paper • 2504.02546 • Published Apr 3 • 1