VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning Paper • 2504.06958 • Published 7 days ago • 9
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment Paper • 2412.19326 • Published Dec 26, 2024 • 18