One Missing Piece for Open-Source Reasoning Models: A Dataset to Mitigate Cold-Starting Short CoT LLMs in RL Paper • 2506.02338 • Published 11 days ago • 4
Optimizing Safe and Aligned Language Generation: A Multi-Objective GRPO Approach Paper • 2503.21819 • Published Mar 26 • 1
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues Paper • 2506.00958 • Published 13 days ago • 20
Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation Paper • 2505.18842 • Published 20 days ago • 36
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents Paper • 2505.15277 • Published 24 days ago • 99
G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness Paper • 2505.05026 • Published May 8 • 15
VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms Paper • 2503.14427 • Published Mar 18 • 19
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding Paper • 2411.19527 • Published Nov 29, 2024 • 10
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation Paper • 2410.13232 • Published Oct 17, 2024 • 45