Reasoning with Exploration: An Entropy Perspective • Paper • 2506.14758 • Published about 1 month ago • 28
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling • Paper • 2506.08672 • Published Jun 10 • 31
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space • Paper • 2505.13308 • Published May 19 • 26
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts • Paper • 2503.22952 • Published Mar 29 • 18
DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning • Paper • 2306.09030 • Published Jun 15, 2023 • 1