RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Paper • 2506.08672 • Published 5 days ago • 30
ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection Paper • 2505.16475 • Published 24 days ago • 2
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space Paper • 2505.13308 • Published 27 days ago • 26
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 170
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts Paper • 2503.22952 • Published Mar 29 • 18
openai/whisper-large-v3-turbo Automatic Speech Recognition • Updated Oct 4, 2024 • 3.41M • • 2.43k