RLMs (Reasoning Language Models)
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
Paper
• 2503.00735
• Published • 23
START: Self-taught Reasoner with Tools
Paper
• 2503.04625
• Published • 113
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper
• 2503.05592
• Published • 27
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning
Paper
• 2503.05379
• Published • 38
21B • Updated • 149
• 391
• Updated • 269 • 413
• 47
Text Generation
• 33B • Updated • 58.9k
• 2.93k
Text Generation
• 8B • Updated • 279
• 185
Text Generation
• 33B • Updated • 77
• 156
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper
• 2503.16219
• Published • 52
predibase/Predibase-T2T-32B-RFT
33B • Updated • 10
• 20
agentica-org/DeepCoder-1.5B-Preview
Text Generation
• 2B • Updated • 87
• 75
agentica-org/DeepCoder-14B-Preview
Text Generation
• 15B • Updated • 401
• 681
Feature Extraction
• 8B • Updated • 2.56k
• 56
deepseek-ai/DeepSeek-R1-0528
Text Generation
• 685B • Updated • 6.15M
• 2.45k
nvidia/Nemotron-Research-Reasoning-Qwen-1.5B
Text Generation
• 2B • Updated • 3.31k
• 241
Video-Text-to-Text
• 9B • Updated • 19
• 23
mistralai/Magistral-Small-2506
24B • Updated • 50k
• 608
microsoft/Phi-4-mini-reasoning
Text Generation
• 4B • Updated • 56.8k
• 230
microsoft/Phi-4-mini-flash-reasoning
Text Generation
• 4B • Updated • 1.09k
• 278
microsoft/Phi-4-reasoning
Text Generation
• 15B • Updated • 9.25k
• 227
osmosis-ai/Osmosis-Apply-1.7B
Text Generation
• 2B • Updated • 57
• 96
33B • Updated • 13
• 193
numind/NuMarkdown-8B-Thinking
Image-to-Text
• 8B • Updated • 31.7k
• 472
moonshotai/Kimi-K2-Thinking
Text Generation
• 1.1T • Updated • 169k
• 1.7k
Text Generation
• 2B • Updated • 888
• 524
MaziyarPanahi/VibeThinker-1.5B-GGUF
Text Generation
• 2B • Updated • 356
• 36
ServiceNow-AI/Apriel-1.5-15b-Thinker
Image-Text-to-Text
• 15B • Updated • 279
• 469
Image-Text-to-Text
• 10B • Updated • 51.4k
• 608
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
Image-Text-to-Text
• 28B • Updated • 149k
• 2.87k