RLMs (Reasoning Language Models) - a sugatoray Collection

updated Mar 25

LADDER: Self-Improving LLMs Through Recursive Problem Decomposition

Paper • 2503.00735 • Published Mar 2, 2025 • 23
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6, 2025 • 113
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Paper • 2503.05592 • Published Mar 7, 2025 • 27
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning

Paper • 2503.05379 • Published Mar 7, 2025 • 38
RekaAI/reka-flash-3

21B • Updated Mar 13, 2025 • 149 • 391
RekaAI/VibeEval

Viewer • Updated Dec 12, 2024 • 269 • 413 • 47
Qwen/QwQ-32B

Text Generation • 33B • Updated Mar 11, 2025 • 58.9k • 2.93k
open-r1/OlympicCoder-7B

Text Generation • 8B • Updated Mar 31 • 279 • 185
open-r1/OlympicCoder-32B

Text Generation • 33B • Updated Mar 31 • 77 • 156
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published Mar 20, 2025 • 52
predibase/Predibase-T2T-32B-RFT

33B • Updated Mar 19, 2025 • 10 • 20
agentica-org/DeepCoder-1.5B-Preview

Text Generation • 2B • Updated Apr 9, 2025 • 87 • 75
agentica-org/DeepCoder-14B-Preview

Text Generation • 15B • Updated May 11, 2025 • 401 • 681
reasonir/ReasonIR-8B

Feature Extraction • 8B • Updated May 13, 2025 • 2.56k • 56
deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29, 2025 • 6.15M • 2.45k
nvidia/Nemotron-Research-Reasoning-Qwen-1.5B

Text Generation • 2B • Updated Nov 21, 2025 • 3.31k • 241
Haoz0206/Omni-R1

Video-Text-to-Text • 9B • Updated May 28, 2025 • 19 • 23
mistralai/Magistral-Small-2506

24B • Updated Jul 28, 2025 • 50k • 608
microsoft/Phi-4-mini-reasoning

Text Generation • 4B • Updated Dec 10, 2025 • 56.8k • 230
microsoft/Phi-4-mini-flash-reasoning

Text Generation • 4B • Updated Dec 10, 2025 • 1.09k • 278
microsoft/Phi-4-reasoning

Text Generation • 15B • Updated Nov 24, 2025 • 9.25k • 227
osmosis-ai/Osmosis-Apply-1.7B

Text Generation • 2B • Updated Jul 3, 2025 • 57 • 96
MetaStoneTec/XBai-o4

33B • Updated Aug 6, 2025 • 13 • 193
numind/NuMarkdown-8B-Thinking

Image-to-Text • 8B • Updated about 3 hours ago • 31.7k • 472
moonshotai/Kimi-K2-Thinking

Text Generation • 1.1T • Updated Jan 30 • 169k • 1.7k
WeiboAI/VibeThinker-1.5B

Text Generation • 2B • Updated Nov 24, 2025 • 888 • 524
MaziyarPanahi/VibeThinker-1.5B-GGUF

Text Generation • 2B • Updated Nov 20, 2025 • 356 • 36
ServiceNow-AI/Apriel-1.5-15b-Thinker

Image-Text-to-Text • 15B • Updated Oct 6, 2025 • 279 • 469
zai-org/GLM-4.6V-Flash

Image-Text-to-Text • 10B • Updated Dec 9, 2025 • 51.4k • 608
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Image-Text-to-Text • 28B • Updated Apr 6 • 149k • 2.87k