Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family Paper • 2504.18225 • Published 14 days ago • 12
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published 10 days ago • 90
Atropos Artifacts Collection A collection of experimental artifacts created with Atropos, Nous' RL Environments framework - https://github.com/NousResearch/Atropos • 8 items • Updated 4 days ago • 6
Pleias-RAG Collection New generation of small reasoning models for RAG, search, and source summarization. • 4 items • Updated 15 days ago • 26
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users Paper • 2504.10157 • Published 25 days ago • 17
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing Paper • 2504.07964 • Published 29 days ago • 61
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated 21 days ago • 187
RLVR Collection Model and data for 'Expanding RL with Verifiable Rewards Across Diverse Domains' • 3 items • Updated Mar 31 • 11
Hamanasu Collection A brand-new series of models from yours truly, designed for intelligence, creativity, and roleplay - r/LocalLLaMA keeps DELETING MY GODDAMN COMMENTS • 31 items • Updated 1 day ago • 8
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13 • 28
DeepHermes Collection Preview models of hybrid reasoner Hermes series • 6 items • Updated Mar 13 • 35
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 410
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published Mar 7 • 78