ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published 6 days ago • 54
Pleias-RAG Collection New generation of small reasoning models for RAG, search, and source summarization. • 4 items • Updated 19 days ago • 26
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated 25 days ago • 191
Open-RS Collection Model weights & datasets in the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t" • 8 items • Updated Mar 21 • 12
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 28
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 28
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 8 days ago • 258
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 114
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 95
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Paper • 2407.04363 • Published Jul 5, 2024 • 34
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27, 2024 • 150
UDOP Collection UDOP is a general multimodal model for document AI • 4 items • Updated 12 days ago • 24
Latent Consistency Models LoRAs Collection Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 103