mistralai/Mistral-Small-3.1-24B-Instruct-2503 Image-Text-to-Text • Updated 4 days ago • 97.9k • 1.21k
Running 2.58k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU Clusters
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated Apr 3 • 147
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10 • 78