Running 3.62k The Ultra-Scale Playbook 🌌 3.62k The ultimate guide to training LLM on large GPU Clusters
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025 • 300
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 147
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 67
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Paper • 2410.10814 • Published Oct 14, 2024 • 51
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20, 2024 • 63
E5-V: Universal Embeddings with Multimodal Large Language Models Paper • 2407.12580 • Published Jul 17, 2024 • 42