MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs Paper • 2508.05257 • Published 7 days ago • 8
view article Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • 6 days ago • 43
google/siglip-so400m-patch14-384 Zero-Shot Image Classification • 0.9B • Updated Sep 26, 2024 • 3.56M • 583