Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 8 items • Updated 13 days ago • 116
view article Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub Feb 12 • 61
llama.vim Collection Recommended models for the llama.vim and llama.vscode plugins • 9 items • Updated Mar 12 • 28
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 113
Llama 3.2 3B & 1B GGUF Quants Collection Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated Sep 26, 2024 • 46