Welcome FalconMamba: The first strong attention-free 7B model By JingweiZuo and 5 others • Aug 12, 2024 • 110
Welcome Llama 3 - Meta's new open LLM By philschmid and 4 others • Apr 18, 2024 • 286
GaLore: Advancing Large Model Training on Consumer-grade Hardware By Titus-von-Koeller and 8 others • Mar 20, 2024 • 28
quanto: a pytorch quantization toolkit By dacorvo and 2 others • Mar 18, 2024 • 35
Fine-Tuning Gemma Models in Hugging Face By svaibhav and 3 others • Feb 23, 2024 • 29
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face By lewtun and 6 others • Dec 11, 2023 • 12
Overview of natively supported quantization schemes in 🤗 Transformers By ybelkada and 4 others • Sep 12, 2023 • 12
Making LLMs lighter with AutoGPTQ and transformers By marcsun13 and 5 others • Aug 23, 2023 • 45
The Falcon has landed in the Hugging Face ecosystem By lvwerra and 7 others • Jun 5, 2023 • 13
Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA By ybelkada and 4 others • May 24, 2023 • 130
Introducing RWKV — An RNN with the advantages of a transformer By BlinkDL and 3 others • May 15, 2023 • 18
StackLLaMA: A hands-on guide to train LLaMA with RLHF By edbeeching and 6 others • Apr 5, 2023 • 33
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU By edbeeching and 5 others • Mar 9, 2023 • 44
A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes By ybelkada and 1 other • Aug 17, 2022 • 81