Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • 16 days ago • 146
Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes By ybelkada and 1 other • Aug 17, 2022 • 94
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA By ybelkada and 4 others • May 24, 2023 • 153
Article Making LLMs lighter with AutoGPTQ and transformers By marcsun13 and 5 others • Aug 23, 2023 • 55
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 1 day ago • 163
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10, 2024 • 71
Article A Dive into Pretraining Strategies for Vision-Language Models By adirik and 1 other • Feb 3, 2023 • 69
Article CodeGemma - an official Google release for code LLMs By pcuenq and 5 others • Apr 9, 2024 • 101
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs Paper • 2403.20041 • Published Mar 29, 2024 • 35