How to deploy and fine-tune DeepSeek models on AWS By pagezyhf and 2 others • Jan 30 • 52
Memory-efficient Diffusion Transformers with Quanto and Diffusers By sayakpaul and 1 other • Jul 30, 2024 • 64
quanto: a pytorch quantization toolkit By dacorvo and 2 others • Mar 18, 2024 • 35
Hugging Face Text Generation Inference available for AWS Inferentia2 By philschmid and 1 other • Feb 1, 2024 • 5
Make your llama generation time fly with AWS Inferentia2 By dacorvo • Nov 7, 2023 • 1