Article: Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints • By sergeipetrov and 3 others • May 1, 2024 • 80
Collection: LLaMA3-Quantization • This is the official quantized models collection of “How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study” • 9 items • Updated Apr 23, 2024 • 4