Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference • By mfuntowicz and 1 other • Jan 16
Deploy models on AWS Inferentia2 from Hugging Face • By philschmid and 1 other • May 22, 2024