Hugging Face Generative AI Services (HUGS) documentation

Supported Hardware Providers

HUGS is optimized for a wide variety of ML inference accelerators, and support for additional accelerator families and providers will continue to grow.

NVIDIA GPUs

NVIDIA GPUs are widely used for machine learning and AI applications, offering high performance and specialized hardware for deep learning tasks. NVIDIA’s CUDA platform provides a robust ecosystem for GPU-accelerated computing.

Supported device(s):

  • NVIDIA A10G: 24GB GDDR6 memory, 9216 CUDA cores, 288 Tensor cores, 72 RT cores
  • NVIDIA L4: 24GB GDDR6 memory, 7168 CUDA cores, 224 Tensor cores, 56 RT cores
  • NVIDIA L40S: 48GB GDDR6 memory, 18176 CUDA cores, 568 Tensor cores, 142 RT cores
  • NVIDIA A100: 40/80GB HBM2e memory, 6912 CUDA cores, 432 Tensor cores, 108 SMs
  • NVIDIA H100: 80GB HBM3 memory, 14592 CUDA cores, 456 Tensor cores, 144 SMs
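When choosing among the devices above, GPU memory is usually the binding constraint for LLM inference. A rough rule of thumb: fp16/bf16 weights need about 2 bytes per parameter, plus headroom for the KV cache and activations. The sketch below is illustrative only (the `DEVICE_MEMORY_GB` table and the 20% overhead factor are assumptions, not HUGS requirements):

```python
# Rough sizing heuristic for single-GPU fp16 inference.
# Assumptions: ~2 bytes per parameter for fp16/bf16 weights, plus ~20%
# overhead for KV cache and activations. Memory figures match the list above.
DEVICE_MEMORY_GB = {
    "A10G": 24,
    "L4": 24,
    "L40S": 48,
    "A100": 80,  # using the 80GB variant
    "H100": 80,
}

def fits(params_billions: float, device: str, overhead: float = 1.2) -> bool:
    """Return True if a model's fp16 weights (plus overhead) fit on one device."""
    needed_gb = params_billions * 2 * overhead  # 2 bytes/param in fp16
    return needed_gb <= DEVICE_MEMORY_GB[device]

# An 8B-parameter model fits on a 24GB A10G (~19.2GB needed);
# a 70B-parameter model does not fit on a single 80GB H100 (~168GB needed).
```

For models that exceed a single device, weights are typically sharded across multiple GPUs, which is why the larger-memory devices (L40S, A100, H100) are preferred for bigger models.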

AMD GPUs

AMD GPUs provide strong competition in the AI and machine learning space, offering high-performance computing capabilities with their CDNA architecture. AMD’s ROCm (Radeon Open Compute) platform enables GPU-accelerated computing on Linux systems.

Supported device(s):

  • AMD Instinct MI300X: 192GB HBM3 memory, 304 Compute Units, 4864 AI Accelerators

AWS Accelerators (Inferentia/Trainium)

Coming soon

Google TPUs

Coming soon
