How to run this model on 8xH200

by U2hhd24

Hi,

I have a few questions about serving this model on 8xH200:

  1. Does this FP4-quantized version support H200 GPUs?
  2. Is there any way to serve this model with a Docker image such as nvcr.io/nvidia/tritonserver:25.05-py3? (A rough sketch of what I have in mind is below.)
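For context, this is roughly the kind of launch I have in mind. It is only a sketch, assuming a Triton model repository for this model has already been prepared under ./models; the repository path and ports are illustrative, not taken from this model card:

```bash
# Hypothetical launch of the Triton container across all GPUs.
# Assumes ./models is an already-prepared Triton model repository.
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v "$(pwd)/models:/models" \
  nvcr.io/nvidia/tritonserver:25.05-py3 \
  tritonserver --model-repository=/models
```

(Ports 8000/8001/8002 are Triton's default HTTP, gRPC, and metrics endpoints.) What I'm unsure about is what the model repository for this FP4 checkpoint should contain.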

Thank you!
