How to run this model on 8xH200

by U2hhd24

Hi,

I have a few questions about serving this model on 8xH200:

  1. Does this FP4-quantized version support H200 GPUs?
  2. Is there any way to serve this model with a Docker image such as nvcr.io/nvidia/tritonserver:25.05-py3? (A rough sketch of what I have in mind is below.)
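For context, this is roughly the kind of launch I have in mind. It is only a sketch, assuming a Triton model repository for this model has already been prepared under ./models; the repository path and ports are illustrative, not taken from this model card:

```bash
# Hypothetical launch of the Triton container across all GPUs.
# Assumes ./models is an already-prepared Triton model repository.
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v "$(pwd)/models:/models" \
  nvcr.io/nvidia/tritonserver:25.05-py3 \
  tritonserver --model-repository=/models
```

(Ports 8000/8001/8002 are Triton's default HTTP, gRPC, and metrics endpoints.) What I'm unsure about is what the model repository for this FP4 checkpoint should contain.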

Thank you!
