Quantization command
#1 opened by utarn
Would you mind sharing and guiding me on how to quantize this model to NVFP4?
This is quantized using NVIDIA ModelOpt:

python hf_ptq.py --pyt_ckpt_path <gpt-oss-120b> --qformat nvfp4 --export_path <gpt-oss-120b-nvfp4> --trust_remote_code

with the latest ModelOpt main branch: https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/llm_ptq
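For reference, the CLI above roughly corresponds to ModelOpt's Python API. The sketch below is a minimal, hedged outline of that flow (calibration-based PTQ to NVFP4 followed by a Hugging Face checkpoint export); the config name `NVFP4_DEFAULT_CFG`, the `export_hf_checkpoint` helper, and the toy calibration loop are assumptions based on current ModelOpt examples, not the exact code path of `hf_ptq.py`, which adds a proper calibration dataset and many more options.

```python
# Minimal sketch of NVFP4 PTQ with NVIDIA ModelOpt's Python API (assumed flow).
# Assumes a recent modelopt release that exposes NVFP4_DEFAULT_CFG and the
# unified Hugging Face export helper; hf_ptq.py wraps a similar flow.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

import modelopt.torch.quantization as mtq
from modelopt.torch.export import export_hf_checkpoint

model_id = "<gpt-oss-120b>"           # local path or HF repo id (placeholder)
export_path = "<gpt-oss-120b-nvfp4>"  # output directory (placeholder)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

# Tiny stand-in calibration set; hf_ptq.py uses a real calibration dataset.
calib_texts = ["The quick brown fox jumps over the lazy dog."] * 8

def forward_loop(m):
    # Run a few forward passes so ModelOpt can collect calibration statistics.
    for text in calib_texts:
        inputs = tokenizer(text, return_tensors="pt").to(m.device)
        with torch.no_grad():
            m(**inputs)

# Post-training quantization to NVFP4 using the default config.
model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop)

# Export a Hugging Face-style checkpoint containing the quantized weights.
export_hf_checkpoint(model, export_dir=export_path)
tokenizer.save_pretrained(export_path)
```

For a model of this size you would normally shard it across several GPUs and use a larger calibration set; the CLI command above handles those details for you.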