Quantization command

#1
by utarn - opened

Would you mind sharing the command and guiding me on how to quantize this model to NVFP4?

This was quantized using NVIDIA ModelOpt:
python hf_ptq.py --pyt_ckpt_path <gpt-oss-120b> --qformat nvfp4 --export_path <gpt-oss-120b-nvfp4> --trust_remote_code

using the latest ModelOpt main branch: https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/llm_ptq
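
If you prefer calling ModelOpt from Python instead of the hf_ptq.py script, a minimal sketch of the same flow is below. This is an assumption about the equivalent API path, not the script's exact internals: the config name NVFP4_DEFAULT_CFG, the export_hf_checkpoint helper, and the toy calibration loop follow my reading of the ModelOpt docs and may differ across versions, so double-check against the repo you linked.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import modelopt.torch.quantization as mtq
from modelopt.torch.export import export_hf_checkpoint

model_path = "<gpt-oss-120b>"          # local checkpoint directory
export_path = "<gpt-oss-120b-nvfp4>"   # output directory for the quantized checkpoint

model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Toy calibration data; hf_ptq.py uses a real calibration dataset instead.
calib_texts = ["The quick brown fox jumps over the lazy dog."] * 16

def forward_loop(model):
    # Run a few forward passes so ModelOpt can collect calibration statistics.
    for text in calib_texts:
        inputs = tokenizer(text, return_tensors="pt").to(model.device)
        with torch.no_grad():
            model(**inputs)

# Apply NVFP4 post-training quantization, then export a HF-style checkpoint.
model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop)
export_hf_checkpoint(model, export_dir=export_path)
```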
