Quantization command
#1 opened by utarn
Would you mind sharing and guiding me on how to quantize this model to NVFP4?
This is quantized using NVIDIA ModelOpt:

python hf_ptq.py --pyt_ckpt_path <gpt-oss-120b> --qformat nvfp4 --export_path <gpt-oss-120b-nvfp4> --trust_remote_code

with the latest ModelOpt main branch: https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/llm_ptq
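For reference, the CLI above roughly corresponds to ModelOpt's Python API. The sketch below is a minimal, hedged outline of that flow (calibration-based PTQ to NVFP4 followed by a Hugging Face checkpoint export); the config name `NVFP4_DEFAULT_CFG`, the `export_hf_checkpoint` helper, and the toy calibration loop are assumptions based on current ModelOpt examples, not the exact code path of `hf_ptq.py`, which adds a proper calibration dataset and many more options.

```python
# Minimal sketch of NVFP4 PTQ with NVIDIA ModelOpt's Python API (assumed flow).
# Assumes a recent modelopt release that exposes NVFP4_DEFAULT_CFG and the
# unified Hugging Face export helper; hf_ptq.py wraps a similar flow.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

import modelopt.torch.quantization as mtq
from modelopt.torch.export import export_hf_checkpoint

model_id = "<gpt-oss-120b>"           # local path or HF repo id (placeholder)
export_path = "<gpt-oss-120b-nvfp4>"  # output directory (placeholder)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

# Tiny stand-in calibration set; hf_ptq.py uses a real calibration dataset.
calib_texts = ["The quick brown fox jumps over the lazy dog."] * 8

def forward_loop(m):
    # Run a few forward passes so ModelOpt can collect calibration statistics.
    for text in calib_texts:
        inputs = tokenizer(text, return_tensors="pt").to(m.device)
        with torch.no_grad():
            m(**inputs)

# Post-training quantization to NVFP4 using the default config.
model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop)

# Export a Hugging Face-style checkpoint containing the quantized weights.
export_hf_checkpoint(model, export_dir=export_path)
tokenizer.save_pretrained(export_path)
```

For a model of this size you would normally shard it across several GPUs and use a larger calibration set; the CLI command above handles those details for you.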