Quantization technique used for the all-MiniLM-L6-v2 quantized models
#100
opened by learnerX
Which quantization technique was used for the all-MiniLM-L6-v2 quantized models?
Hello!
It depends: we have (u)int8 quantized models using the arm64, avx2, avx512, and avx512_vnni quantization configurations defined here: https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/configuration#optimum.onnxruntime.AutoQuantizationConfig
And with OpenVINO, we use Static Quantization as described here: https://huggingface.co/docs/optimum/main/en/intel/openvino/optimization#static-quantization
- Tom Aarsen
Will quantization-aware training preserve the accuracy of the all-MiniLM-L6-v2 model if we apply that technique?