Quantization technique used for the all-MiniLM-L6-v2 quantized models

#100
by learnerX - opened

Which quantization technique was used for the all-MiniLM-L6-v2 quantized models?

Sentence Transformers org

Hello!

It depends. We have (u)int8 quantized models using the arm64, avx2, avx512, and avx512_vnni quantization configurations as defined here: https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/configuration#optimum.onnxruntime.AutoQuantizationConfig
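For reference, applying one of those configurations with Optimum's ONNX Runtime integration looks roughly like the sketch below. The model path, ONNX file name, and output directory are illustrative; adjust them to your setup, and pick the `AutoQuantizationConfig` factory (`arm64`, `avx2`, `avx512`, `avx512_vnni`) that matches your target hardware.

```python
# Sketch of (u)int8 dynamic quantization with Optimum + ONNX Runtime.
# Paths and save_dir are illustrative assumptions, not the exact commands
# used to produce the published models.
from optimum.onnxruntime import AutoQuantizationConfig, ORTQuantizer

# Pick the config for your target ISA; is_static=False means dynamic quantization.
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

quantizer = ORTQuantizer.from_pretrained(
    "sentence-transformers/all-MiniLM-L6-v2",  # repo with an exported ONNX model
    file_name="onnx/model.onnx",
)
quantizer.quantize(save_dir="all-MiniLM-L6-v2-quantized", quantization_config=qconfig)
```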
And with OpenVINO, we use static quantization as described here: https://huggingface.co/docs/optimum/main/en/intel/openvino/optimization#static-quantization
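To make the (u)int8 scheme concrete, here is a minimal sketch of the asymmetric uint8 affine mapping that such quantization applies to floating-point weights. The function names and example values are purely illustrative:

```python
# Minimal sketch of asymmetric uint8 affine quantization: the arithmetic
# underlying the (u)int8 configs above. Names and values are illustrative.

def quantize_uint8(values):
    """Map floats to uint8 via a scale and zero-point (asymmetric scheme)."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # the range must include 0.0
    scale = (hi - lo) / 255.0 or 1.0     # guard against a zero range
    zero_point = round(-lo / scale)      # uint8 value that represents 0.0
    q = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.42, 0.0, 0.17, 0.98]
q, scale, zp = quantize_uint8(weights)
approx = dequantize(q, scale, zp)
print(q)       # uint8 codes in [0, 255]
print(approx)  # each within one quantization step of the original
```

Dynamic quantization computes the scale and zero-point for activations at inference time, while static quantization (as in the OpenVINO flow) precomputes them from a calibration dataset.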

  • Tom Aarsen

If we apply quantization-aware training, can it preserve the accuracy of the all-MiniLM-L6-v2 model?
