Add/update the quantized ONNX model files and README.md for Transformers.js v3

#4
Opened by whitphx (HF Staff)

Applied Quantizations

✅ Based on model.onnx with slimming

↳ ✅ fp16: model_fp16.onnx (added)
↳ ✅ int8: model_int8.onnx (added)
↳ ✅ uint8: model_uint8.onnx (added)
↳ ✅ q4: model_q4.onnx (added)
↳ ✅ q4f16: model_q4f16.onnx (added)
↳ ✅ bnb4: model_bnb4.onnx (added)
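The list above pairs each quantization label with the file it produced. As a minimal sketch of that naming convention, the lookup below maps a label to its file name. The helper `onnxFileFor` and the constant `QUANTIZED_FILES` are hypothetical names introduced for illustration; they are not part of Transformers.js or of this PR.

```typescript
// Quantization label -> ONNX file name, copied from the list in this PR.
// QUANTIZED_FILES is an illustrative name, not a library constant.
const QUANTIZED_FILES = {
  fp16: "model_fp16.onnx",
  int8: "model_int8.onnx",
  uint8: "model_uint8.onnx",
  q4: "model_q4.onnx",
  q4f16: "model_q4f16.onnx",
  bnb4: "model_bnb4.onnx",
} as const;

type QuantLabel = keyof typeof QUANTIZED_FILES;

// Hypothetical helper: returns the file added by this PR for a given label,
// and throws for labels that this PR did not add.
function onnxFileFor(label: string): string {
  if (!Object.prototype.hasOwnProperty.call(QUANTIZED_FILES, label)) {
    throw new Error(`No quantized file was added for label "${label}"`);
  }
  return QUANTIZED_FILES[label as QuantLabel];
}

console.log(onnxFileFor("q4f16"));
```

In Transformers.js v3, a variant like these is normally chosen through the `dtype` option when loading a model, for example `{ dtype: "q4" }` passed as the options argument to `pipeline()`; the library then resolves the matching file itself, so a lookup like the one above is only needed when handling the files directly.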

Xenova changed pull request status to merged
