Add/update the quantized ONNX model files and README.md for Transformers.js v3

by whitphx HF Staff - opened about 21 hours ago

base: refs/heads/main ← from: refs/pr/4

Files changed: +34 -0

Add/update the quantized ONNX model files and README.md for Transformers.js v3 473306a0

whitphx

about 21 hours ago

Applied Quantizations

✅ Based on `model.onnx` with slimming

↳ ✅ fp16: model_fp16.onnx (added)
↳ ✅ int8: model_int8.onnx (added)
↳ ✅ uint8: model_uint8.onnx (added)
↳ ✅ q4: model_q4.onnx (added)
↳ ✅ q4f16: model_q4f16.onnx (added)
↳ ✅ bnb4: model_bnb4.onnx (added)
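
The file names above follow a `model_<dtype>.onnx` pattern. A minimal sketch of that mapping, assuming the pattern holds for every variant listed in this PR; `onnxFileName` is a hypothetical helper written for illustration, not part of Transformers.js:

```typescript
// Quantization variants listed in this PR (the unsuffixed model.onnx is the base file).
const QUANTIZED_DTYPES = ["fp16", "int8", "uint8", "q4", "q4f16", "bnb4"] as const;
type QuantizedDtype = (typeof QUANTIZED_DTYPES)[number];

// Hypothetical helper: builds the ONNX file name for a variant,
// following the naming pattern visible in the list above.
function onnxFileName(dtype: QuantizedDtype, base: string = "model"): string {
  return `${base}_${dtype}.onnx`;
}

// Print the file name for every variant added by this PR.
const files: string[] = QUANTIZED_DTYPES.map((d) => onnxFileName(d));
console.log(files.join("\n"));
```

In Transformers.js v3 a variant is normally chosen by passing a `dtype` option (for example `{ dtype: "q4" }`) when creating a pipeline or loading a model, rather than by naming the file directly; the task and model ID to use are those of this repository and are not shown here.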

Upload README.md with huggingface_hub 26c30874

Xenova changed pull request status to merged about 7 hours ago
