onnx support

#2
by thewh1teagle - opened

Hi!
Thank you for the model, it's very useful.
I would like to convert this model to ONNX so I can build a lightweight library around it; that would also make it possible to quantize the model.
I successfully converted the model to ONNX, and now it looks like I only need the tokenizer.
I'm not sure how to implement it: I see some config files in this repo, but I'm still not sure which of them are needed, how to use them, or whether anything else is required.
I created the repo dicta-onnx.
Can you help with that? Thank you.
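For reference, a minimal export sketch along these lines, assuming the checkpoint id, that it loads with trust_remote_code as a BERT-style encoder, and that the first field of its output holds the logits (adapt the names to the real graph):

import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: this is the checkpoint behind this discussion.
model_id = "dicta-il/dictabert-large-char-menaked"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()

class ExportWrapper(torch.nn.Module):
    # Wrap the model so the traced graph returns a plain tensor
    # instead of a dict-like ModelOutput.
    def __init__(self, m):
        super().__init__()
        self.m = m
    def forward(self, input_ids, attention_mask):
        out = self.m(input_ids=input_ids, attention_mask=attention_mask)
        return out[0]  # assumption: first output field is the logits

tok = AutoTokenizer.from_pretrained(model_id)
dummy = tok("שלום", return_tensors="pt")

torch.onnx.export(
    ExportWrapper(model),
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "attention_mask": {0: "batch", 1: "seq"},
        "logits": {0: "batch", 1: "seq"},
    },
    opset_version=17,
)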

thewh1teagle changed discussion title from onnx inference support to onnx support
DICTA: The Israel Center for Text Analysis org

Love the idea! Can you use the HF library?

If so, it's as simple as:

from transformers import BertTokenizerFast

tok = BertTokenizerFast(tokenizer_file="tokenizer.json")
inputs = tok(texts, return_tensors="np")

I'll try it later and see if I can use this as the model inputs.
How can I decode the outputs?
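A minimal inference-and-decode sketch, assuming the exported graph takes input_ids/attention_mask and emits per-token class logits; mapping the predicted ids back to actual diacritic marks is model-specific:

import onnxruntime as ort
from transformers import BertTokenizerFast

tok = BertTokenizerFast(tokenizer_file="tokenizer.json")
sess = ort.InferenceSession("model.onnx")

inputs = tok(["שלום עולם"], return_tensors="np")
logits = sess.run(
    None,
    {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]},
)[0]

# Highest-scoring class per token position; translating these ids into
# diacritics depends on the model's label set.
pred_ids = logits.argmax(axis=-1)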

DICTA: The Israel Center for Text Analysis org

I successfully converted the model to ONNX and published a new Python library. It's faster, and I even created a quantized model (int8) whose weights are only ~300MB!
See https://github.com/thewh1teagle/dicta-onnx
Thank you!
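A dynamic int8 quantization like this can be done with onnxruntime's quantization tool; a minimal sketch (the file names are assumptions):

from onnxruntime.quantization import QuantType, quantize_dynamic

# Quantize the exported graph's weights to int8 in place of float32,
# which is what shrinks the file to a fraction of its original size.
quantize_dynamic(
    model_input="model.onnx",
    model_output="model.int8.onnx",
    weight_type=QuantType.QInt8,
)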

DICTA: The Israel Center for Text Analysis org

Very nice! We'll add it to the model card shortly!

I also created an HF Space that adds diacritics to text using the quantized model:
https://huggingface.co/spaces/thewh1teagle/add-diacritics-in-hebrew

DICTA: The Israel Center for Text Analysis org
Shaltiel changed discussion status to closed
