quantized model?

#26

by gestalt73 - opened May 16

May 16

Are there any options to request or generate an 8-bit or 4-bit quantized version of the model? This would be a hoot to use on Raspberry Pi 5 and Rockchip SBCs.

Thanks!

Alan

gestalt73

May 16

Actually I answered my own question... kinda

This library and ONNX file transcribe the sample wav file in less than half the time on an rk3588 sbc (1.6 seconds vs 3.7 seconds for 7 seconds of audio)

https://github.com/istupakov/onnx-asr
https://huggingface.co/istupakov/parakeet-tdt-0.6b-v2-onnx

I'd still like to find an 8bit or 4bit quantized version of the file but this parakeet model is fast either way. :-)

istupakov

May 16

I'd still like to find an 8bit or 4bit quantized version of the file but this parakeet model is fast either way. :-)

Hi @gestalt73 !
You can use the 8-bit quantized version with my library:

import onnx_asr
model = onnx_asr.load_model("nemo-parakeet-tdt-0.6b-v2", quantization="int8")
print(model.recognize("test.wav"))

gestalt73

May 16

Oh nice! Thanks! I missed that.

Updated stats on my rockchip rk3588 sbc with the 7 second sample:

nemo transcription: 3.7 seconds, 1.89x realtime
onnx_asr (16bit): 1.5 seconds, 4.66x realtime
onnx_asr (8bit): 0.9 seconds, 7.77x realtime
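
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the "x realtime" figure can be computed: seconds of audio divided by seconds of wall-clock time. The helper names (`wav_duration_seconds`, `realtime_factor`, `benchmark`) are made up for illustration and are not part of onnx_asr or NeMo. The loop at the end assumes the clip is exactly 7.0 seconds long, which the post only gives approximately.

```python
import time
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def realtime_factor(audio_seconds, elapsed_seconds):
    """Seconds of audio transcribed per second of wall-clock time."""
    return audio_seconds / elapsed_seconds


def benchmark(transcribe, path):
    """Time one call of transcribe(path); return (text, elapsed, factor).

    `transcribe` is any callable that takes a file path (hypothetical
    parameter, supplied by the caller).
    """
    start = time.perf_counter()
    text = transcribe(path)
    elapsed = time.perf_counter() - start
    return text, elapsed, realtime_factor(wav_duration_seconds(path), elapsed)


# Timings from the post; the 7.0 s clip length is an assumption.
posted = [("nemo", 3.7), ("onnx_asr (16bit)", 1.5), ("onnx_asr (8bit)", 0.9)]
for label, elapsed in posted:
    print(f"{label}: {realtime_factor(7.0, elapsed):.2f}x realtime")
```

With a model loaded as in the snippet above, something like `benchmark(model.recognize, "test.wav")` should give a comparable number. A single run includes warm-up cost, so averaging a few runs is probably fairer. With exactly 7.0 seconds the last two factors round to 4.67 and 7.78, slightly above the posted 4.66 and 7.77, so the sample is likely a touch shorter than 7 seconds or the posted values were truncated.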

gestalt73 changed discussion status to closed May 16

NullSense

28 days ago

https://github.com/NullSense/Parrator/

I made this simple tool. It runs as a daemon, and a keyboard shortcut starts and stops recording. Auto-pasting is supported and it's configurable.

Perhaps some of you will like it. I'm quite amazed at the speed of Parakeet.

Daemon: Transcription Stats - Chars: 256, Words: 47
Daemon: Attempting to auto-paste from clipboard in 0.5 seconds...
         Ensure a text field is active and focused!
Daemon: Paste simulated.

--- Performance Summary (Daemon Mode) ---
Total time (rec start to paste end): 15.092s
  Recording duration:                  14.136s
  VAD processing duration:             0.002s
  Audio processing after VAD:        0.003s
  ASR Transcription duration:          0.334s
  Clipboard & Paste duration:        0.583s
----------------------------------------

Some quick-and-dirty benchmarking, running on an RX 6750 XT under Windows 11.
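
To put the summary above in perspective, here is a rough sketch that derives a few numbers from it. The stage values are copied from the log; the dictionary keys and the `summarize` helper are made up for illustration and are not part of Parrator. It also assumes the ASR step saw the full recording, which may not hold if VAD trimmed silence first.

```python
# Stage timings in seconds, copied from the performance summary in the post.
stages = {
    "recording": 14.136,
    "vad": 0.002,
    "post_vad_audio": 0.003,
    "asr": 0.334,
    "clipboard_paste": 0.583,
}
total = 15.092  # "Total time (rec start to paste end)"


def summarize(stages, total):
    """Derive a few headline figures from the per-stage timings."""
    # Everything that happens once recording has stopped.
    processing = sum(v for k, v in stages.items() if k != "recording")
    return {
        # Assumes the whole recording was passed to the ASR step.
        "asr_realtime_factor": stages["recording"] / stages["asr"],
        "processing_after_recording": processing,
        # Time in the total that no listed stage accounts for.
        "unaccounted": total - sum(stages.values()),
    }


for name, value in summarize(stages, total).items():
    print(f"{name}: {value:.3f}")
```

Under those assumptions the transcription itself runs at roughly 42x realtime, and everything after recording stops adds up to about 0.92 seconds. Most of that is the clipboard and paste step, which appears to include the 0.5 second delay mentioned in the log.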
