I made maya1 faster and made it output 48 kHz audio instead of 24 kHz
I used lmdeploy to run maya1 and it's really fast now. For long paragraphs, it can generate almost 50 seconds of speech in 1 second. I also used a custom upsampler to upsample maya1's 24 kHz output to 48 kHz. Hope this helps anyone who wants a faster maya1.
Any ideas on how to improve inference speed when running the GGUF version of the model on device?
Are you using a CPU? If so, GGUF with quantization is about the fastest you can get. This is a big model, so it won’t be very fast on CPU. If you are using a GPU, the FastMaya repo should be the fastest.
50 sec of speech in 1 sec, wow. How can I test that?
@rahul7star
This is specifically for large paragraphs; the speedup isn't as great for single sentences (although there still is one). I will improve that considerably with AWQ quantization for maya1.
You can test it by installing the requirements and loading the model. The code is here: https://github.com/ysharma3501/FastMaya
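If you want a number to compare against, here is a minimal sketch of how the "seconds of speech per second" figure can be measured. It is not code from the FastMaya repo: `realtime_factor`, `measure`, and the `generate` callable are hypothetical names for illustration, and `generate` is assumed to return a flat sequence of audio samples at the given sample rate.

```python
import time

SAMPLE_RATE = 48_000  # Hz; assumes the upsampled 48 kHz output


def realtime_factor(num_samples, sample_rate, elapsed_seconds):
    """Seconds of audio produced per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    audio_seconds = num_samples / sample_rate
    return audio_seconds / elapsed_seconds


def measure(generate, text, sample_rate=SAMPLE_RATE):
    """Time one call to generate(text) and return its realtime factor.

    `generate` is a hypothetical stand-in for whatever function
    produces the audio; it must return a sequence of samples.
    """
    start = time.perf_counter()
    samples = generate(text)
    elapsed = time.perf_counter() - start
    return realtime_factor(len(samples), sample_rate, elapsed)


# 50 seconds of 48 kHz audio generated in 1 second of wall-clock time
print(realtime_factor(50 * SAMPLE_RATE, SAMPLE_RATE, 1.0))  # 50.0
```

A value above 1 means faster than realtime. Run it once on a long paragraph and once on a single sentence to see the difference, and do a warm-up call first so model loading isn't counted.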
The FastMaya repo should be faster, I believe. ZeroGPU adds some latency, so it might not be a fair comparison, but for single sentences FastMaya was roughly 1.8x realtime on an A100, while your space seems to be 0.6x realtime. Does your space use a custom model? My repo doesn't support that right now, but I can add it if you want.
amazing @YatharthS