f5-tts-small model gives output in 1.5 s; I need it in under 200-300 ms. How to proceed?
#16, opened by banank1989
The f5-tts-small model currently takes about 1.5 seconds to produce output. I need it in under 200-300 ms. How should I proceed?
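Try torch.compile() first; note that the first call after compiling is slow, so measure latency only after a few warm-up runs. Also check out EPSS from the recent F5 updates, which is aimed at cutting the number of sampling steps. Beyond that, quantization and running inference on a faster GPU will help.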
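To check whether any of these changes actually gets you under the 200-300 ms target, it helps to time inference the same way before and after each change. Below is a minimal sketch, not F5-TTS code: `measure_latency_ms`, `meets_budget`, and `benchmark_compiled_module` are hypothetical helper names, and the tiny `torch.nn.Sequential` module is only a stand-in for the real model. The only PyTorch call assumed is `torch.compile` (PyTorch 2.x).

```python
import statistics
import time


def measure_latency_ms(fn, warmup=3, iters=10):
    """Return the median wall-clock latency of fn() in milliseconds."""
    # Warm-up calls absorb one-time costs (for example torch.compile
    # compilation on the first call) so they do not skew the result.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def meets_budget(latency_ms, budget_ms=300.0):
    """True if the measured latency fits the target budget."""
    return latency_ms <= budget_ms


def benchmark_compiled_module():
    """Hypothetical demo: times a tiny stand-in module, not F5-TTS.

    Requires PyTorch 2.x. On a GPU, call torch.cuda.synchronize()
    inside run() after the forward pass, otherwise the timer only
    measures kernel launch time.
    """
    import torch

    model = torch.nn.Sequential(
        torch.nn.Linear(64, 64), torch.nn.GELU()
    ).eval()
    compiled = torch.compile(model)
    x = torch.randn(8, 64)

    def run():
        with torch.inference_mode():
            compiled(x)

    return measure_latency_ms(run)
```

For the real model, you would pass a function that runs one full synthesis call to `measure_latency_ms` and compare the result before and after compiling, reducing steps, or quantizing.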