ai4bharat/indic-parler-tts

Information

#7

by Sarvjeet001 - opened Dec 28, 2024

Hi, thanks for your model. I am using it to explore TTS, and I have some questions that need your guidance:

Why can your model generate at most 30 seconds of audio? Does this depend on the length of the audio clips in the training data? If so, can we increase a TTS model's output duration from 30 seconds to 3 minutes by training on clips that are minutes long rather than seconds long?
Is there any issue with the audio output if we somehow generate audio longer than 30 seconds?
Why does your model have a limit of 20 words for better results? What does it depend on, and how can we increase it?

Feb 12

Hi,

Here are some experiments on increasing the word limit: I used chunking plus batch generation to get past it.

https://github.com/slabstech/llm-recipes/blob/main/python/notebooklm/audiobook/utils/batch_inference_chunked.py

Usage in a server:
https://github.com/slabstech/parler-tts-server/blob/c83a89d7468610744eecab9f38fbaef691641efc/parler_tts_server/main.py#L191

Function call:
https://github.com/slabstech/llm-recipes/blob/ed08fa0301ceb8b936edf49f271975734f869438/python/notebooklm/tts_generator.py#L183
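For anyone who just wants the idea without reading the linked files, here is a minimal sketch of the chunking step only. It is not the code from the repositories above: `chunk_text` is a hypothetical helper, and the default of 20 words is simply taken from the limit mentioned in the question. Each returned chunk would then be passed to the model as one item of a batch, and the resulting audio segments joined in order.

```python
import re


def chunk_text(text: str, max_words: int = 20) -> list[str]:
    """Split text into chunks of at most max_words words.

    Hypothetical helper: chunks end at sentence boundaries where possible.
    """
    # Split after sentence-ending punctuation, including the Devanagari danda.
    sentences = [s for s in re.split(r"(?<=[.!?।])\s+", text.strip()) if s]

    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        words = sentence.split()
        # Flush the running chunk if adding this sentence would overflow it.
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(words) > max_words:
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = "One two three. Four five six seven. Eight."
    for chunk in chunk_text(sample, max_words=5):
        print(chunk)
```

Splitting at sentence boundaries is a design choice, not a requirement: it keeps each chunk prosodically complete, so the joins between audio segments should be less noticeable than with a hard cut every N words.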
