How to stream
The https://huggingface.co/spaces/hexgrad/Kokoro-TTS page has a stream option. It appears to run inference on the very first part of the text and, while that is playing, move on to the next part, then somehow append the new audio to the same player (the total time shown on the player increases as it goes). As a result, you can start listening almost immediately and seamlessly, even with very long texts.
Is there a way to do this locally? Can I use Python with Kokoro to start streaming audio before the whole thing has been generated?
My use case is that I'd like to build an add-on for Audiobookshelf that lets you create an audiobook from an ebook and starts streaming it right away, while the audiobook is still being created. Audiobookshelf runs on a server but has a mobile app, where you can read ebooks or listen to audiobooks served from that server. It would be nice to be able to create an audiobook and not have to wait for the whole thing to finish before starting to listen.
Yes, of course. Python supports multiple threads, so you can have one thread responsible for sentence-by-sentence speech generation and another that plays each piece of speech back once the previous sentence has finished playing. The generator thread adds each sentence's audio to the end of a queue, while the playback thread takes it off the front of the queue. Even a CPU like the Ryzen 5 5600X does the TTS fast enough to keep up with normal narration speeds.
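Here is a minimal sketch of that queue-based arrangement using only the standard library. It is not tied to any particular Kokoro wrapper: fake_tts and stream_speech are hypothetical names made up for this example, and fake_tts just returns bytes so the sketch runs without a model installed. In a real setup you would swap in whatever call your Kokoro implementation exposes for synthesis, and pass a function that writes to your audio device (or HTTP response) as play.

```python
import queue
import threading


def fake_tts(sentence):
    # Hypothetical stand-in for a real TTS call (e.g. Kokoro).
    # Returns fake "audio" bytes so the example runs anywhere.
    return sentence.encode("utf-8")


def stream_speech(sentences, synthesize, play, max_buffered=4):
    """Generate audio in a background thread while playing it in this one."""
    # Bounded queue: the generator pauses if it gets too far ahead of playback.
    audio_queue = queue.Queue(maxsize=max_buffered)
    done = object()  # sentinel marking the end of the stream

    def producer():
        try:
            for sentence in sentences:
                audio_queue.put(synthesize(sentence))
        finally:
            # Always signal the end, even if synthesis fails part-way.
            audio_queue.put(done)

    # daemon=True so a crash in play() cannot leave the program hanging.
    threading.Thread(target=producer, daemon=True).start()

    # Consumer: take chunks off the front of the queue in order.
    while True:
        chunk = audio_queue.get()  # blocks until the next chunk is ready
        if chunk is done:
            break
        play(chunk)


# Usage: collect the "played" chunks in a list instead of sending them
# to a sound card.
played = []
stream_speech(
    ["First sentence.", "Second sentence.", "Third."],
    fake_tts,
    played.append,
)
```

Playback of the first sentence can begin as soon as it has been synthesized, and because there is a single queue the chunks come out in the order they went in. For the Audiobookshelf case, the same loop could write each chunk to a file or a streaming response instead of a local player.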
There are plenty of local implementations of Kokoro TTS on GitHub to get you started.