Xenova
posted an update 1 day ago
Introducing Kokoro.js, a new JavaScript library for running Kokoro TTS, an 82 million parameter text-to-speech model, 100% locally in the browser w/ WASM. Powered by 🤗 Transformers.js. WebGPU support coming soon!
👉 npm i kokoro-js 👈

Try it out yourself: webml-community/kokoro-web
Link to models/samples: onnx-community/Kokoro-82M-ONNX

You can get started in just a few lines of code!
import { KokoroTTS } from "kokoro-js";

const tts = await KokoroTTS.from_pretrained(
  "onnx-community/Kokoro-82M-ONNX",
  { dtype: "q8" }, // fp32, fp16, q8, q4, q4f16
);

const text = "Life is like a box of chocolates. You never know what you're gonna get.";
const audio = await tts.generate(text,
  { voice: "af_sky" }, // See `tts.list_voices()`
);
audio.save("audio.wav");

Huge kudos to the Kokoro TTS community, especially taylorchu for the ONNX exports and Hexgrad for the amazing project! None of this would be possible without you all! 🤗

The model is also extremely resilient to quantization. The smallest variant is only 86 MB in size (down from the original 326 MB), with no noticeable difference in audio quality! 🤯
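For a rough sense of where those sizes come from, here is a back-of-the-envelope sketch. It assumes file size is roughly parameter count times bytes per weight, with every weight stored at the same width; that is a simplification, not how the actual exports are laid out. The names `PARAMS`, `BYTES_PER_WEIGHT`, and `estimateSizeMB` are made up for illustration and are not part of kokoro-js.

```javascript
// Parameter count quoted in the post ("82 million parameter").
const PARAMS = 82e6;

// Assumed storage width per weight, in bytes (uniform across the whole model).
const BYTES_PER_WEIGHT = { fp32: 4, fp16: 2, int8: 1 };

// Hypothetical helper: estimated size in MB (1 MB = 1e6 bytes).
function estimateSizeMB(params, bytesPerWeight) {
  return (params * bytesPerWeight) / 1e6;
}

for (const [name, bytes] of Object.entries(BYTES_PER_WEIGHT)) {
  console.log(`${name}: ~${estimateSizeMB(PARAMS, bytes)} MB`);
}

// Reduction implied by the sizes quoted in the post (326 MB down to 86 MB).
const reportedRatio = 326 / 86;
console.log(`reported reduction: ~${reportedRatio.toFixed(1)}x`);
```

The fp32 estimate (about 328 MB) and the 8-bit estimate (about 82 MB) land close to the 326 MB and 86 MB figures above. Real files will differ somewhat, since they carry metadata and may keep some tensors at higher precision.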

Hi Xenova, I have a project with essentially the same goal but without using transformers.js or npm: https://github.com/Shubin123/kokorojs

The work done with transformers.js would have saved me a ton of work and headache and I will be using it in the future.

My phonemizer is still broken. Is it OK if I take some of your code/logic from https://github.com/hexgrad/kokoro/blob/main/kokoro.js/src/phonemize.js?

I will probably just leave my project where it's at currently with any minor bug fixes and model updates if available.
If there's no reply, I will respect that and just leave my phonemizer broken, because I don't want to pretend I didn't look at your version (my tokenizer is probably not up to par either).


Hey! Oh that's awesome - great work! Feel free to adapt any code/logic of mine as you'd like!
