Testing
Python2231
AI & ML interests
None yet
Recent Activity
Reacted to hexgrad's post with 👍 about 9 hours ago
I wrote an article about G2P: https://hf.co/blog/hexgrad/g2p G2P is an underrated piece of small TTS models, like offensive linemen who do a bunch of work and get no credit. Instead of relying on explicit G2P, larger speech models implicitly learn this task by eating many thousands of hours of audio data. They often use a 500M+ parameter LLM at the front to predict latent audio tokens over a learned codebook, then decode these tokens into audio. Kokoro instead relies on G2P preprocessing, is 82M parameters, and thus needs less audio to learn. Because of this, we can cherrypick high fidelity audio for training data, and deliver solid speech for those voices. In turn, this excellent audio quality & lack of background noise helps explain why Kokoro is very competitive in single-voice TTS Arenas.
Liked a Space about 9 hours ago: Mistral-AI-Game-Jam/description-improv
Reacted to victor's post with ❤️ 2 days ago
Hey everyone, we've given https://hf.co/spaces page a fresh update! Smart Search: Now just type what you want to do—like "make a viral meme" or "generate music"—and our search gets it. New Categories: Check out the cool new filter bar with icons to help you pick a category fast. Redesigned Space Cards: Reworked a bit to really show off the app descriptions, so you know what each Space does at a glance. Random Prompt: Need ideas? Hit the dice button for a burst of inspiration. We’d love to hear what you think—drop us some feedback plz!
Organizations
None yet
Models: 1
Python2231/whisper-large-v3-turbo-es-flax • Automatic Speech Recognition • Updated Nov 26, 2024 • 77
datasets
None public yet