AICoverGen
Launch a web interface for model interaction
Launch a web interface for model interaction
Generate voice with Style-Bert-VITS2 editor
Generate and convert voice with text and audio inputs
Generate speech from text using AI models
Launch a web interface for model interaction
Convert audio using RVC models
Combine and process audio files
Vocal and background audio separator
Generate audio from text using BlueArchiveTTS
[Chinese/English/Japanese] multilingual text-to-speech
Generate audio from text prompts
A simple, high-quality voice conversion tool
Clone a voice to speak any text
Voice conversion framework based on VITS
Run a web-based user interface
Install and run Applio audio processing app
Download and run voice conversion model
Generate audio from text using voice synthesis
Generate anime character voice from text
Generate audio from text with voice conversion
Generate audio from text using VITS model
Generate speech from text
Generate audio from text using various voice styles
Generate MIDI music from prompts
Generate Japanese lyrics from a title and prompt
Generate audio from text using voice prompts
Generate Japanese speech from text
Generate Japanese audio from text
Generate speech from text
Generate audio from text with ChatGPT integration
Generate realistic audio from text
Generate voice from text using a reference audio
Generate voice from text using a reference audio
Generate speech from text in multiple languages
Generate speech from text using a reference voice
Generate music from text descriptions and optional melodies
Convert or reconstruct audio using voice samples
Generate Talking avatars from Text-to-Speech
Convert and modify audio voices
Generate audio for silent videos
Languages ru,en,zh-cn,ja,de,fr,it,pt,pl,tr,ko,nl,cs,ar,es,hu
Execute dynamic code
Get a music sample inspired by the mood of an image
In-browser speech recognition w/ word-level timestamps
Vote on the latest TTS models!
Text-To-Speech (TTS) Evaluation using objective metrics.
Search and explore LAKH MIDI dataset with MidiCaps
Generate audio from text descriptions with timestamps
Search and explore 179k+ MIDI titles
Transcribe audio with emotions and events
Separate speakers in audio recordings
Transcribe audio to text with speaker diarization
Separate vocals and background from audio
Print "hello"
Convert text to speech using band character voices
Generate audio from text with speaker selection and language translation
Generate captions from audio files
Convert and reconstruct speech files
Text-to-speech (TTS) with Next-gen Kaldi
Vote on the top Japanese TTS models!
Convert text to voice using a musical model
Genshin Impact game style music generation
Generate speech from text using various voices
Easy training helper for RVC
Generate talking face video from image and audio
Convert and manipulate audio using various models
Classify audio into NSFW categories
Generate voice with Style-Bert-VITS2 editor
Generate speech from text
A demo of RVC pip
Clone voices from audio files
Generate speech from text in multiple languages
Harmonize and mix any MIDI melody
Create a spectrogram and get audio info
Convert and separate audio using models and TTS
Separate audio into instrumental and vocal tracks
Generate speech from text using reference audio
Generate audio from text or PDF with optional translation
Remove vocals from an audio file
High-fidelity Text-To-Speech
Generate audio from text using reference audio
Generate audio from text using a voice synthesis model
Generate Animalese audio from text
Convert text to Animalese voice
Generate audio from text prompts
Generate speech from text using Microsoft Edge TTS
Transcribe and summarize YouTube videos or audio files
Launch a web interface for downloading YouTube videos
Generate animal-like speech from text
Launch a web interface for model interaction
Convert audio and images to different formats
Convert and train voice models
An easy-to-use voice conversion framework based on VITS.
Clone voices by typing text and providing a reference audio file
Generate speech and translate audio using AI models
Transform a report or document into an interview/discussion
Super-fast voice assistant
Convert text to speech with reference audio
Generate music using descriptions and optional melody audio
Generate audio with voice conversion
Transform and render any MIDI
Generate POP music medley with Imagen diffusion transformer
Classify absolutely any MIDI by genre, song and artist
Intelligently compare any pair of MIDIs
Explore and download stable speaker embeddings for ChatTTS
Generate a seamless bridge between two composition parts
Generate audio from text or voice input
Generate audio from text
Fixed fork of the original AudioSR!
Convert voice to match another's style or tone
Generate audio response from uploaded audio
Retrieval augmented harmonization of any MIDI melody
Add a unique melody to any MIDI file
Mix chords from one MIDI to another MIDI
Convert Morse code to audio
Convert and modify voices in audio files
Get lyrics from a Genius link
Groq API Playground
Generate speech quality score from audio
LMSYS bench for audio agents
Generate lip-synced video from image or video and audio
Create lifelike animated videos using a photo and audio
Create a video with lip-synced audio
Description of Matcha TTS Japanese
Generate clean audio from noisy recordings
High-fidelity Text-To-Speech
Generate and edit audio from text prompts
Transcribe audio to text with timestamps
Benchmark model load time and TTS time
Give your space a voice! (Demo)
Generate text based on audio input and questions
Describe audio with questions
Generate audio from text
Generate music from text descriptions
Enjoy TTS Chat
Create interactive spoken dialogue using audio input
Generate audio from text and reference audio
Controlled source augmented rock music transformer
Long-form Musicgen
Convert text to speech in multiple languages
Generate Japanese speech from text
Generate 菅義偉's voice from text
Generate a lip-synced video from audio
Transform text into engaging podcast dialogues or detailed reports
View and request speech recognition model benchmarks
Personalised Podcasts For All - Available in 13 Languages
Transcribe and translate Japanese & English audio
Fast, efficient, & multilingual text-to-speech
Transcribe and translate audio into text
Generate audio from text
Transcribe audio or YouTube videos into text
Realtime implementation of Whisper large turbo
ML-powered speech recognition directly in your browser
Expressive Text-to-Speech
Generate speech from text with accentuation
Download audio or video from a URL
Unlimited audio generation with a few added features
Restore degraded audio using a Transformer-based model
whisper3 turbo
Generate audio from text using selected character voices
Generate speech from text with customizable settings
Generate speech from text
Transform audio with pre-trained models and customize settings
Transcribe audio to text with style options
Separate vocals and instruments from audio
Enhance and denoise your audio files
Transcribe audio to text
Convert audio to a different voice
Generate a podcast from text, URLs, PDFs, and images
Generate and apply matching music background to video shot
Generates a sound effect that matches video shot
Generates audio environment from an image
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Blind vote on HF TTS models!
CPU powered, low RTF, emotional, multilingual TTS
Generate music powered by AI
Co-Speech Gesture Video Generation
Transcribe Japanese audio to text
Generate text from audio recordings
Whisper model to transcribe Japanese audio to katakana.
Better AI powered platform to purify your speech signal
Text | Image | Audio | Video to Spectrogram || Steganography
Generate Cover From AI Voice Model
Separate audio into stems using various models
Generate text responses from audio input
Transcribe and diarize audio files or microphone input
Stable audio open model from Synthio paper.
Fast & efficient ASR outperforming Whisper!
Generate a video from audio with customizable waveform visualization
Generate MIDI music using RWKV v4!
Transcribe MP3 files with Whisper; use a GPU to convert faster!
Efficient, fast, +non-native languages & Lojban
MaskGCT TTS Demo
Generate music from text descriptions
Transcribe audio or YouTube videos into text
Self-correcting multi-instrumental chords transformer
Chords-conditioned music transformer
Ultra-fast Whisper Turbo inference ⚡
Generate audio and waveform video from text
In-Browser Audio Wake-Word Spotting
Streamlit pianoroll playback element
Audio-Separator by Politrees
Fast multi-instrumental music transformer
Streamlit browser for piano music datasets.
Demo of masking tasks from the PIANO dataset
An end-to-end (e2e) Voice Language Model by Fish Audio.
Separate audio stems and convert to MIDI
Generate podcasts with AI avatars
Create personalized voice clips with emotion
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Generate and clone voices from text or audio
Generate MIDI music sequences
Did StyleTTS 2 generate that audio?!?
base model for mono-channel completion
Generate audio from text with custom speakers
Launch a web interface for text-to-speech and SSML processing
Upgraded to v1.0!
Generate voice audio from text input
Generate text from audio input
Generate music for a video based on its content and key
Generate audio from text descriptions
Generate Voice Clones
Spanish finetune for the original F5 model.
Generate speech from text
Convert audio to lip-sync data
Generate voice from text using ごちうさ TTS
Generate speech from text
Created an AI voice synthesis model of シャルティア.
Created an AI voice synthesis model of 早乙女乱馬 (female).
Created an AI voice synthesis model of ベアトリス.
Talk to Fixie.ai's Ultravox with WebRTC ⚡️
Estimate physical properties merely from pouring sound!
Create interactive HTML web pages with your voice
Record audio, then use AI to transcribe and translate it.
Large and fast music transformer for pitches inpainting
a lightweight TTS for Natural Anime speech generation.
Generate music based on text and melody
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a
Versatile audio super resolution (any -> 48kHz) with AudioSR
Generate human-like speech from text
TTS tool
Created an AI voice synthesis model of 猫屋敷まゆ.
Find similar game voice samples
A demo of Indic Parler-TTS
Convert text to speech and speech to text
Verify speakers using voice samples
Target Speaker Extraction with WeSep
Transcribe audio to text
View and request model performance data
WebGPU text-to-Speech powered by OuteTTS and Transformers.js
A home for scoring speech quality
Unofficial benchmark by Fish Speech
Generate chupa sounds from text or audio
Generate Japanese audio from text
Generate Japanese audio from text
Detect emotions from an audio file
Generate audio from video or text prompts
Text to Audio (Sound SFX) Generator
Talk to Kyutai's moshi - powered by Gradio WebRTC!
Generate high-quality speech from text using a prompt audio
Talk to the Gradio docs! Powered by Pydantic and WebRTC ⚡️
One-minute creation by AI Coding Autonomous Agent MOUSE-I
Generate music from text prompts
Analyze SEO of a website
Classify audio samples into categories
Real-time in-browser speech recognition
Talk with openAI's new Realtime Voice API
Separate sounds from audio mixtures using text prompts
Music Genre Classifier
Guzheng Performance Technique Recognizer
Chinese Traditional Instrument Sound Retriever
Chinese Music Pentatonic Mode Detector
Convert audio to images
Video to Audio
Transcribe audio from URLs or uploads
Convert your audio to 8D
Python Audio Separator Demo
Yet another Real-time Whisper with WebGPU, written in Vue
Identify any MIDI
Yet another Real-time in-browser STT, re-implemented in Vue
AI driven VTuber & Companion, supports Live2D and VRM.
Convert figured bass to chord
Turn any ebook into audiobook, 1107+ languages supported!
V1.0: Convert any Ebook to AudioBook with Xtts + VoiceCloning!
Converts Ebooks into audiobooks with piper-tts
Ebook2audiobook docker space beta
Online programming aids
Ready-to-play synth instrument!
Genshin Impact & Honkai Star Rail game character voice TTS
Erhu Performance Technique Recognizer
Discriminator of Bel Canto and Chinese Folk Singing
Piano Sound Quality Classifier
Discriminator of Chest Voice and Falsetto
Ultra-fast and very well fitted solo Piano music transformer
Created an AI voice synthesis model of ヘスティア.
Created an AI voice synthesis model of フレイヤ.
Hands-Free AI Voice Chat with a Retro Vibe
Hands-Free AI Voice Chat with a Retro Vibe
Hands-Free AI Voice Chat with a Retro Vibe
https://huggingface.co/spaces/VIDraft/mouse-webgen
Separate music and vocals from audio
A benchmark for open-source multi-dialect Arabic ASR models
Generate music from text prompts
High-fidelity Text-To-Speech
Audio Conditioned LipSync with Latent Diffusion Models
Transform your voice into a singer's
Generate speech from text with different speakers
Audio edit
✨[With v1.0.0] Accelerated TTS on Kokoro-82M
📚PDF 🪄Text to 🗣️Speech 🤖Transformer
Generate a talking face video from an image and audio
Communicate with an AI assistant and convert text to speech
Convert text to speech for free
https://huggingface.co/spaces/VIDraft/mouse-webgen
Text to Audio (Sound SFX) Generator
Generate voice covers from audio or text input
Search and find karaoke MIDI files
Search for music by description
G2P
Better AI powered platform to purify your speech signal
Created an AI voice synthesis model of 結束いのり.
Created an AI voice synthesis model of the female hero from Dragon Quest III.
Created an AI voice synthesis model of 喜屋武飛夏.
https://huggingface.co/spaces/VIDraft/mouse-webgen
Korean Speech Transcribe (Text) and English Translate (Korean)
Demo for Jasco Model Music Stems Generation
High-quality speech synthesis powered by Kokoro TTS
Transcribe and summarise audio files using AI.
GPT-SoVITS for MITA!
Guided melody accompaniment generation with transformers
Zero Shot voice cloning with llasa 3b (Unofficial Demo)
A humble space for trying EGTTS V0.1
Generate music from lyrics and genre tags
OpenSource Music Generator
Make Custom Voices With KokoroTTS
Mix random MIDI loops into one coherent music composition
Convert text to speech online
Convert spoken words into text
Zero Shot voice cloning with llasa 3b (Unofficial Demo)
Generate soundfonts with latent flow matching
beepbox
Audio Gen, Audio Style Transfer and Audio InPainting
Talk to Fixie.ai's Ultravox with WebRTC ⚡️
This is a text-to-speech and translator app.
Generate customized speech from text using a reference audio
High-quality speech synthesis powered by Kokoro TTS
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
High-Fidelity Simultaneous Speech-To-Speech Translation
Towards Unified Music Emotion Recognition across Dimensional
Generate speech from text with or without cloning a voice
Write musical scores with LLaMA
Separate audio into stems using various models
Separate vocals and accompaniment from audio
Generate audio from text with customizable emotions and settings
Make Custom Voices With KokoroTTS
Speech Synthesis with Zonos
ML-powered speech synthesis directly in your browser
Generate Podcast using Kokoro-TTS!
audio-arena
Generate realistic voice from text
Llasa-1B-Multilingual finetuned using simon3000/genshin-voic
Process audio and generate text output based on instructions
Inpaint pitches in MIDI templates to create unique songs
Frame-level guzheng playing technique detector
Generate audio from text
Generate creative radio Ads with AI.
Generate drum beats from MIDI files
Generate music from MIDI data
Have a video chat with Gemini - it can see you ⚡️
Use RAD-TTS++ model to synthesize text in Ukrainian
Audio to Talking Face
Blazingly Fast and Embarrassingly Simple Song Generation
Blazingly Fast and Embarrassingly Simple Song Generation
Blazingly Fast and Embarrassingly Simple Song Generation
Generate audio from Darija text
Remove noise from audio files
A text-to-speech model powered by SparkAudio and Mobvoi.
transforms your audio files into immersive 360° binaural
Turns your image into matching sound effects
VoiceReplacer
A speech recognition tool for Indic languages.
Generate Japanese TTS audio
(Unofficial) Gradio demo for Spark-TTS
Conversational speech generation
Convert and separate audio using models and TTS
Try Orpheus TTS here
Canary 1B Flash demo
Generate text and speech from audio, video, and text inputs
Created an AI voice synthesis model of オグリキャップ.
Created an AI voice synthesis model of アマテ・ユズリハ (マチュ).
Generate a talking-head video from an image and audio
Vote for the most expressive TTS voice
Transcribe Ukrainian audio to text
Transcribe Ukrainian audio to text
Generate speech from text using a reference audio
Try Orpheus TTS here
TTS speech synthesis system
Demo of Facebook's MMS Text-to-Speech Model
Controllable Zero-Shot Voice Imitation
Search and upload AI singer models
morpheus tts - uncensored
Generate realistic dialogue from a script, using Dia!
Generate customized text-to-speech audio
Convert text to speech using Kokoro model
Transcribe audio to text with timestamps
Chat with a voice-clone AI
A Step Towards Music Generation Foundation Model
Generate a podcast to discuss the topic of your choice!
Generate realistic talking video from an image and audio
Generates a podcast about today's top trending paper.
Generate audio from text using reference voices
Generate audio from text with customizable voice
Extraction & Reconstruction for Efficient Speech Separation
Converts URLs, PDFs, and keywords into professional podcasts
Generate speech from text in multiple languages
Use RAD-TTS++ model to synthesize text in Ukrainian
Voice Activity Detection using MarbleNet model
Expressive Zeroshot TTS
Voice Clone AI Podcast Generator with Chatterbox
State-of-the-art target speech extractor
voice-trans
NotebookLM conversational speech model
Generate detailed music descriptions from audio clips
Get Music from Generated Spectrogram with Diffusion
Created an AI voice synthesis model of クロエ・オベール.
Stylized TTS – design voice, accent, and emotion your way
Generate a custom song from lyrics and optional prompts
Run V-JEPA 2 on a video stream for Video Classification
mcp_server
Generate an outfit from audio input
MOSS-TTSD: Text to Spoken Dialogue Generation
A tool that estimates furigana from audio and mixed kanji-kana text
Audio-Driven Multi-Person Conversational Video Generation
SOTA 8k music transformer trained on 2.31M+ HQ MIDIs
Inpaint drum tracks with Orpheus Music Transformer
Humanize any music score with Orpheus Music Transformer
Seamless music bridges generation with transformers
Solo Piano chords texturing music transformer
Generate audio for a video using captions and descriptions
MIDI Doctor will see your MIDI now :)
Fast Urdu speech recognition app using CPU.
The most accurate Urdu speech recognition app.
Intelligently compare any pair of MIDIs
Mix several MIDI loops into one composition by bridging
High-quality Urdu STT with Faster-Whisper and LLM.
Free Text-To-Speech generator with Emotion control (OpenAI)
Mix monophonic melodies into one composition by bridging
Inpaint pitches in any MIDI composition
Generate speech from text with voice selection
Demo space for Mistral latest speech models
Please vote on TTS Arena V2 instead
MegaTTS 3 but with voice cloning!
State-of-the-art audio transcription in your browser
AI Music Arena & Leaderboard (Suno, Udio, Google, Meta, +)
Upgraded to v1.0!
Generate a song from lyrics and style reference
SOTA Super-tiny TTS Model
granite-speech-3.3-8b in a Hugging Face Space
Enhance audio using text prompts
Remove vocals from videos
Transform and render any MIDI
Generate text and audio responses from images and videos
Try Orpheus TTS here
Galgame-Orpheus-3B-Demo
Galgame-Llasa-8B
Galgame-Llasa-1B-v2
Generate a multi-speaker podcast from a script
Transcribe audio and ask questions about the transcript
A Step Towards Music Generation Foundation Model
An end-to-end speech large language model.
Chat with Xiaomi MiMo-Audio using voice
Interact with a multimodal chatbot using text, audio, images, or video