---
title: Expressive TTS Arena
emoji: π€
colorFrom: indigo
colorTo: pink
sdk: docker
app_file: src/main.py
python_version: "3.11"
pinned: true
license: mit
---
<div align="center">
<img src="https://storage.googleapis.com/hume-public-logos/hume/hume-banner.png">
<h1>Expressive TTS Arena</h1>
<p>
<strong> A web application for comparing and evaluating the expressiveness of different text-to-speech models </strong>
</p>
</div>
## Overview
Expressive TTS Arena is an open-source web application for evaluating the expressiveness of voice generation and speech synthesis from different text-to-speech providers, including Hume AI and ElevenLabs.
For support or to join the conversation, visit our [Discord](https://discord.com/invite/humeai).
## Prerequisites
- [Python >=3.11.11](https://www.python.org/downloads/)
- [pip >=25.0](https://pypi.org/project/pip/)
- [uv >=0.5.29](https://github.com/astral-sh/uv)
- [Postgres](https://www.postgresql.org/download/)
- API keys for Hume AI, Anthropic, and ElevenLabs
## Project Structure
```
Expressive TTS Arena/
├── public/                         # Directory for public assets
├── src/
│   ├── database/
│   │   ├── __init__.py             # Makes database a package; exposes ORM methods
│   │   ├── crud.py                 # Defines operations for interacting with database
│   │   ├── database.py             # Sets up SQLAlchemy database connection
│   │   └── models.py               # SQLAlchemy database models
│   ├── integrations/
│   │   ├── __init__.py             # Makes integrations a package; exposes API clients
│   │   ├── anthropic_api.py        # Anthropic API integration
│   │   ├── elevenlabs_api.py       # ElevenLabs API integration
│   │   └── hume_api.py             # Hume API integration
│   ├── scripts/
│   │   ├── __init__.py             # Makes scripts a package
│   │   ├── init_db.py              # Script for initializing database
│   │   └── test_db.py              # Script for testing database connection
│   ├── __init__.py                 # Makes src a package
│   ├── config.py                   # Global config and logger setup
│   ├── constants.py                # Global constants
│   ├── custom_types.py             # Global custom types
│   ├── frontend.py                 # Gradio UI components
│   ├── main.py                     # Entry file
│   └── utils.py                    # Utility functions
├── static/
│   ├── audio/                      # Directory for storing generated audio files
│   └── css/
│       └── styles.css              # Defines custom CSS
├── .dockerignore
├── .env.example
├── .gitignore
├── .pre-commit-config.yaml
├── Dockerfile
├── LICENSE.txt
├── pyproject.toml
├── README.md
└── uv.lock
```
## Installation
1. This project uses the [uv](https://docs.astral.sh/uv/) package manager. Follow the installation instructions for your platform [here](https://docs.astral.sh/uv/getting-started/installation/).
2. Configure environment variables:
- Create a `.env` file based on `.env.example`
- Add your API keys:
```txt
HUME_API_KEY=YOUR_HUME_API_KEY
ANTHROPIC_API_KEY=YOUR_ANTHROPIC_API_KEY
ELEVENLABS_API_KEY=YOUR_ELEVENLABS_API_KEY
```
3. Run the application:
Standard
```sh
uv run python -m src.main
```
With hot-reloading
```sh
uv run watchfiles "python -m src.main" src
```
4. Test the application by navigating to the localhost URL in your browser (e.g. `localhost:7860` or `http://127.0.0.1:7860`).
5. (Optional) If contributing, install the pre-commit hook for automatic linting, formatting, and type-checking:
```sh
uv run pre-commit install
```
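As a sanity check on step 2, the snippet below is a minimal sketch that reports which of the three API keys are still unset or left at their template placeholder. It is not part of the project: `missing_api_keys` and `REQUIRED_KEYS` are hypothetical names, and the sketch assumes the variables have already been loaded into the process environment (this README does not say how the app itself reads `.env`).

```python
import os

# Key names taken from the `.env` example in step 2.
REQUIRED_KEYS = ("HUME_API_KEY", "ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset, blank, or still placeholders.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "").strip()
        # Treat the template placeholder (e.g. YOUR_HUME_API_KEY) as "not set".
        if not value or value == f"YOUR_{name}":
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing or placeholder keys: " + ", ".join(missing))
    else:
        print("All three API keys are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching real credentials.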
## User Flow
1. Select a sample character or enter a custom character description, then click **"Generate Text"** to generate your text input.
2. Click the **"Synthesize Speech"** button to synthesize two TTS outputs based on your text and character description.
3. Listen to both audio samples to compare their expressiveness.
4. Vote for the most expressive result by clicking either **"Select Option A"** or **"Select Option B"**.
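The comparison-and-vote steps above could be modeled roughly as follows. This is an illustrative sketch only, not the application's actual code: `Comparison`, `make_comparison`, and `record_vote` are hypothetical names, and the random assignment of providers to options A and B is an assumption that this README does not confirm.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One A/B round: which provider label sits behind each option (hypothetical)."""
    option_a: str
    option_b: str


def make_comparison(providers, rng=random):
    """Pick two distinct providers and place them behind A and B in random order."""
    a, b = rng.sample(list(providers), 2)
    return Comparison(option_a=a, option_b=b)


def record_vote(tally, comparison, choice):
    """Credit the provider behind the chosen option ("A" or "B") and return it."""
    if choice not in ("A", "B"):
        raise ValueError("choice must be 'A' or 'B'")
    winner = comparison.option_a if choice == "A" else comparison.option_b
    tally[winner] += 1
    return winner


if __name__ == "__main__":
    tally = Counter()
    current = make_comparison(["Hume AI", "ElevenLabs"])
    winner = record_vote(tally, current, "A")  # user clicked "Select Option A"
    print(f"Vote recorded for {winner}: {dict(tally)}")
```

Keeping the provider labels out of the option names means a vote is recorded against whichever provider happened to be assigned to that option in the round.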
## License
This project is licensed under the MIT License - see the [LICENSE.txt](LICENSE.txt) file for details.