mlx-community/supertonic-2

This model was converted to MLX format from Supertone/supertonic-2 using mlx-audio version 0.2.8.

SuperTonic 2 is a high-quality text-to-speech model with voice style control.

Use with mlx-audio

pip install -U mlx-audio

CLI Example:

mlx_audio.tts.generate --model mlx-community/supertonic-2 --text "Hello, this is a test." --voice M1

Python Example:

from mlx_audio.tts.utils import load_model

model = load_model("mlx-community/supertonic-2")
for result in model.generate("Hello, this is a test.", voice="M1"):
    print(f"Generated {result.audio_duration} of audio")
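To keep the generated audio, the samples can be written to a WAV file. The sketch below uses only the standard library and assumes the samples are mono floats in [-1, 1] at the model's 44100 Hz sample rate. `save_wav` is a hypothetical helper, and a synthetic tone stands in for real model output so the block runs without downloading the model.

```python
import math
import os
import struct
import tempfile
import wave


def save_wav(path, samples, sample_rate=44100):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Returns the duration of the written audio in seconds.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return (len(frames) // 2) / sample_rate


# Stand-in for model output: one second of a 440 Hz tone at 44100 Hz.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
out_path = os.path.join(tempfile.gettempdir(), "supertonic_demo.wav")
duration = save_wav(out_path, tone)
```

With a real result, a call such as `save_wav("out.wav", result.audio.tolist())` may work, assuming `result.audio` is a one-dimensional array of float samples; that attribute is an assumption about the mlx-audio result object and is not confirmed by this card.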

Model Details

  • Architecture: Text encoder + Duration predictor + Flow matching (vector field) + Vocoder
  • Sample rate: 44100 Hz
  • Voices: M1-M5, F1-F5 (10 built-in voice styles)
  • Latent dim: 24 per frame (144 after chunking, since 144 = 24 × 6)
  • Flow matching steps: 10 (configurable)
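
The latent-dimension and step-count entries above can be illustrated with a toy sketch. It assumes that chunking means concatenating 6 consecutive 24-dimensional latent frames into one 144-dimensional vector (144 = 24 × 6), and that the flow matching steps are fixed-size Euler steps from t = 0 to t = 1. Both are assumptions made for illustration; `chunk_frames`, `euler_sample`, and `toy_field` are hypothetical names, not the model's actual code.

```python
LATENT_DIM = 24
CHUNK = 6  # assumed chunk size: 144 / 24


def chunk_frames(frames, chunk=CHUNK):
    """Concatenate every `chunk` consecutive frames into one longer vector."""
    if len(frames) % chunk:
        raise ValueError("number of frames must be a multiple of chunk")
    return [
        [x for frame in frames[i:i + chunk] for x in frame]
        for i in range(0, len(frames), chunk)
    ]


def euler_sample(x, vector_field, steps=10):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with Euler steps."""
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# 12 frames of 24 values -> 2 chunks of 144 values.
frames = [[float(i)] * LATENT_DIM for i in range(12)]
chunks = chunk_frames(frames)

# Toy vector field that moves every point along a straight line to `target`.
# A trained model would predict this velocity with a neural network instead.
target = [1.0] * (LATENT_DIM * CHUNK)


def toy_field(x, t):
    return [(c - xi) / (1.0 - t) for xi, c in zip(x, target)]


start = [0.0] * (LATENT_DIM * CHUNK)
sample = euler_sample(start, toy_field, steps=10)
```

Fewer steps make sampling faster at the cost of a coarser integration. With the straight-line toy field the step count barely matters, but a learned vector field is generally curved, which is presumably why the step count is configurable.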