PlayDialog 1.0

#83
by legofan94 - opened

We've released PlayDialog 1.0, a major rework of Dialog Beta (which is currently on the leaderboard). Could we add PlayDialog 1.0?

https://x.com/play_ht/status/1887967775207121021

TTS AGI org

Hey,
Thanks for reaching out! We can definitely upgrade the model to v1.0. Has the PlayDialog-http model automatically been upgraded from Beta to v1.0 in the Python API?

Is it possible to relaunch it as a new model? Are you able to share the params you are currently passing through and the endpoint you are hitting?

If you're using our Python SDK (pyht), I recommend upgrading to the latest version and then calling tts() with voice_engine='PlayDialog' and protocol='http' instead of voice_engine='PlayDialog-http', since we changed the API to pass the engine and the protocol separately. Everything else stays the same.
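A minimal sketch of that rename, in case it helps when migrating existing call sites. The helper `split_legacy_engine` and the `LEGACY_SUFFIXES` table are hypothetical and not part of pyht; only the `-http` suffix is taken from this thread, and any other name is passed through unchanged.

```python
from typing import Optional, Tuple

# Hypothetical mapping, not part of pyht: only "-http" is confirmed in this thread.
LEGACY_SUFFIXES = {"-http": "http"}


def split_legacy_engine(name: str) -> Tuple[str, Optional[str]]:
    """Split a combined engine name into a (voice_engine, protocol) pair."""
    for suffix, protocol in LEGACY_SUFFIXES.items():
        if name.endswith(suffix):
            return name[: -len(suffix)], protocol
    # No known suffix: keep the name and leave the protocol unset.
    return name, None


voice_engine, protocol = split_legacy_engine("PlayDialog-http")
print(voice_engine, protocol)
```

The two values would then be passed along as tts(..., voice_engine=voice_engine, protocol=protocol), matching the call described above.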

TTS AGI org

Is it possible to relaunch it as a new model? Are you able to share the params you are currently passing through and the endpoint you are hitting?

Yes, we can relaunch as a new model. Should it be labelled as PlayDialog 1.0?

The current params being used:

for chunk in play_client.tts(text, TTSOptions(voice="s3://voice-cloning-zero-shot/831bd330-85c6-4333-b2b4-10c476ea3491/original/manifest.json"), voice_engine="PlayDialog-http"):

That is the correct labeling! Will wait for @bryananderson to confirm, but a question for you, @mrfakename: how do you pick voices? Do you cycle through them, or only pick one? Should we provide you a list?

Bumping @mrfakename

Can you share the params for the Python SDK payload?

TTS AGI org

Hey @legofan94,
So sorry about the delay. Here's the code used for generation:

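from pyht.client import TTSOptions

# Excerpt from a function body: play_client (a pyht Client), text, and f (a file opened in binary mode) are defined earlier.
voice_engine = "PlayDialog"
tts_options = TTSOptions(voice="s3://voice-cloning-zero-shot/831bd330-85c6-4333-b2b4-10c476ea3491/original/manifest.json")
for chunk in play_client.tts(text, tts_options, voice_engine=voice_engine):
    # An empty chunk is treated as the end of the stream.
    if chunk == b'':
        play_client.close()
        break
    f.write(chunk)
# Return only after the loop, so every chunk is written first.
return f.name, None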
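A minimal offline sketch of the same write loop, for checking the chunk handling without network access. It assumes a stand-in iterator of byte chunks in place of play_client.tts(...); the helper name `write_stream` and the `.bin` suffix are hypothetical, not taken from the actual generation code.

```python
import tempfile
from typing import Iterable


def write_stream(chunks: Iterable[bytes], suffix: str = ".bin") -> str:
    """Write chunks to a temporary file, stopping at the first empty chunk."""
    # delete=False keeps the file on disk so the caller can use the returned path.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as f:
        for chunk in chunks:
            if chunk == b"":
                break
            f.write(chunk)
    # Return only after the loop, once every chunk has been written.
    return f.name


# Stand-in for the streamed audio: the empty chunk ends the stream early.
path = write_stream(iter([b"abc", b"def", b"", b"ignored"]))
```

Returning from inside the loop body instead would hand back the path after only the first chunk had been written.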

Working on getting it added now! Are these the right params to use?

Almost! Can you use this voice instead of the one you have in params? It's a good neutral voice if you're only taking one. If you want to have different accents and speaker styles we can provide more.

@mrfakename

s3://voice-cloning-zero-shot/42c41808-0ddb-4674-8965-024a52ad6c8e/original/manifest.json

TTS AGI org

Makes sense, switched to that voice. Are there any other settings that need to be adjusted before launch?

Nope -- you're otherwise good!

TTS AGI org

Should be live shortly!
