A speaker verification model trained on Japanese data.
Install
pip install "nemo_toolkit[all]"
Inference
import nemo.collections.asr as nemo_asr
speaker_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained("Respair/RyuseiNet")
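emb = speaker_model.get_embedding("audio.wav")  # speaker embedding for a single file
# or compare two recordings directly (True means they are judged to be the same speaker)
speaker_model.verify_speakers("audio_1.wav", "audio_2.wav")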
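If you want the raw similarity score rather than a yes/no answer, you can compare two embeddings yourself. The following is a minimal sketch, not the library's implementation: `cosine_score` and `same_speaker` are hypothetical helper names, the rescaling of cosine similarity to the range [0, 1] is an assumption, and the 0.7 threshold is an illustrative default that should be tuned on your own data.

```python
import math
from typing import Sequence


def cosine_score(emb_a: Sequence[float], emb_b: Sequence[float]) -> float:
    """Cosine similarity of two embeddings, rescaled from [-1, 1] to [0, 1]."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    cosine = dot / (norm_a * norm_b)
    return (cosine + 1.0) / 2.0


def same_speaker(emb_a: Sequence[float], emb_b: Sequence[float],
                 threshold: float = 0.7) -> bool:
    """Hypothetical decision rule: accept when the rescaled score meets the threshold."""
    return cosine_score(emb_a, emb_b) >= threshold
```

To use it with the model above, flatten each embedding tensor to a plain list first, for example `cosine_score(emb_1.squeeze().tolist(), emb_2.squeeze().tolist())`, where `emb_1` and `emb_2` come from `get_embedding`.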
Architecture
NVIDIA's TitaNet-Large
Data
Approximately 800 to 1,000 hours of Japanese audio
Compute
GH200