Speaker Verification model trained on Japanese data.

Install

pip install "nemo_toolkit[all]"

Inference

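import nemo.collections.asr as nemo_asr

# Load the pretrained checkpoint from the Hugging Face Hub
speaker_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained("Respair/RyuseiNet")
# Extract a speaker embedding from a single audio file
emb = speaker_model.get_embedding("audio.wav")
# Or check whether two audio files come from the same speaker
same_speaker = speaker_model.verify_speakers("audio_1.wav", "audio_2.wav")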
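
To compare embeddings yourself, for example to score one enrollment clip against many test clips, a common approach is cosine similarity followed by a threshold. The following is a minimal sketch under that assumption and is not taken from this model's code. The function names and the 0.7 threshold are hypothetical, and the threshold should be tuned on your own data.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a: Sequence[float], b: Sequence[float], threshold: float = 0.7) -> bool:
    """Hypothetical decision rule: rescale cosine to [0, 1], then threshold."""
    # Map cosine similarity from [-1, 1] to [0, 1] before comparing.
    score = (cosine_similarity(a, b) + 1.0) / 2.0
    return score >= threshold
```

If get_embedding returns a PyTorch tensor, something like emb.squeeze().cpu().tolist() should give a flat list of floats that these helpers accept.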

Architecture

NVIDIA's TitaNet-Large

Data

Approximately 800 to 1,000 hours of audio

Compute

GH200
