`imprt/izanami-wav2vec2-base`

This is a Japanese wav2vec 2.0 Base model pre-trained on 5,313 hours of audio, extracted by voice activity detection from large-scale Japanese TV broadcast audio data.
The model was trained using the code from the official repository.

Usage

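import soundfile as sf
from transformers import AutoFeatureExtractor

# Load the feature extractor that ships with the model.
model = "imprt/izanami-wav2vec2-base"
feature_extractor = AutoFeatureExtractor.from_pretrained(model)

# Read a 16 kHz audio file: returns the waveform and its sampling rate.
audio_file = "/path/to/16k_audio_file"
audio_input, sr = sf.read(audio_file)

# Convert the raw waveform into model inputs.
inputs = feature_extractor(audio_input, sampling_rate=sr)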
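
The snippet above passes the output of `sf.read` straight to the feature extractor, so the file needs to be in the expected format already. The following is a minimal sketch of a pre-check, not part of the model's own code. It assumes the model expects 16 kHz mono input (inferred from the `16k_audio_file` placeholder). The helper name `prepare_waveform` and the constant `TARGET_SR` are hypothetical, and a synthetic signal stands in for a real file.

```python
import numpy as np

# Assumption: 16 kHz, inferred from the "16k_audio_file" placeholder path.
TARGET_SR = 16_000


def prepare_waveform(audio: np.ndarray, sr: int, target_sr: int = TARGET_SR) -> np.ndarray:
    """Return a mono float32 waveform, or raise if the sampling rate differs.

    Hypothetical helper for illustration; it does not resample.
    """
    if sr != target_sr:
        raise ValueError(f"expected {target_sr} Hz audio, got {sr} Hz; resample first")
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 2:
        # soundfile returns multi-channel audio as (frames, channels),
        # so averaging over axis 1 downmixes to mono.
        audio = audio.mean(axis=1)
    elif audio.ndim != 1:
        raise ValueError(f"unexpected audio shape {audio.shape}")
    return audio


# Synthetic one-second stereo signal standing in for the output of sf.read().
stereo = np.stack([np.ones(TARGET_SR), -np.ones(TARGET_SR)], axis=1)
mono = prepare_waveform(stereo, TARGET_SR)
print(mono.shape, mono.dtype)
```

With a real file, the result of `prepare_waveform(audio_input, sr)` would be passed to `feature_extractor` in place of `audio_input`. Rejecting a mismatched sampling rate, rather than silently resampling, keeps the choice of resampling method with the caller.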

References

@inproceedings{NEURIPS2020_92d1e1eb,
    author = {Baevski, Alexei and Zhou, Yuhao and Mohamed, Abdelrahman and Auli, Michael},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
    pages = {12449--12460},
    publisher = {Curran Associates, Inc.},
    title = {wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations},
    url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/92d1e1eb1cd6f9fba3227870bb6d7f07-Paper.pdf},
    volume = {33},
    year = {2020}
}

License / Terms

Read the LICENSE before using this model.
