You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Please read LICENSE.md before downloading this model.

Log in or Sign Up to review the conditions and access this model content.

imprt/izanami-wav2vec2-base

This is a Japanese wav2vec2.0 Base model pre-trained using 5313 hours of audio extracted from large-scale Japanese TV broadcast audio data by voice activity detection.
This model was trained using code from the official repository.

Usage

import soundfile as sf
from transformers import AutoFeatureExtractor
model = "imprt/izanami-wav2vec2-base"
feature_extractor = AutoFeatureExtractor.from_pretrained(model)
audio_file="/path/to/16k_audio_file"
audio_input, sr = sf.read(audio_file)
feature_extractor(audio_input, sampling_rate=sr)

References

@inproceedings{NEURIPS2020_92d1e1eb,
    author = {Baevski, Alexei and Zhou, Yuhao and Mohamed, Abdelrahman and Auli, Michael},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
    pages = {12449--12460},
    publisher = {Curran Associates, Inc.},
    title = {wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations},
    url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/92d1e1eb1cd6f9fba3227870bb6d7f07-Paper.pdf},
    volume = {33},
    year = {2020}
}

License / Terms

Read LICENSE when you use this model.

Downloads last month
152
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for imprt/izanami-wav2vec2-base

Finetunes
1 model

Collection including imprt/izanami-wav2vec2-base