---
library_name: peft
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_11_0
- tunis-ai/arabic_speech_corpus
model-index:
- name: lowhipa-large-sr
  results: []
pipeline_tag: automatic-speech-recognition
language:
- acy
---
# lowhipa-large-sr (Sanna Related)
This Whisper-for-IPA (WhIPA) model adapter is a PEFT LoRA fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on a subset of:
- the CommonVoice 11.0 dataset (1k samples each from Greek and Maltese) with G2P-based IPA transcriptions
- the Arabic Speech Corpus (https://en.arabicspeechcorpus.com) (1k samples) with custom IPA transcriptions transliterated from the provided Buckwalter transcriptions (https://doi.org/10.5281/zenodo.17111977)
## Model description
For deployment details and a full model description, please refer to https://github.com/jshrdt/whipa.
```python
from transformers import WhisperForConditionalGeneration, WhisperTokenizer, WhisperProcessor
from peft import PeftModel

# Extend the tokenizer with the custom <|ip|> IPA task token
# (prepending it preserves the existing special tokens).
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v2", task="transcribe")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|ip|>"] + tokenizer.all_special_tokens})

# Load the base model, register <|ip|> as a language token, and
# resize the embeddings to match the extended vocabulary.
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
base_model.generation_config.lang_to_id["<|ip|>"] = tokenizer.convert_tokens_to_ids(["<|ip|>"])[0]
base_model.resize_token_embeddings(len(tokenizer))

# Attach the LoRA adapter and configure generation for IPA transcription.
whipa_model = PeftModel.from_pretrained(base_model, "jshrdt/lowhipa-large-sr")
whipa_model.generation_config.language = "<|ip|>"
whipa_model.generation_config.task = "transcribe"

whipa_processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2", task="transcribe")
```
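With the model and processor set up as above, inference follows the standard Whisper pattern. A minimal sketch, assuming a 16 kHz mono recording; the file path and the use of librosa for loading are illustrative, not part of this repository:

```python
import librosa

# Placeholder path; load and resample any audio file to 16 kHz mono.
audio, _ = librosa.load("sample.wav", sr=16000)

# Extract log-mel input features and generate an IPA transcription.
inputs = whipa_processor(audio, sampling_rate=16000, return_tensors="pt")
predicted_ids = whipa_model.generate(input_features=inputs.input_features)
ipa_transcription = tokenizer.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(ipa_transcription)
```

Decoding with `skip_special_tokens=True` strips the `<|ip|>` task token and other special tokens, leaving only the IPA string.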
## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

### Training results
| Training Loss | Epoch   | Validation Loss |
|:-------------:|:-------:|:---------------:|
| 0.4344        | 2.0323  | 0.3693          |
| 0.1875        | 4.0645  | 0.3103          |
| 0.0717        | 6.0968  | 0.3060          |
| 0.0202        | 8.1290  | 0.3270          |
| 0.0101        | 10.1613 | 0.3404          |
### Framework versions
- PEFT 0.15.1
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0