
Fine-Tuned Whisper Large

This is a Whisper Large model fine-tuned for Greek speech transcription. It has 1.5B parameters and transcribes more accurately than the Medium-sized model. The fine-tuned weights are LoRA adapters that are loaded on top of openai/whisper-large-v2 with PEFT.

  • WER: 12.06%
  • CER: 6.20%
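
For readers who want to compute the same kind of metrics on their own transcripts, here is a minimal sketch. It assumes the standard definitions: WER is the word-level edit distance divided by the number of reference words, and CER is the same at character level. The model card does not say which tool or text normalization produced the figures above, so results from this sketch may not match them exactly. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example pair: one of four words is transcribed incorrectly.
reference = "ο ασθενής έχει πυρετό"
hypothesis = "ο ασθενής είχε πυρετό"
print(f"WER: {wer(reference, hypothesis):.2%}")  # WER: 25.00%
print(f"CER: {cer(reference, hypothesis):.2%}")
```

In practice, casing, punctuation, and accent normalization applied before scoring can change both numbers noticeably.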

Training Results

Step | Training Loss | Validation Loss | WER    | CER
250  | 0.1776        | 0.1904          | 13.52% | 6.74%
500  | 0.1478        | 0.1698          | 12.55% | 6.38%
750  | 0.1229        | 0.1608          | 12.33% | 6.24%
1000 | 0.1057        | 0.1605          | 12.15% | 6.26%
1250 | 0.0864        | 0.1630          | 12.65% | 6.65%
1500 | 0.0677        | 0.1643          | 13.23% | 7.35%
1750 | 0.0618        | 0.1681          | 12.86% | 6.86%
2000 | 0.0533        | 0.1686          | 12.98% | 7.00%
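
The table can be read programmatically to see where training stops helping. The sketch below copies the rows above into a list and picks the best step under each metric. The tuple layout and variable names are illustrative, not part of the original training code.

```python
# (step, training_loss, validation_loss, wer_percent, cer_percent),
# copied from the training results table above.
results = [
    (250,  0.1776, 0.1904, 13.52, 6.74),
    (500,  0.1478, 0.1698, 12.55, 6.38),
    (750,  0.1229, 0.1608, 12.33, 6.24),
    (1000, 0.1057, 0.1605, 12.15, 6.26),
    (1250, 0.0864, 0.1630, 12.65, 6.65),
    (1500, 0.0677, 0.1643, 13.23, 7.35),
    (1750, 0.0618, 0.1681, 12.86, 6.86),
    (2000, 0.0533, 0.1686, 12.98, 7.00),
]

best_by_val_loss = min(results, key=lambda row: row[2])
best_by_wer = min(results, key=lambda row: row[3])
best_by_cer = min(results, key=lambda row: row[4])

print("Lowest validation loss at step", best_by_val_loss[0])  # 1000
print("Lowest WER at step", best_by_wer[0])                   # 1000
print("Lowest CER at step", best_by_cer[0])                   # 750
```

Training loss keeps falling through step 2000, while validation loss and WER are lowest at step 1000 and worse afterwards. That pattern is typical of overfitting beyond that point.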

Model Details

  • Model Type: Whisper (Large)
  • Fine-tuned From: OpenAI Whisper Large v2 (openai/whisper-large-v2)
  • Language(s): Greek
  • Parameters: 1.5B

How to Use

from transformers import WhisperProcessor, WhisperForConditionalGeneration
from peft import PeftModel
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load base model and Greek fine-tuned LoRA weights
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2").to(device)
model = PeftModel.from_pretrained(base_model, "Vardis/Whisper-Large-v2-Greek").to(device)
processor = WhisperProcessor.from_pretrained("Vardis/Whisper-Large-v2-Greek")

# Load your audio as a 1-D float waveform sampled at 16 kHz (e.g., using librosa or torchaudio)
audio_input = ...

# Generate transcription (the processor needs the sampling rate of the waveform)
inputs = processor(audio_input, sampling_rate=16000, return_tensors="pt").input_features.to(device)
with torch.no_grad():
    predicted_ids = model.generate(input_features=inputs)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)

print(transcription)
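
Whisper's feature extractor works on a fixed 30-second window and by default pads or truncates the input to that length, so the snippet above transcribes at most the first 30 seconds of a recording. For longer audio, one option is to split the waveform into consecutive windows and transcribe each one in turn. The following is a minimal sketch of the splitting step only. The helper name `split_into_chunks` and the constants are hypothetical, and the original authors may have handled long recordings differently.

```python
SAMPLE_RATE = 16_000   # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 30     # length of Whisper's fixed input window


def split_into_chunks(waveform, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a 1-D waveform (list or NumPy array) into consecutive windows.

    The last window may be shorter than the others; the processor pads it.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be positive")
    return [waveform[start:start + chunk_len]
            for start in range(0, len(waveform), chunk_len)]


# Example with 70 seconds of silence standing in for a real recording.
waveform = [0.0] * (70 * SAMPLE_RATE)
chunks = split_into_chunks(waveform)
print([len(chunk) / SAMPLE_RATE for chunk in chunks])  # [30.0, 30.0, 10.0]
```

Each chunk can then go through `processor(...)` and `model.generate(...)` as in the snippet above, and the resulting strings can be joined. Cutting at fixed boundaries can split a word in two, so overlapping windows or silence-based segmentation may give cleaner results.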

Context / Reference

This model was developed as part of the work described in:

Georgilas, V., Stafylakis, T. (2025). Automatic Speech Recognition for Greek Medical Dictation.
The paper covers Greek medical ASR research more broadly and is not primarily about this model, but it provides context for the model's development. Users are welcome to use the model freely for research and practical applications.

BibTeX citation:

@misc{georgilas2025greekasr,
  title={Automatic Speech Recognition for Greek Medical Dictation},
  author={Vardis Georgilas and Themos Stafylakis},
  year={2025},
  note={Available at: https://www.arxiv.org/abs/2509.23550}
}