---
language: en
license: apache-2.0
tags:
  - whisper
  - automatic-speech-recognition
  - speech
  - audio
datasets:
  - your-dataset-name
metrics:
  - wer
  - cer
model-index:
  - name: AfroLogicInsect/whisper-finetuned-float32
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Your Dataset Name
          type: your-dataset-type
        metrics:
          - name: WER
            type: wer
            value: your-wer-score
---
# AfroLogicInsect/whisper-finetuned-float32

Fine-tuned Whisper model (float32 version) for speech recognition.
## Model Details

- Model Type: Whisper (Fine-tuned)
- Language: English
- Data Type: float32
- Use Cases: Speech-to-text transcription
## Usage

```python
import torch
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Load model and processor
processor = WhisperProcessor.from_pretrained("AfroLogicInsect/whisper-finetuned-float32")
model = WhisperForConditionalGeneration.from_pretrained("AfroLogicInsect/whisper-finetuned-float32")
model.eval()

# Load audio and resample to 16 kHz (the sampling rate Whisper expects)
audio, sr = librosa.load("path/to/audio.wav", sr=16000)

# Convert the waveform to log-Mel input features
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(input_features)

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
```
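
For longer recordings, calling `generate` on a single clip can be less convenient than the chunked ASR `pipeline`. The snippet below is a minimal sketch, not part of this model's documented workflow; the file path and `chunk_length_s=30` value are illustrative assumptions.

```python
from transformers import pipeline

# Sketch: chunked transcription of a long audio file with the ASR pipeline.
# chunk_length_s is an assumed value; tune it for your audio.
asr = pipeline(
    "automatic-speech-recognition",
    model="AfroLogicInsect/whisper-finetuned-float32",
    chunk_length_s=30,
)

result = asr("path/to/long_audio.wav")
print(result["text"])
```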
## Training Details

- Base Model: OpenAI Whisper
- Training Dataset: [Add your dataset details]
- Training Parameters: [Add your training parameters]
- Evaluation Metrics: [Add your evaluation results]
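
As a rough illustration of how the WER and CER metrics listed above can be computed, the sketch below uses the `evaluate` library; the reference and prediction strings are placeholders, not outputs of this model.

```python
import evaluate

# Load word error rate and character error rate metrics
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder examples; replace with your dataset's reference transcripts
# and the transcriptions produced by the model.
references = ["the quick brown fox jumps over the lazy dog"]
predictions = ["the quick brown fox jumped over the lazy dog"]

print("WER:", wer_metric.compute(references=references, predictions=predictions))
print("CER:", cer_metric.compute(references=references, predictions=predictions))
```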
## Limitations and Biases

- The model may reflect biases present in its training data.
- Performance may vary across accents, recording conditions, and audio quality.
- Recommended for English speech recognition tasks.
## Citation

If you use this model, please cite:

```bibtex
@misc{whisper-finetuned,
  author    = {Daniel AMAH},
  title     = {Fine-tuned Whisper Model},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/AfroLogicInsect/whisper-finetuned-float32}
}
```