alkiskoudounas/xls-r-128-speechmassive-fr-FR

Audio Classification

intent-classification


alkiskoudounas committed on Mar 24

Commit a2968a2 · verified · 1 Parent(s): 95d146f

Created README

Files changed (1)

README.md +49 -0

README.md ADDED

	@@ -0,0 +1,49 @@

+---
+task_categories:
+- audio-classification
+language:
+- fr
+tags:
+- intent
+- intent-classification
+- audio-classification
+- audio
+base_model:
+- facebook/wav2vec2-xls-r-300m
+datasets:
+- FBK-MT/Speech-MASSIVE
+library_name: transformers
+license: apache-2.0
+---
+# wav2vec 2.0 XLS-R 128 (300m) fine-tuned on Speech-MASSIVE - fr-FR
+Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSIVE textual corpus.
+Speech-MASSIVE covers 12 languages.
+It includes spoken and written utterances and is annotated with 60 intents.
+The dataset is available on the [Hugging Face Hub](https://huggingface.co/datasets/FBK-MT/Speech-MASSIVE).
+This model is [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) fine-tuned for intent classification on the fr-FR (French) portion of Speech-MASSIVE.
+## Usage
+You can use the model directly as follows:
+```python
+import torch
+import librosa
+from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
+## Load an audio file
+audio_array, sr = librosa.load("path_to_audio.wav", sr=16000)
+## Load model and feature extractor
+model = AutoModelForAudioClassification.from_pretrained("alkiskoudounas/xls-r-128-speechmassive-fr-FR")
+feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-xls-r-300m")
+## Extract features
+inputs = feature_extractor(audio_array.squeeze(), sampling_rate=feature_extractor.sampling_rate, padding=True, return_tensors="pt")
+## Compute logits (inference only, so gradient tracking is disabled)
+with torch.no_grad():
+    logits = model(**inputs).logits
+```
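The usage snippet stops at raw logits. As a minimal sketch of the step that usually follows, the helper below turns one utterance's logits into ranked intent predictions with a softmax. The function name `top_k_intents`, the three-label mapping, and the example scores are hypothetical and chosen only for illustration; they are not taken from the model's configuration.

```python
import math


def top_k_intents(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs for one utterance."""
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical label mapping and logits, for illustration only.
example_id2label = {0: "alarm_set", 1: "weather_query", 2: "play_music"}
example_logits = [0.2, 2.5, 1.1]
predictions = top_k_intents(example_logits, example_id2label, k=2)
```

With the real model, the same helper could be called as `top_k_intents(logits[0].tolist(), model.config.id2label)`. This assumes the checkpoint's `id2label` mapping holds the Speech-MASSIVE intent names; if it was saved without them, the labels will be generic placeholders such as `LABEL_0`.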