	---
	task_categories:
	- audio-classification
	language:
	- fr
	tags:
	- intent
	- intent-classification
	- audio-classification
	- audio
	base_model:
	- facebook/wav2vec2-xls-r-300m
	datasets:
	- FBK-MT/Speech-MASSIVE
	library_name: transformers
	license: apache-2.0
	---

	# wav2vec 2.0 XLS-R 128 (300m) fine-tuned on Speech-MASSIVE - fr-FR

	Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset that provides the speech counterpart of a portion of the MASSIVE textual corpus.
	It covers 12 languages, includes both spoken and written utterances, and is annotated with 60 intents.
	The dataset is available on the [Hugging Face Hub](https://huggingface.co/datasets/FBK-MT/Speech-MASSIVE).

	This model is [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) fine-tuned for intent classification on the French (fr-FR) portion of Speech-MASSIVE.

	It achieves the following results on the test set:

	- Accuracy: 0.543
	- F1: 0.410
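
For readers who want to reproduce comparable numbers on their own predictions, the sketch below shows how accuracy and an F1 score can be computed from integer class ids. The card does not state how F1 is averaged, so macro-averaging is an assumption here, and the helper names `accuracy` and `macro_f1` are illustrative rather than part of any released evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted intent matches the reference."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging; not confirmed by the card)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes 0.0
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up class ids, not real model output
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
print(f"accuracy={accuracy(y_true, y_pred):.3f}, macro_f1={macro_f1(y_true, y_pred):.3f}")
```

With 60 intents of uneven frequency, a macro-averaged F1 weights rare intents as heavily as common ones, so it can sit well below accuracy.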

	## Usage

	The following snippet loads the model and computes intent logits for a single audio file:

	```python
	import torch
	import librosa
	from transformers import AutoModelForAudioClassification, AutoFeatureExtractor

	# Load an audio file, resampled to the 16 kHz rate the model expects
	audio_array, sr = librosa.load("path_to_audio.wav", sr=16000)

	# Load the fine-tuned model and the base model's feature extractor
	model = AutoModelForAudioClassification.from_pretrained("alkiskoudounas/xls-r-128-speechmassive-fr-FR")
	feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-xls-r-300m")

	# Extract features
	inputs = feature_extractor(audio_array.squeeze(), sampling_rate=feature_extractor.sampling_rate, padding=True, return_tensors="pt")

	# Compute logits without tracking gradients (inference only)
	with torch.no_grad():
	    logits = model(**inputs).logits
	```
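
The logits are unnormalized scores, one per intent class. As a minimal sketch of turning them into ranked predictions, the helper below applies a softmax and returns the top-k labels with their probabilities. The function name `top_intents`, the toy logits, and the four label names are illustrative assumptions; the real mapping comes from the checkpoint's configuration.

```python
import math


def top_intents(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one utterance."""
    # Numerically stable softmax: subtract the max before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by descending probability
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy example: made-up logits and a hypothetical 4-class label map
toy_logits = [0.1, 2.3, -1.0, 0.7]
toy_labels = {0: "alarm_set", 1: "weather_query", 2: "play_music", 3: "calendar_query"}
for label, prob in top_intents(toy_logits, toy_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as above, a call such as `top_intents(logits[0].tolist(), model.config.id2label)` should work. Whether `id2label` holds readable intent names or generic `LABEL_i` placeholders depends on how the checkpoint was saved.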

	## Framework versions

	- Datasets 3.2.0
	- PyTorch 2.1.2
	- Tokenizers 0.20.3
	- Transformers 4.45.2

	## BibTeX entry and citation info

	```bibtex
	@inproceedings{koudounas2025unlearning,
	title={"Alexa, can you forget me?" Machine Unlearning Benchmark in Spoken Language Understanding},
	author={Koudounas, Alkis and Savelli, Claudio and Giobergia, Flavio and Baralis, Elena},
	booktitle={Proc. Interspeech 2025},
	year={2025},
	}
	```