Model Information

This is the version of meetween/Llama-speechlmm-1.0-l that was fine-tuned for Spoken Language Understanding (SLU).

License: see https://huggingface.co/meetween/Llama-speechlmm-1.0-l/blob/main/LICENSE

Model Architecture

Identical to the base model, except that this model does not include a video adapter.

This model was obtained by fine-tuning the speech adapter and a LoRA on the text decoder. This repository contains the LoRA weights already merged into the main weights.
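
As an illustration of what "merged" means here, the sketch below shows the common way LoRA adapter weights are folded back into base weights with the PEFT library. The paths are placeholders, not the actual artifacts of this model.

```python
# Illustrative only: the standard PEFT recipe for merging LoRA weights into
# the base weights. The paths are placeholders, not this model's artifacts.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")
lora = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# merge_and_unload() folds the low-rank updates into the original weight
# matrices and returns a plain Transformers model without PEFT wrappers.
merged = lora.merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```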

How to use

Identical to the base model.
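
For convenience, a minimal loading sketch is shown below. It assumes the repository ships custom model code (hence trust_remote_code=True) and an audio-capable processor; the base model card (meetween/Llama-speechlmm-1.0-l) remains the authoritative reference for inference.

```python
# Minimal loading sketch. Assumes the Hub repo provides custom model code and
# an audio-capable processor; follow the base model card for actual usage.
from transformers import AutoModel, AutoProcessor

model_id = "meetween/Llama-speechlmm-1.0-l-SLU"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
```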

Training data

The model was fine-tuned on the same datasets used for training the main model.

Amount of training data (hours): 40 (SLURP) + 25 (SpeechMassive) = 65 in total

Evaluation results (%Intent Accuracy)

| Model | SpeechMassive (de) | SpeechMassive (fr) | SLURP (en) |
|---|---|---|---|
| Base model | 84.6 | 86.6 | 78.1 |
| SpeechLMM_v1.0_L_FT | 81.3 | 82.1 | 74.6 |
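
Intent accuracy is the percentage of utterances whose predicted intent label exactly matches the reference label. A minimal sketch of the computation (the labels below are made-up examples, not model outputs):

```python
# Intent accuracy = % of utterances with a correctly predicted intent label.
def intent_accuracy(predictions, references):
    assert len(predictions) == len(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)

# Made-up example labels, not model outputs.
print(intent_accuracy(["set_alarm", "play_music"], ["set_alarm", "weather_query"]))  # 50.0
```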

Framework versions

Transformers 4.45.0

Pytorch 2.3.1+cu124.post2

Datasets 3.2.0

Tokenizers 0.20.0

Compute Infrastructure: see https://www.cyfronet.pl/en/18377,artykul,plgrid_infrastructure.html
