library_name: transformers
tags: []
model_index:
  - name: Llama-speechlmm-1.0-l-SSUM
    results: []

Model Information

This is the version of meetween/Llama-speechlmm-1.0-l that was fine-tuned for Speech Summarization.

License: see LICENSE

Model Architecture

Identical to the base model, except that this model does not include a video adapter.

This model was obtained by fine-tuning the adapter and a LoRA on the decoder. This repository contains the weights with the LoRA merged into the main weights.
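As an illustration of what "merged" means here, the following is a minimal sketch of the standard LoRA merge, W' = W + (alpha / r) * B A, written in plain Python. The matrix shapes, the `alpha` and `rank` values, and the helper names (`matmul`, `merge_lora`, `apply`) are hypothetical and chosen only for the example; they are not taken from this model's training configuration.

```python
# Minimal sketch of merging a LoRA update into a weight matrix.
# All values below are illustrative assumptions, not this model's settings.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: out_features x in_features
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]

def apply(weight, x):
    """Apply a weight matrix to an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

# Toy layer: 3 input features, 2 output features, LoRA rank 1.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # rank x in_features
B = [[0.5], [-1.0]]     # out_features x rank
alpha, rank = 2.0, 1

merged = merge_lora(W, A, B, alpha, rank)

# The merged layer gives the same output as base layer plus scaled LoRA path.
x = [1.0, 1.0, 1.0]
base_out = apply(W, x)
lora_out = apply(B, apply(A, x))
unmerged_out = [b + (alpha / rank) * l for b, l in zip(base_out, lora_out)]
merged_out = apply(merged, x)
```

Because the update is folded into the weights, a merged checkpoint needs no separate LoRA parameters at inference time.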

How to Use

Identical to the base model (meetween/Llama-speechlmm-1.0-l).

Training Data

This model was fine-tuned on the AMI and ICSI speech summarization data, the same data that is part of the base model's training set.

Evaluation Results

ROUGE scores on ICSI (R-1, R-2, R-L):

| Model Name | Topic Segmentation | Summary of Summaries | R-1 | R-2 | R-L |
|---|---|---|---|---|---|
| Cascade (Whisper + Textual summ.) | | | | | |
| Base Model | No | No | 27.6 | 3.8 | 25.3 |
| Base Model | Yes | No | 25.9 | 5.3 | 23.8 |
| Base Model | Yes | Yes | 21.1 | 2.3 | 18.3 |
| meetween/Llama-speechlmm-1.0-l-TSUM | No | No | 31.0 | 4.3 | 27.6 |
| This Model | No | No | 28.9 | 3.8 | 26.0 |
| End-to-end directly from audio | | | | | |
| Base Model | N/A | N/A | 26.6 | 3.5 | 23.9 |
| +LoRA decoder | N/A | N/A | 27.9 | 3.3 | 25.2 |
| +adapter finetune +LoRA decoder (this model) | No | No | 32.1 | 4.1 | 29.1 |
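For readers unfamiliar with the R-1, R-2 and R-L columns, the sketch below shows how ROUGE-N and ROUGE-L are commonly computed from n-gram overlap and the longest common subsequence. It is a simplified illustration under stated assumptions: whitespace tokenization, lowercasing, and no stemming. The card does not say which ROUGE implementation or variant (precision, recall or F1) produced the reported numbers, so this code is not expected to reproduce them. The function names and the example sentences are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def prf(overlap, cand_total, ref_total):
    """Turn an overlap count into (precision, recall, F1)."""
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

def rouge_n(reference, candidate, n=1):
    """ROUGE-N from clipped n-gram overlap (assumed whitespace tokenization)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped counts
    return prf(overlap, sum(cand.values()), sum(ref.values()))

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l(reference, candidate):
    """ROUGE-L from the longest common subsequence of tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    return prf(lcs_length(ref, cand), len(cand), len(ref))

# Hypothetical example pair, not taken from AMI or ICSI.
reference = "the team agreed on the budget"
candidate = "the team discussed the budget"

r1 = rouge_n(reference, candidate, n=1)
r2 = rouge_n(reference, candidate, n=2)
rl = rouge_l(reference, candidate)
```

Each function returns a `(precision, recall, F1)` tuple on a 0 to 1 scale; the table above appears to use a 0 to 100 scale.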

Framework versions

  • Transformers 4.45.0
  • PyTorch 2.3.1+cu124.post2
  • Datasets 3.2.0
  • Tokenizers 0.20.0