SpeechLMM v1
First generation of SpeechLMM models, capable of ingesting video, audio, and text and generating text as output. From the Meetween consortium (meetween.eu).
This is the version of meetween/Llama-speechlmm-1.0-l that was fine-tuned for Text Summarization.
License: see LICENSE
Identical to base model. This model does not include an audio adapter or a video adapter.
This model was obtained by fine-tuning a LoRA on the decoder. This repository contains the weights with the LoRA merged into the main weights.
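To illustrate what "merged" means here, the following is a minimal sketch of folding a LoRA update into a weight matrix. It assumes the standard LoRA formulation, W' = W + (alpha / r) · B · A; the function names, the toy matrices, and the scaling values are hypothetical and are not taken from this repository's training code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the standard LoRA merge.

    weight: (out, in) matrix, lora_b: (out, rank), lora_a: (rank, in).
    All names and shapes are illustrative assumptions.
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x3 weight with a rank-1 adapter.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[0.5, 0.0, 1.0]]   # (rank, in)
B = [[1.0], [2.0]]      # (out, rank)
merged = merge_lora(W, A, B, alpha=2.0, rank=1)
```

After merging, the adapter matrices are no longer needed at inference time, so a merged checkpoint can be loaded the same way as the base model.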
Identical to base model.
This model was fine-tuned on the AMI and ICSI text summarization data that is also part of the base model's training data.
Model Name | Topic Segmentation | Summary of Summaries | ICSI R-1 | ICSI R-2 | ICSI R-L | AutoMin R-1 | AutoMin R-2 | AutoMin R-L |
---|---|---|---|---|---|---|---|---|
Base Model | No | No | 26.2 | 3.2 | 23.3 | 28.3 | 2.9 | 25.9 |
Base Model | Yes | No | 21.9 | 4.8 | 20.2 | 21.3 | 4.7 | 20.4 |
Base Model | Yes | Yes | 22.3 | 2.0 | 18.9 | 22.3 | 2.7 | 20.6 |
+LoRA decoder (this model) | No | No | 32.1 | 4.5 | 24.2 | 31.4 | 4.6 | 29.4 |
+LoRA decoder (this model) | Yes | No | 23.1 | 6.3 | 21.7 | 22.8 | 6.1 | 21.7 |
+LoRA decoder (this model) | Yes | Yes | 27.9 | 3.6 | 25.8 | 27.7 | 3.8 | 25.9 |
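
The R-1, R-2, and R-L columns are ROUGE scores. As a rough illustration of what these metrics measure, here is a minimal sketch of ROUGE-N and ROUGE-L as F1 scores over lowercased, whitespace-split tokens. The tokenization, the lack of stemming, and the use of F1 rather than recall are assumptions; the model card does not state which scorer or settings produced the numbers above, so this sketch will not necessarily reproduce them.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall, 0.0 when there is no overlap."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Hypothetical summaries, for illustration only.
candidate = "the team agreed on a budget"
reference = "the team agreed on the new budget"
r1 = rouge_n(candidate, reference, 1)
r2 = rouge_n(candidate, reference, 2)
rl = rouge_l(candidate, reference)
```

The functions return values between 0 and 1; the table reports scores on a 0 to 100 scale.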