DeepSeek-Meeting-Summary

📌 Model Overview

This is a fine-tuned version of DeepSeek-R1-Distill-Llama-8B, trained with Unsloth and LoRA for meeting summarization and structured insight extraction. The model analyzes meeting transcripts and generates structured summaries in JSON format, extracting key elements such as a summary, topics, actions, problems, and decisions.

🚀 Features

  • 100% valid JSON output on the validation set
  • Trained for long-sequence summarization (up to 16K tokens)
  • Optimized for structured meeting-insight extraction
  • Fine-tuned with LoRA for efficient training
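
The training script itself is not published. As a rough illustration of what an Unsloth + LoRA setup for this model could look like, here is a minimal sketch; the rank, alpha, dropout, and target modules below are assumptions, not the actual hyperparameters.

from unsloth import FastLanguageModel

# Load the base model through Unsloth; 4-bit quantization is a common
# memory-saving choice for LoRA fine-tuning (an assumption here).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=16384,  # matches the 16K-token training context
    load_in_4bit=True,
)

# Attach LoRA adapters; r, lora_alpha, and target_modules are illustrative.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)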

🔥 Performance Metrics

| Metric          | Value  |
|-----------------|--------|
| ROUGE-L         | 0.5217 |
| BERT-F1         | 0.7112 |
| JSON Validity   | 1.0 (100% valid JSON responses) |
| Validation Loss | 1.6732 |
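
The evaluation script is not included in this card, but "JSON Validity" presumably measures the fraction of model responses that parse as JSON. A minimal sketch of such a check (an assumption about the metric, not the authors' actual code):

import json

def json_validity(responses):
    # Fraction of responses that json.loads accepts without error
    valid = 0
    for text in responses:
        try:
            json.loads(text)
            valid += 1
        except json.JSONDecodeError:
            pass
    return valid / len(responses) if responses else 0.0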

🚀 Usage

1️⃣ Install Dependencies

pip install transformers torch

2️⃣ Load the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "UDZH/deepseek-meeting-summary"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
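
Loading an 8B-parameter model in full precision on CPU is slow and memory-hungry. A common variant, assuming a CUDA GPU and the accelerate package (pip install accelerate), is:

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to fit the 8B model in GPU memory
    device_map="auto",          # dispatches weights automatically; needs accelerate
)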

3️⃣ Run Inference

prompt = """
Analyze the following meeting transcript and extract the key points:
1. **Summarization** – a brief summary of the meeting.
2. **Topics** – a list of topics discussed.
3. **Decisions** – key decisions made.
4. **Problems** – challenges or issues identified.
5. **Actions** – planned or taken actions.

Return the output **STRICTLY in the following JSON format**:
{{
  "Summarization": "Brief meeting summary...",
  "Topics": ["Topic 1", "Topic 2"],
  "Actions": ["Action 1", "Action 2"],
  "Problems": ["Problem 1", "Problem 2"],
  "Decisions": ["Decision 1", "Decision 2"]
}}

Meeting transcript (in Russian):
{}

**Return only a valid JSON response in Russian.**
**Do not include explanations, introductions, or extra text.**
**If a category is missing, return an empty array [].**

### Response:
{}
"""

input_text = "Your meeting transcript here"

# Literal braces in the JSON schema above are doubled ({{ }}) so that
# str.format() fills only the two {} placeholders: the transcript and the
# (empty) response slot.
inputs = tokenizer(
    prompt.format(input_text, ""),
    return_tensors="pt",
    truncation=True,
    max_length=16384,
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=500)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print("Generated Summary:", response)

📌 License & Citation

This model is intended for research and production use. If you use it in your projects, please cite this repository.
