MADRS-BERT

MADRS-BERT is a fine-tuned bert-base-german-cased model that predicts depression severity scores (0–6) across individual items of the Montgomery-Åsberg Depression Rating Scale (MADRS). Each prediction is based on transcribed, structured clinician–patient interview segments.

This model was developed to support standardized, scalable mental health assessments in both clinical and low-resource settings.

Model Details

  • Base model: bert-base-german-cased
  • Task: Ordinal regression (scores 0–6)
  • Language: German
  • Input: Text (dialogue segment grouped by MADRS topic)
  • Output: Predicted score for each MADRS item (rounded integer 0–6)
  • Training data: Mix of real and synthetic clinician–patient interviews (MADRS-structured)
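The output described above implies a mapping from the model's raw regression value to an integer item score. A minimal sketch of that mapping in plain Python, mirroring the round-then-clamp step used in the inference code further down. The helper name `to_item_score` is hypothetical and not part of the released model.

```python
def to_item_score(raw_value: float, low: int = 0, high: int = 6) -> int:
    """Round a raw regression output and clamp it to the MADRS item range."""
    rounded = round(raw_value)  # built-in round() returns an int for a float input
    return max(low, min(high, rounded))


print(to_item_score(3.4))   # 3
print(to_item_score(7.2))   # 6 (clamped to the upper bound)
print(to_item_score(-0.8))  # 0 (clamped to the lower bound)
```

Clamping keeps out-of-range regression outputs inside the 0–6 scale instead of discarding them.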

Intended Use

This model is intended for research and development use. It is not a certified medical device. The goal is to:

  • Explore AI-assisted symptom severity assessment
  • Enable structured evaluation of individual MADRS items
  • Support clinicians or researchers working in psychiatry/mental health

🚀 How to Use

Preprocess the data file:

Organize your data like the synthetic example data, with the columns Subject, Speaker, Transcription, Topic, and Score.


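import pandas as pd

def load_and_prepare_conversations(filepath):
    """Return a list of (topic, dialogue) pairs, one per MADRS topic in the file."""
    # Reading .xlsx files with pandas requires the openpyxl package.
    df = pd.read_excel(filepath)
    conversations = []

    # Rows are grouped by Topic only, so a file holding several subjects
    # would have their utterances merged per topic.
    for topic in df['Topic'].dropna().unique():
        topic_df = df[df['Topic'] == topic]

        # Join all non-empty utterances of this topic as "Speaker: text" lines.
        dialogue = "\n".join(
            f"{row['Speaker']}: {row['Transcription']}"
            for _, row in topic_df.iterrows()
            if pd.notnull(row['Transcription'])
        )

        conversations.append((topic, dialogue))
    return conversations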
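The loader above builds one dialogue per topic. As a sketch of the same idea for data that contains more than one subject, the plain-Python example below groups rows by (Subject, Topic) instead. The rows, the topic names, and the helper `group_dialogues` are invented for illustration; only the column names come from the description above.

```python
from collections import defaultdict

# Hypothetical rows in the Subject / Speaker / Transcription / Topic / Score layout.
rows = [
    {"Subject": "S01", "Speaker": "Clinician", "Transcription": "Wie schlafen Sie?", "Topic": "Sleep", "Score": 3},
    {"Subject": "S01", "Speaker": "Patient", "Transcription": "Schlecht, ich wache oft auf.", "Topic": "Sleep", "Score": 3},
    {"Subject": "S01", "Speaker": "Clinician", "Transcription": "Wie ist Ihr Appetit?", "Topic": "Appetite", "Score": 1},
    {"Subject": "S01", "Speaker": "Patient", "Transcription": None, "Topic": "Appetite", "Score": 1},
]


def group_dialogues(rows):
    """Map each (subject, topic) pair to its dialogue as "Speaker: text" lines."""
    grouped = defaultdict(list)
    for row in rows:
        text = row.get("Transcription")
        # Skip rows with a missing utterance or a missing topic label.
        if text is None or row.get("Topic") is None:
            continue
        grouped[(row["Subject"], row["Topic"])].append(f"{row['Speaker']}: {text}")
    return {key: "\n".join(lines) for key, lines in grouped.items()}


for (subject, topic), dialogue in group_dialogues(rows).items():
    print(subject, topic)
    print(dialogue)
```

Keying by subject as well as topic keeps utterances from different interviews apart when they share a file.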

Load model and tokenizer:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "webersamantha/MADRS-BERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval().to("cuda" if torch.cuda.is_available() else "cpu")

Run inference on a full structured interview:

Given the (topic, dialogue) pairs returned by load_and_prepare_conversations, the function below predicts one score per MADRS topic:

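def predict_madrs_scores(conversations, tokenizer, model):
    """Return a dict mapping each topic to its predicted item score (integer 0-6)."""
    device = model.device
    predictions = {}

    for topic, dialogue in conversations:
        # Dialogues longer than 512 tokens are truncated.
        inputs = tokenizer(dialogue, truncation=True, padding="max_length", max_length=512, return_tensors="pt").to(device)
        with torch.no_grad():
            # Round the single regression output and clamp it to the valid range.
            score = torch.round(model(**inputs).logits).clamp(0, 6).item()
        # .item() returns a float; store the score as an integer.
        predictions[topic] = int(score)

    return predictions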

file_path = "example_interview.xlsx"
conversations = load_and_prepare_conversations(file_path)
scores = predict_madrs_scores(conversations, tokenizer, model)
print(scores)

Acknowledgements

The model was trained and released by Samantha Weber within the framework of the Multicast Project on predicting and treating suicidality. The research was conducted as part of efforts to improve AI-driven mental health tools. Thanks to all clinicians and collaborators who contributed to the annotated MADRS dataset.

Evaluation

The model was evaluated on a held-out clinical validation set under two scoring criteria, strict and flexible (±1 deviation tolerance), and achieved strong performance under both. See the publication for full metrics.
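One plausible reading of the two criteria is that strict scoring counts only exact matches with the clinician rating, while flexible scoring also accepts predictions within ±1 of it. The sketch below computes both rates under that assumption. The function name `agreement_rates` and the score lists are made up for illustration; the publication remains the reference for the actual metric definitions and results.

```python
def agreement_rates(predicted, reference, tolerance=1):
    """Return (strict, flexible) agreement between predicted and reference scores.

    strict:   share of items where the prediction equals the reference
    flexible: share of items where the prediction is within `tolerance` points
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one score pair is required")

    n = len(predicted)
    strict = sum(p == r for p, r in zip(predicted, reference)) / n
    flexible = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference)) / n
    return strict, flexible


# Invented example scores, not results from the model.
predicted = [0, 2, 3, 5, 6]
reference = [0, 3, 3, 2, 5]
strict, flexible = agreement_rates(predicted, reference)
print(strict, flexible)  # 0.4 0.8
```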

Citation

If you use this model, please cite:

Weber, S. et al. (2025). "Using a Fine-tuned Large Language Model for Symptom-based Depression Evaluation." Preprint. https://doi.org/10.21203/rs.3.rs-6555767/v1
