# Sakinah-AI: Optimized AraBERT for Arabic Mental Health Question Classification
This repository contains the official fine-tuned model Sakinah-AI-AraBERT-Optimized, one of our submissions to the MentalQA 2025 Shared Task (Track 1).
By: Fatimah Emad Elden & Mumina Abukar
Cairo University & The University of South Wales
## Model Description

This model is a fine-tuned version of `aubmindlab/bert-base-arabertv2` for multi-label classification of Arabic questions related to mental health. It was trained on the AraHealthQA dataset.
Our approach involved a comprehensive hyperparameter search using the Optuna framework to find the optimal configuration. To address class imbalance, the model was trained using a custom Focal Loss function. This optimized fine-tuning approach significantly outperformed its k-fold ensemble counterpart. On the official blind test set, this model achieved a Weighted F1-score of 0.543.
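The exact focal-loss implementation lives in the training script rather than this card; a minimal multi-label focal-loss sketch in PyTorch, assuming per-label sigmoid outputs and using the `focal_alpha` / `focal_gamma` values reported below as defaults, could look like this:

```python
import torch
import torch.nn as nn

class MultiLabelFocalLoss(nn.Module):
    """Illustrative focal loss for multi-label classification (one sigmoid per label)."""

    def __init__(self, alpha: float = 0.97, gamma: float = 1.40):
        super().__init__()
        self.alpha = alpha  # weight given to positive labels
        self.gamma = gamma  # down-weights easy, well-classified examples

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Element-wise binary cross-entropy, kept un-reduced so it can be re-weighted
        bce = nn.functional.binary_cross_entropy_with_logits(
            logits, targets.float(), reduction="none"
        )
        probs = torch.sigmoid(logits)
        # p_t is the model's probability assigned to the true class of each label
        p_t = probs * targets + (1 - probs) * (1 - targets)
        alpha_t = self.alpha * targets + (1 - self.alpha) * (1 - targets)
        loss = alpha_t * (1 - p_t) ** self.gamma * bce
        return loss.mean()
```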
The model predicts one or more of the following labels for a given question:
- A: Diagnosis (Interpreting symptoms)
- B: Treatment (Seeking therapies or medications)
- C: Anatomy and Physiology (Basic medical knowledge)
- D: Epidemiology (Course, prognosis, causes of diseases)
- E: Healthy Lifestyle (Diet, exercise, mood control)
- F: Provider Choices (Recommendations for doctors)
- Z: Other (Does not fit other categories)
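If you need human-readable category names from the single-letter codes, a small helper derived from the list above can be used (the checkpoint's own `id2label` mapping remains the source of truth):

```python
# Convenience mapping from label codes to the category names listed above
LABEL_DESCRIPTIONS = {
    "A": "Diagnosis",
    "B": "Treatment",
    "C": "Anatomy and Physiology",
    "D": "Epidemiology",
    "E": "Healthy Lifestyle",
    "F": "Provider Choices",
    "Z": "Other",
}

def describe(labels):
    """Translate predicted label codes (e.g. ['A', 'B']) into category names."""
    return [LABEL_DESCRIPTIONS.get(code, code) for code in labels]
```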
## How to Use

You can use this model directly with the `transformers` library pipeline for `text-classification`.
```python
from transformers import pipeline

# Load the classification pipeline
classifier = pipeline(
    "text-classification",
    model="FatimahEmadEldin/Sakinah-AI-AraBERT-Optimized",
    return_all_scores=True  # return a score for every label (multi-label output)
)

# Example question in Arabic
question = "ما هي أعراض الاكتئاب وكيف يمكن علاجه؟"
# (Translation: "What are the symptoms of depression and how can it be treated?")

results = classifier(question)

# --- Post-processing to get final labels ---
# The optimal threshold must be determined from your Optuna study results.
# The evaluation script uses a placeholder of 0.45; replace it with the actual
# best_params['base_threshold'] value (the study below reported ~0.204).
threshold = 0.45
predicted_labels = [item['label'] for item in results[0] if item['score'] > threshold]

print(f"Question: {question}")
print(f"Predicted Labels: {predicted_labels}")
# Expected output for this example would likely include 'A' (Diagnosis) and 'B' (Treatment)
```
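If you prefer explicit control over the sigmoid and threshold instead of the pipeline, a minimal sketch with `AutoTokenizer` and `AutoModelForSequenceClassification` could look like the following; it assumes the checkpoint's `id2label` maps indices to the single-letter codes above, which you should verify against the config:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "FatimahEmadEldin/Sakinah-AI-AraBERT-Optimized"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "ما هي أعراض الاكتئاب وكيف يمكن علاجه؟"
inputs = tokenizer(question, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label: apply a sigmoid per label instead of a softmax over labels
probs = torch.sigmoid(logits)[0]
threshold = 0.45  # replace with the tuned base_threshold from the Optuna study
predicted = [
    model.config.id2label[i]        # assumes id2label holds 'A'..'Z'; check the config
    for i, p in enumerate(probs)
    if p.item() > threshold
]
print(predicted)
```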
## Training Procedure
This model was fine-tuned using a rigorous hyperparameter optimization process.
### Hyperparameters

The best hyperparameters were found by Optuna during the training process (`arabert_optmized.py`). You will need to retrieve these values from the output of your Optuna study (`study.best_params`).
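The exact search space is defined in `arabert_optmized.py`; as a rough illustration, an Optuna study over these hyperparameters could be set up as sketched below, where the bounds are hypothetical and `train_and_evaluate` is a stand-in for the actual training loop:

```python
import optuna

def objective(trial):
    # Hypothetical search space; the real bounds live in arabert_optmized.py.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 15),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
        "focal_alpha": trial.suggest_float("focal_alpha", 0.25, 1.0),
        "focal_gamma": trial.suggest_float("focal_gamma", 0.5, 3.0),
        "base_threshold": trial.suggest_float("base_threshold", 0.1, 0.6),
    }
    # In the real script this trains AraBERT with `params` and returns the
    # validation weighted F1; train_and_evaluate is a hypothetical stand-in.
    return train_and_evaluate(params)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)

print(study.best_value)    # best trial F1 score, as reported below
print(study.best_params)   # hyperparameter values, as reported below
```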
### Optimization Results

| Metric | Value |
|---|---|
| Best trial F1 Score | 0.6307 |
### Best Hyperparameters Found

| Hyperparameter | Value |
|---|---|
| learning_rate | 5.273957732715589e-05 |
| num_train_epochs | 13 |
| weight_decay | 0.04131058607286182 |
| focal_alpha | 0.9702303056621574 |
| focal_gamma | 1.39543909126709 |
| base_threshold | 0.20408644287720523 |
### Frameworks
- PyTorch
- Hugging Face Transformers
- Optuna
## Evaluation Results
The model was evaluated on the blind test set provided by the MentalQA organizers.
### Final Test Set Scores

| Metric | Score |
|---|---|
| Weighted F1-Score | 0.543 |
### Per-Label Performance (Test Set)

*Note: The following is a placeholder. To generate the actual report, run the `arabert_evaluate.py` script with your final model and the official test data.*
```text
--- Per-Label Performance (Test Set) ---
              precision    recall  f1-score   support

           A       0.65      0.81      0.72        84
           B       0.60      0.75      0.67        85
           C       0.00      0.00      0.00        10
           D       0.37      0.21      0.26        34
           E       0.41      0.37      0.39        38
           F       0.00      0.00      0.00         6
           Z       0.00      0.00      0.00         3

   micro avg       0.58      0.59      0.58       260
   macro avg       0.29      0.31      0.29       260
weighted avg       0.51      0.59      0.54       260
 samples avg       0.65      0.65      0.60       260
```
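To reproduce a report in this format, a sketch along these lines with scikit-learn can be used, assuming `y_true` and `y_pred` are binary indicator matrices aligned with the label order A–Z (the official `arabert_evaluate.py` script remains the reference):

```python
import numpy as np
from sklearn.metrics import classification_report, f1_score

LABELS = ["A", "B", "C", "D", "E", "F", "Z"]

# y_true / y_pred: (n_samples, n_labels) binary indicator matrices,
# e.g. obtained by thresholding the model's sigmoid scores. Toy values shown here.
y_true = np.array([[1, 1, 0, 0, 0, 0, 0],
                   [0, 0, 0, 1, 0, 0, 0]])
y_pred = np.array([[1, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 1, 0, 0, 0]])

print(classification_report(y_true, y_pred, target_names=LABELS, zero_division=0))
print("Weighted F1:", f1_score(y_true, y_pred, average="weighted", zero_division=0))
```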
## Citation
If you use our work, please cite our paper:
```bibtex
@inproceedings{elden2025sakinahai,
  title={{Sakinah-AI at MentalQA: A Comparative Study of Few-Shot, Optimized, and Ensemble Methods for Arabic Mental Health Question Classification}},
  author={Elden, Fatimah Emad and Abukar, Mumina},
  year={2025},
  booktitle={Proceedings of the MentalQA 2025 Shared Task},
  eprint={25XX.XXXXX},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```