---
license: apache-2.0
language:
- es
- sk
tags:
- audio
- parkinsons
- speech
- health
- classification
- contrastive-learning
- attention
- adain
- wavelets
- self-supervised
datasets:
- ewa-db
- pc-gita
model-index:
- name: BDHPD
  results:
  - task:
      type: audio-classification
      name: Parkinson's Disease Detection
    dataset:
      name: EWA-DB (Slovak)
      type: ewa-db
    metrics:
    - type: f1
      value: 69.03
    - type: accuracy
      value: 84.72
    - type: sensitivity
      value: 56.52
    - type: specificity
      value: 88.56
  - task:
      type: audio-classification
      name: Parkinson's Disease Detection
    dataset:
      name: PC-GITA (Spanish)
      type: pc-gita
    metrics:
    - type: f1
      value: 90.83
    - type: accuracy
      value: 90.83
    - type: sensitivity
      value: 93.33
    - type: specificity
      value: 88.33
---

# BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech

This model implements **BDHPD**, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.

## Model Description

BDHPD combines several modern audio processing techniques:

- **Self-supervised learning (SSL)** speech encoders such as `microsoft/wavlm-base`
- **Wavelet-based spectrogram features**
- **Adaptive Instance Normalization (AdaIN)** for domain adaptation
- **Convolutional bottleneck layers** for feature recalibration
- **Dual-head classification architecture** to handle different speech types (e.g., diadochokinetic and continuous speech)
- **Contrastive learning** for embedding space refinement
- **Attention pooling** to summarize frame sequences into a single utterance-level representation
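
To make three of the components above more concrete, here is a minimal PyTorch sketch of AdaIN, attention pooling, and a dual-head classifier. It is an illustration under stated assumptions, not the BDHPD implementation: the names `adain`, `AttentionPooling`, and `DualHeadClassifier`, the routing by a `task_id` tensor, and the choice to take AdaIN statistics over the time axis are all hypothetical. The only detail tied to the card is the feature size of 768, which matches the hidden size of `microsoft/wavlm-base`.

```python
import torch
import torch.nn as nn


def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Adaptive instance normalization over the time axis (assumed layout).

    Both inputs are (batch, time, channels). Each channel of `content` is
    normalized per utterance, then rescaled with the per-utterance channel
    statistics of `style`.
    """
    c_mean = content.mean(dim=1, keepdim=True)
    c_std = content.std(dim=1, keepdim=True, unbiased=False) + eps
    s_mean = style.mean(dim=1, keepdim=True)
    s_std = style.std(dim=1, keepdim=True, unbiased=False) + eps
    return (content - c_mean) / c_std * s_std + s_mean


class AttentionPooling(nn.Module):
    """Collapse (batch, time, dim) frames into one (batch, dim) vector."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar relevance score per frame

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.score(x), dim=1)  # (batch, time, 1), sums to 1 over time
        return (weights * x).sum(dim=1)


class DualHeadClassifier(nn.Module):
    """Shared attention pooling followed by one binary head per speech type."""

    def __init__(self, dim: int):
        super().__init__()
        self.pool = AttentionPooling(dim)
        self.heads = nn.ModuleList([nn.Linear(dim, 1), nn.Linear(dim, 1)])

    def forward(self, frames: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        pooled = self.pool(frames)  # (batch, dim)
        logits = torch.cat([head(pooled) for head in self.heads], dim=1)  # (batch, 2)
        # For each utterance, keep the logit of the head matching its speech type.
        return logits.gather(1, task_id.view(-1, 1)).squeeze(1)


torch.manual_seed(0)
frames = torch.randn(4, 50, 768)                  # stand-in for SSL frame embeddings
reference = torch.randn(4, 30, 768) * 2.0 + 1.0   # stand-in for target-domain features
adapted = adain(frames, reference)

task_id = torch.tensor([0, 1, 0, 1])  # hypothetical: 0 = diadochokinetic, 1 = continuous
model = DualHeadClassifier(dim=768)
logits = model(adapted, task_id)      # one PD logit per utterance
```

Routing each utterance to a single head keeps the shared encoder and pooling common to both speech types while letting the decision boundary differ per task, which is one plausible reading of "dual-head" here.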

The architecture supports bilingual inputs and has been evaluated on **EWA-DB** (Slovak) and **PC-GITA** (Spanish).
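
The card metadata reports F1, accuracy, sensitivity, and specificity for both datasets. As a reminder of how these quantities relate, the sketch below computes them from confusion-matrix counts, treating PD as the positive class. The function name `binary_metrics` and the example counts are hypothetical and are not the model's actual results. The card does not state whether the reported F1 is binary or macro-averaged, so the sketch returns both.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return percentage metrics (rounded to 2 decimals), PD = positive class.

    Assumes every denominator is non-zero, i.e. both classes are present
    and at least one prediction falls in each class.
    """
    sensitivity = 100 * tp / (tp + fn)            # recall on PD speakers
    specificity = 100 * tn / (tn + fp)            # recall on healthy controls
    accuracy = 100 * (tp + tn) / (tp + fp + tn + fn)
    f1_pos = 100 * 2 * tp / (2 * tp + fp + fn)    # F1 with PD as positive class
    f1_neg = 100 * 2 * tn / (2 * tn + fn + fp)    # F1 with healthy as positive class
    return {
        "sensitivity": round(sensitivity, 2),
        "specificity": round(specificity, 2),
        "accuracy": round(accuracy, 2),
        "f1_binary": round(f1_pos, 2),
        "f1_macro": round((f1_pos + f1_neg) / 2, 2),
    }


# Hypothetical counts for 50 PD and 50 healthy speakers.
example = binary_metrics(tp=45, fp=10, tn=40, fn=5)
```

On a class-balanced test set, accuracy equals the mean of sensitivity and specificity; on an imbalanced one it is pulled toward the metric of the majority class, which is why sensitivity and specificity are worth reading alongside accuracy.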

## Intended Use

- **Research** in pathological speech detection
- **Benchmarking** bilingual speech-based PD detection models
- **Development** of real-world diagnostic support tools in healthcare

## Training

Training was performed using:

- AdamW optimizer
- Linear learning rate scheduling with warmup
- Binary cross-entropy loss for classification
- Contrastive loss via `pytorch-metric-learning`
- 20 epochs with early stopping
- Balanced batch sampling for both datasets
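
The recipe above can be sketched in plain PyTorch as follows. This is a toy illustration, not the authors' training script: the tiny `encoder` and `head`, the learning rate, weight decay, warmup length, margin, and the 0.1 loss weight are all hypothetical values chosen only to make the example run. The contrastive term is a hand-written stand-in so the block has no extra dependencies; the actual training uses a loss from `pytorch-metric-learning`, and the card does not say which one. Early stopping is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR


def linear_warmup_decay(warmup_steps: int, total_steps: int):
    """LR multiplier: ramps 0 -> 1 during warmup, then decays linearly to 0."""
    def factor(step: int) -> float:
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return factor


def pairwise_contrastive_loss(emb: torch.Tensor, labels: torch.Tensor,
                              margin: float = 0.5) -> torch.Tensor:
    """Stand-in contrastive loss on cosine similarity (not the library's loss).

    Pulls same-class pairs toward similarity 1 and pushes different-class
    pairs below `margin`. Needs at least one pair in the batch.
    """
    emb = F.normalize(emb, dim=1)
    sim = emb @ emb.T                                   # (batch, batch) cosine similarities
    same = labels.view(-1, 1) == labels.view(1, -1)
    off_diag = ~torch.eye(len(labels), dtype=torch.bool, device=emb.device)
    pos = 1.0 - sim[same & off_diag]                    # same-class pairs, self-pairs excluded
    neg = F.relu(sim[~same] - margin)                   # different-class pairs
    return torch.cat([pos, neg]).mean()


torch.manual_seed(0)
encoder = nn.Sequential(nn.Linear(16, 8), nn.ReLU())    # toy stand-in for the real model
head = nn.Linear(8, 1)
params = list(encoder.parameters()) + list(head.parameters())

total_steps, warmup_steps = 20, 4                        # hypothetical schedule
optimizer = AdamW(params, lr=1e-3, weight_decay=0.01)    # hypothetical hyperparameters
scheduler = LambdaLR(optimizer, linear_warmup_decay(warmup_steps, total_steps))
bce = nn.BCEWithLogitsLoss()                             # binary cross-entropy on logits

x = torch.randn(8, 16)
y = torch.tensor([0, 1, 0, 1, 0, 1, 0, 1])               # class-balanced toy batch

losses, lrs = [], []
for step in range(total_steps):
    emb = encoder(x)
    logits = head(emb).squeeze(1)
    loss = bce(logits, y.float()) + 0.1 * pairwise_contrastive_loss(emb, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
    losses.append(loss.item())
    lrs.append(scheduler.get_last_lr()[0])
```

With these toy settings the learning rate climbs to its peak after the fourth scheduler step and reaches zero at the last one, which is the shape a linear schedule with warmup is meant to produce.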

## How to Use

Usage instructions and further details are available in the GitHub repository: [BDHPD GitHub](https://github.com/MorenoLaQuatra/BDHPD)

## Datasets

- [**EWA-DB**](https://zenodo.org/records/10952480): Slovak pathological and healthy speech
- [**PC-GITA**](https://aclanthology.org/L14-1549/): Spanish pathological speech

## Limitations

- The model is trained only on Slovak and Spanish speech; generalization to other languages is untested.
- The model is sensitive to audio quality; preprocess recordings with voice activity detection (VAD) and dereverberation before inference.
- It should not be used as a standalone diagnostic tool.
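
As a rough illustration of the VAD step mentioned above, the sketch below drops low-energy frames with a simple threshold relative to the loudest frame. It is a crude stand-in, not the preprocessing used for BDHPD: the function name `energy_vad`, the 25 ms frame length, and the -35 dB threshold are assumptions, and a dedicated VAD model will usually behave better on real recordings. Dereverberation is not covered here.

```python
import numpy as np


def energy_vad(wave, sr, frame_ms=25.0, threshold_db=-35.0):
    """Return (mask, frames) for non-overlapping frames of a mono waveform.

    `mask[i]` is True when frame i is louder than `threshold_db` relative
    to the loudest frame. Trailing samples that do not fill a frame are dropped.
    """
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(wave) // frame_len
    frames = wave[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12   # avoid log of zero
    level_db = 20.0 * np.log10(rms / rms.max())
    return level_db > threshold_db, frames


# Synthetic test signal: 0.5 s near-silence, 1 s tone, 0.5 s near-silence.
sr = 16000
rng = np.random.default_rng(0)
wave = 1e-4 * rng.standard_normal(2 * sr)                 # low-level background noise
t = np.arange(sr) / sr
wave[sr // 2 : sr // 2 + sr] += 0.5 * np.sin(2 * np.pi * 220.0 * t)

mask, frames = energy_vad(wave, sr)
speech = frames[mask].reshape(-1)                         # concatenated active frames
```

A relative threshold adapts to the recording level but assumes each file contains some speech; a file that is entirely silent would have its noise floor treated as the loudest frame.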

## Citation

If you use this model or find this research useful, please cite the following paper:

```bibtex
@inproceedings{laquatra2025bilingual,
  title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
  author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
  booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2025}
}
```