---
license: apache-2.0
language:
- es
- sk
tags:
- audio
- parkinsons
- speech
- health
- classification
- contrastive-learning
- attention
- adain
- wavelets
- self-supervised
datasets:
- ewa-db
- pc-gita
model-index:
- name: BDHPD
  results:
  - task:
      type: audio-classification
      name: Parkinson's Disease Detection
    dataset:
      name: EWA-DB (Slovak)
      type: ewa-db
    metrics:
    - type: f1
      value: 69.03
    - type: accuracy
      value: 84.72
    - type: sensitivity
      value: 56.52
    - type: specificity
      value: 88.56
  - task:
      type: audio-classification
      name: Parkinson's Disease Detection
    dataset:
      name: PC-GITA (Spanish)
      type: pc-gita
    metrics:
    - type: f1
      value: 90.83
    - type: accuracy
      value: 90.83
    - type: sensitivity
      value: 93.33
    - type: specificity
      value: 88.33
---
# BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech
This model implements **BDHPD**, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.
## Model Description
BDHPD combines several modern audio processing techniques:
- **Self-supervised learning (SSL)** with models like `microsoft/wavlm-base`
- **Wavelet-based spectrogram features**
- **Adaptive Instance Normalization (AdaIN)** for domain adaptation
- **Convolutional Bottleneck Layers** for feature recalibration
- **Dual-head classification architecture** to handle different speech types (e.g., diadochokinetic tasks and continuous speech)
- **Contrastive learning** for embedding space refinement
- **Attention pooling** for better sequence summarization
The architecture supports bilingual inputs and has been evaluated on **EWA-DB** (Slovak) and **PC-GITA** (Spanish).
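Two of the building blocks listed above, attention pooling and AdaIN, can be illustrated with a minimal, framework-free sketch. This is not the BDHPD implementation: the function names, the fixed `query` vector (learned in a real model), and the toy inputs are all assumptions made for illustration, and how BDHPD wires these components together is documented only in the GitHub repository.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frames, query):
    """Collapse a (time x dim) sequence into a single dim-sized vector.

    frames: list of per-frame feature vectors (e.g. SSL encoder outputs).
    query:  scoring vector; a learned parameter in practice, fixed here.
    """
    # One relevance score per frame: dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted average of the frames, using the attention weights.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights

def adain(content, style, eps=1e-5):
    """Generic AdaIN on one channel: give `content` the mean/std of `style`."""
    def mean_std(x):
        mu = sum(x) / len(x)
        var = sum((v - mu) ** 2 for v in x) / len(x)
        return mu, math.sqrt(var + eps)
    mu_c, std_c = mean_std(content)
    mu_s, std_s = mean_std(style)
    return [std_s * (v - mu_c) / std_c + mu_s for v in content]

# Toy inputs (hypothetical): three 2-dimensional frames and one channel.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(frames, query=[2.0, 0.0])
shifted = adain([1.0, 2.0, 3.0], style=[10.0, 20.0, 30.0])
```

Frames that align with the query receive larger weights, so the pooled vector emphasizes them instead of averaging all frames uniformly. The AdaIN output keeps the shape of `content` while taking on the statistics of `style`.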
## Intended Use
- **Research** in pathological speech detection
- **Benchmarking** bilingual speech-based PD detection models
- **Development** of real-world diagnostic support tools in healthcare
## Training
Training was performed using:
- AdamW optimizer
- Linear learning rate scheduling with warmup
- Binary cross-entropy loss for classification
- Contrastive loss via `pytorch-metric-learning`
- Up to 20 epochs, with early stopping
- Balanced batch sampling for both datasets
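Two items from the list above, the linear schedule with warmup and balanced batch sampling, are sketched below in plain Python. This is an illustration under assumptions, not the training code: the step counts, the base learning rate, the group names, and the choice to sample with replacement are hypothetical, and the grouping key could be the class label or the source dataset, since the list does not say which.

```python
import random

def linear_warmup_factor(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to 1 during warmup.
        return step / max(1, warmup_steps)
    # Then decay linearly from 1 down to 0 at total_steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

def balanced_batch(indices_by_group, batch_size, rng):
    """Draw a batch with an equal number of examples from each group."""
    per_group = batch_size // len(indices_by_group)
    batch = []
    for indices in indices_by_group.values():
        # Sample with replacement so a small group can still fill its share.
        batch.extend(rng.choices(indices, k=per_group))
    rng.shuffle(batch)
    return batch

base_lr = 1e-4  # hypothetical value, not taken from the model card
schedule = [base_lr * linear_warmup_factor(s, warmup_steps=10, total_steps=100)
            for s in range(101)]

rng = random.Random(0)
# Hypothetical imbalanced dataset: 80 examples in one group, 20 in the other.
indices_by_group = {"group_a": list(range(0, 80)), "group_b": list(range(80, 100))}
batch = balanced_batch(indices_by_group, batch_size=8, rng=rng)
```

With these settings the learning rate peaks at step 10 and reaches zero at step 100, and every batch holds four examples from each group regardless of how imbalanced the underlying data is.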
## How to Use
You can find all information on the GitHub repository: [BDHPD GitHub](https://github.com/MorenoLaQuatra/BDHPD)
## Datasets
- [**EWA-DB**](https://zenodo.org/records/10952480): Slovak pathological and healthy speech
- [**PC-GITA**](https://aclanthology.org/L14-1549/): Spanish pathological speech
## Limitations
- The model is trained only on Slovak and Spanish speech; generalization to other languages is untested.
- The model is sensitive to audio quality; ensure audio is preprocessed with proper voice activity detection (VAD) and dereverberation.
- Should not be used as a standalone diagnostic tool.
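To make the VAD point above concrete, here is a deliberately crude energy-threshold sketch. It is a stand-in for illustration only and not the preprocessing used for this model: the frame length, the threshold, and the function names are assumptions, and a production pipeline would normally rely on a dedicated VAD model and a separate dereverberation step.

```python
import math

def frame_rms(samples, frame_len):
    """Root-mean-square energy of consecutive, non-overlapping frames."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def energy_vad(samples, frame_len=400, threshold=0.02):
    """Keep only frames whose RMS energy exceeds a fixed threshold.

    frame_len=400 is 25 ms at 16 kHz; both defaults are illustrative.
    """
    kept = []
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        if rms > threshold:
            kept.extend(samples[idx * frame_len:(idx + 1) * frame_len])
    return kept

# Synthetic check: 0.1 s of silence, 0.1 s of a 220 Hz tone, 0.1 s of silence.
sr = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(1600)]
voiced = energy_vad(silence + tone + silence)
```

On this synthetic signal only the tone segment survives. Real recordings with background noise or reverberation need a more robust detector than a fixed threshold.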
## Citation
If you use this model or find this research useful, please cite the following paper:
```bibtex
@inproceedings{laquatra2025bilingual,
title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2025}
}
```