metadata
license: apache-2.0
language:
  - es
  - sk
tags:
  - audio
  - parkinsons
  - speech
  - health
  - classification
  - contrastive-learning
  - attention
  - adain
  - wavelets
  - self-supervised
datasets:
  - ewa-db
  - pc-gita
model-index:
  - name: BDHPD
    results:
      - task:
          type: audio-classification
          name: Parkinson's Disease Detection
        dataset:
          name: EWA-DB (Slovak)
          type: ewa-db
        metrics:
          - type: f1
            value: 69.03
          - type: accuracy
            value: 84.72
          - type: sensitivity
            value: 56.52
          - type: specificity
            value: 88.56
      - task:
          type: audio-classification
          name: Parkinson's Disease Detection
        dataset:
          name: PC-GITA (Spanish)
          type: pc-gita
        metrics:
          - type: f1
            value: 90.83
          - type: accuracy
            value: 90.83
          - type: sensitivity
            value: 93.33
          - type: specificity
            value: 88.33

BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech

This model implements BDHPD, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.

Model Description

BDHPD combines several modern audio processing techniques:

  • Self-supervised learning (SSL) with models like microsoft/wavlm-base
  • Wavelet-based spectrogram features
  • Adaptive Instance Normalization (AdaIN) for domain adaptation
  • Convolutional Bottleneck Layers for feature recalibration
  • Dual-head classification architecture to handle different speech types (e.g., diadochokinetic tasks and continuous speech)
  • Contrastive learning for embedding space refinement
  • Attention pooling for better sequence summarization
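Two of the components listed above, AdaIN and attention pooling, are compact enough to sketch without a deep learning framework. The snippet below is a minimal illustration in plain Python, not the BDHPD implementation: the function names `adain` and `attention_pool` are hypothetical, AdaIN is shown for a single feature channel, and the attention scores are passed in directly, whereas a trained model would produce them with a learned layer.

```python
import math


def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization for one feature channel.

    Normalizes `content` to zero mean and unit variance, then rescales it
    to the mean and standard deviation of `style`.
    """
    def stats(values):
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        return mean, math.sqrt(var + eps)

    c_mean, c_std = stats(content)
    s_mean, s_std = stats(style)
    return [s_std * (v - c_mean) / c_std + s_mean for v in content]


def attention_pool(frames, scores):
    """Summarize T frame vectors as a softmax-weighted average.

    `frames` is a list of T equal-length vectors and `scores` a list of
    T scalars (assumed here; a model would compute them from the frames).
    """
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
```

With equal scores, `attention_pool` reduces to plain mean pooling; larger scores shift the summary toward the corresponding frames.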

The architecture supports bilingual inputs and has been evaluated on EWA-DB (Slovak) and PC-GITA (Spanish).
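For readers comparing against the reported metric types (F1, accuracy, sensitivity, specificity), the sketch below shows how each one follows from a binary confusion matrix, with PD as the positive class. It is an illustration only: the helper name `binary_metrics` and the counts in the usage note are hypothetical, and this card does not state whether the reported F1 is the positive-class or the macro-averaged variant, so both are computed.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class is present and
    each class is predicted at least once).
    """
    sensitivity = tp / (tp + fn)          # recall on the PD class
    specificity = tn / (tn + fp)          # recall on the healthy class
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1_positive = 2 * tp / (2 * tp + fp + fn)
    f1_negative = 2 * tn / (2 * tn + fn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1_positive": f1_positive,
        "f1_macro": (f1_positive + f1_negative) / 2,
    }
```

For example, `binary_metrics(tp=40, fp=5, tn=45, fn=10)` uses made-up counts and gives a sensitivity of 0.8, a specificity of 0.9, and an accuracy of 0.85.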

Intended Use

  • Research in pathological speech detection
  • Benchmarking bilingual speech-based PD detection models
  • Development of real-world diagnostic support tools in healthcare

Training

Training was performed using:

  • AdamW optimizer
  • Linear learning rate scheduling with warmup
  • Binary cross-entropy loss for classification
  • Contrastive loss via pytorch-metric-learning
  • 20 epochs with early stopping
  • Balanced batch sampling for both datasets
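As a rough illustration of two items in this list, the sketch below implements a linear learning rate schedule with warmup and the binary cross-entropy loss for a single prediction. The function names, step counts, and base learning rate are assumptions made for the example; the hyperparameters actually used for BDHPD are not given in this card.

```python
import math


def linear_schedule_with_warmup(step, warmup_steps, total_steps, base_lr):
    """Learning rate at `step`: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup: grow linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: shrink linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def binary_cross_entropy(prob, label, eps=1e-7):
    """BCE for one predicted probability `prob` and a 0/1 `label`."""
    p = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))
```

With a hypothetical 10 warmup steps out of 100 total, the rate peaks at `base_lr` on step 10 and returns to zero on step 100.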

How to Use

Usage instructions and code are available in the BDHPD GitHub repository.

Datasets

  • EWA-DB: Slovak pathological and healthy speech
  • PC-GITA: Spanish pathological and healthy speech

Limitations

  • The model is trained only on Slovak and Spanish speech; generalization to other languages is untested.
  • Sensitive to audio quality: ensure audio is preprocessed with appropriate voice activity detection (VAD) and dereverberation.
  • Should not be used as a standalone diagnostic tool.

Citation

If you use this model or find this research useful, please cite the following paper:

@inproceedings{laquatra2025bilingual,
  title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
  author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
  booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2025}
}