bdhpd-ewa-pcgita / README.md

MorenoLaQuatra

metadata

license: apache-2.0
language:
  - es
  - sk
tags:
  - audio
  - parkinsons
  - speech
  - health
  - classification
  - contrastive-learning
  - attention
  - adain
  - wavelets
  - self-supervised
datasets:
  - ewa-db
  - pc-gita
model-index:
  - name: BDHPD
    results:
      - task:
          type: audio-classification
          name: Parkinson's Disease Detection
        dataset:
          name: EWA-DB (Slovak)
          type: ewa-db
        metrics:
          - type: f1
            value: 69.03
          - type: accuracy
            value: 84.72
          - type: sensitivity
            value: 56.52
          - type: specificity
            value: 88.56
      - task:
          type: audio-classification
          name: Parkinson's Disease Detection
        dataset:
          name: PC-GITA (Spanish)
          type: pc-gita
        metrics:
          - type: f1
            value: 90.83
          - type: accuracy
            value: 90.83
          - type: sensitivity
            value: 93.33
          - type: specificity
            value: 88.33

BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech

This model implements BDHPD, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.

Model Description

BDHPD combines several modern audio processing techniques:

Self-supervised learning (SSL) with models like microsoft/wavlm-base
Wavelet-based spectrogram features
Adaptive Instance Normalization (AdaIN) for domain adaptation
Convolutional Bottleneck Layers for feature recalibration
Dual-head classification architecture to handle different speech types (e.g., diadochokinetic tasks and continuous speech)
Contrastive learning for embedding space refinement
Attention pooling for better sequence summarization
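Two of the components listed above, AdaIN and attention pooling, can be illustrated with a minimal NumPy sketch. This is not the BDHPD implementation: the array names (`content_frames`, `style_frames`), the shapes, and the random scoring vector are assumptions made for illustration, and this card does not state which features play the "content" and "style" roles in the actual model. The AdaIN formula is the standard one, which re-normalizes content features to the mean and standard deviation of the style features.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization along the time axis.

    content, style: arrays of shape (time, features).
    Each content feature is normalized to zero mean and unit variance over
    time, then rescaled and shifted to match the style statistics.
    """
    c_mean, c_std = content.mean(axis=0), content.std(axis=0) + eps
    s_mean, s_std = style.mean(axis=0), style.std(axis=0) + eps
    return s_std * (content - c_mean) / c_std + s_mean

def attention_pool(frames, w):
    """Summarize a (time, features) sequence as one feature vector.

    w is a (features,) scoring vector; in a trained model it would be
    learned, here it is random (assumption for illustration).
    """
    scores = frames @ w
    scores = scores - scores.max()          # numerical stability for softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ frames, weights

rng = np.random.default_rng(0)
content_frames = rng.normal(size=(50, 8))                    # hypothetical frame features
style_frames = rng.normal(loc=3.0, scale=2.0, size=(50, 8))  # hypothetical reference features

aligned = adain(content_frames, style_frames)
pooled, weights = attention_pool(aligned, rng.normal(size=8))
print(pooled.shape, weights.sum())
```

After `adain`, the per-feature mean and standard deviation of `aligned` match those of `style_frames`, and `attention_pool` returns a weighted average of the frames in which the weights sum to one.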

The architecture supports bilingual inputs and has been evaluated on EWA-DB (Slovak) and PC-GITA (Spanish).
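For readers comparing the reported scores, the sketch below shows how sensitivity, specificity, accuracy, and F1 relate to confusion-matrix counts. The counts are hypothetical, chosen only so that the resulting rates resemble those reported for PC-GITA; they are not the actual confusion matrix. This card does not say which F1 averaging is used, so both the positive-class and the macro-averaged variants are computed.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute common binary-classification metrics from raw counts.

    tp/fn: PD recordings classified correctly / incorrectly.
    tn/fp: healthy-control recordings classified correctly / incorrectly.
    Assumes each class has at least one recording.
    """
    sensitivity = tp / (tp + fn)                 # recall on the PD class
    specificity = tn / (tn + fp)                 # recall on the healthy class
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1_pos = 2 * tp / (2 * tp + fp + fn)         # F1 with PD as positive
    f1_neg = 2 * tn / (2 * tn + fn + fp)         # F1 with healthy as positive
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1_positive": f1_pos,
        "f1_macro": (f1_pos + f1_neg) / 2,
    }

# Hypothetical counts (assumption): 60 PD and 60 healthy recordings.
metrics = binary_metrics(tp=56, fn=4, tn=53, fp=7)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}")
```

When the two classes are the same size, accuracy equals the mean of sensitivity and specificity. The reported PC-GITA figures satisfy this relation: (93.33 + 88.33) / 2 = 90.83.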

Intended Use

Research in pathological speech detection
Benchmarking bilingual speech-based PD detection models
Development of real-world diagnostic support tools in healthcare

Training

Training was performed using:

AdamW optimizer
Linear learning rate scheduling with warmup
Binary cross-entropy loss for classification
Contrastive loss via pytorch-metric-learning
20 epochs with early stopping
Balanced batch sampling for both datasets
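Three items from the list above (linear scheduling with warmup, binary cross-entropy, and early stopping) can be sketched in plain Python. The step counts, patience value, and validation scores are hypothetical; the actual hyperparameters are not given in this card, and the real training code lives in the GitHub repository.

```python
import math

def linear_warmup_decay(step, warmup_steps, total_steps):
    """Learning-rate multiplier: ramps 0 -> 1 during warmup, then decays linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

def binary_cross_entropy(prob, label, eps=1e-7):
    """BCE for one prediction; prob is the predicted PD probability, label is 0 or 1."""
    p = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

class EarlyStopping:
    """Signal a stop once the validation score has not improved for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical validation scores for a run capped at 20 epochs.
val_scores = [0.60, 0.65, 0.70, 0.69, 0.71, 0.70, 0.70, 0.69, 0.72, 0.73]
stopper = EarlyStopping(patience=3)
stopped_epoch = None
for epoch, score in enumerate(val_scores[:20], start=1):
    if stopper.step(score):
        stopped_epoch = epoch
        break

print("stopped at epoch", stopped_epoch, "best score", stopper.best)
print("lr multiplier mid-warmup:", linear_warmup_decay(50, 100, 1000))
```

In a PyTorch training loop the multiplier would typically be passed to a scheduler and applied to the AdamW base learning rate at each optimization step; that wiring is omitted here to keep the sketch dependency-free.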

How to Use

All usage information is available in the BDHPD GitHub repository.

Datasets

EWA-DB: Slovak pathological and healthy speech
PC-GITA: Spanish pathological and healthy speech

Limitations

The model is trained only on Slovak and Spanish speakers; generalization to other languages is untested.
The model is sensitive to audio quality; preprocess recordings with voice activity detection (VAD) and dereverberation before inference.
Should not be used as a standalone diagnostic tool.
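As an illustration of the VAD step mentioned in the limitations above, here is a minimal energy-based voice activity detector in NumPy. It is a simple stand-in rather than the preprocessing used by the authors: the frame sizes, the `threshold_db` value, and the synthetic test signal are assumptions, and dereverberation is not covered.

```python
import numpy as np

def energy_vad(signal, sample_rate, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Return a boolean mask with one entry per frame (True = likely speech).

    A frame counts as active when its RMS level, relative to the peak
    amplitude of the whole recording, exceeds threshold_db.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    peak = np.max(np.abs(signal)) + 1e-12
    mask = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        level_db = 20 * np.log10(rms / peak + 1e-12)
        mask[i] = level_db > threshold_db
    return mask

# Synthetic example: 0.5 s of faint noise followed by 0.5 s of a 220 Hz tone.
sr = 16000
rng = np.random.default_rng(0)
silence = 1e-4 * rng.normal(size=sr // 2)
t = np.arange(sr // 2) / sr
tone = 0.5 * np.sin(2 * np.pi * 220 * t)
audio = np.concatenate([silence, tone])

mask = energy_vad(audio, sr)
print(f"{mask.sum()} of {len(mask)} frames marked as active")
```

Frames flagged `False` can be dropped before feature extraction. A fixed energy threshold is fragile on noisy recordings, so a trained VAD model is usually the safer choice in practice.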

Citation

If you use this model or find this research useful, please cite the following paper:

@inproceedings{laquatra2025bilingual,
  title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
  author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
  booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2025}
}