	---
	license: apache-2.0
	language:
	- es
	- sk
	tags:
	- audio
	- parkinsons
	- speech
	- health
	- classification
	- contrastive-learning
	- attention
	- adain
	- wavelets
	- self-supervised
	datasets:
	- ewa-db
	- pc-gita
	model-index:
	- name: BDHPD
	  results:
	  - task:
	      type: audio-classification
	      name: Parkinson's Disease Detection
	    dataset:
	      name: EWA-DB (Slovak)
	      type: ewa-db
	    metrics:
	    - type: f1
	      value: 69.03
	    - type: accuracy
	      value: 84.72
	    - type: sensitivity
	      value: 56.52
	    - type: specificity
	      value: 88.56
	  - task:
	      type: audio-classification
	      name: Parkinson's Disease Detection
	    dataset:
	      name: PC-GITA (Spanish)
	      type: pc-gita
	    metrics:
	    - type: f1
	      value: 90.83
	    - type: accuracy
	      value: 90.83
	    - type: sensitivity
	      value: 93.33
	    - type: specificity
	      value: 88.33
	---

	# BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech

	This model implements BDHPD, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.

	## Model Description

	BDHPD combines several modern audio processing techniques:
	- Self-supervised learning (SSL) with models like `microsoft/wavlm-base`
	- Wavelet-based spectrogram features
	- Adaptive Instance Normalization (AdaIN) for domain adaptation
	- Convolutional Bottleneck Layers for feature recalibration
	- Dual-head classification architecture to handle different speech types (e.g., diadochokinetic and continuous)
	- Contrastive learning for embedding space refinement
	- Attention pooling for better sequence summarization
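
To make the last item concrete, here is a minimal sketch of attention pooling in plain Python. It is an illustration only, not the BDHPD implementation: the `attention_pool` function and the `scorer` vector are hypothetical names, and a real model would learn the scoring parameters and operate on tensors rather than lists.

```python
import math

def attention_pool(frames, scorer):
    """Collapse T frame vectors (each of size D) into one D-dimensional vector.

    frames: list of T lists of D floats (e.g. frame-level SSL features).
    scorer: list of D floats; a hypothetical stand-in for learned parameters.
    """
    # One scalar relevance score per frame (dot product with the scorer).
    scores = [sum(w * x for w, x in zip(scorer, frame)) for frame in frames]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of frames gives the utterance-level embedding.
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights

# Toy input: three 2-dimensional frames.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# A zero scorer gives uniform weights, so pooling reduces to mean pooling.
pooled, weights = attention_pool(frames, scorer=[0.0, 0.0])
```

With a non-zero scorer the weights concentrate on the frames with the highest scores, which is what distinguishes attention pooling from a plain average over time.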

	The architecture supports bilingual inputs and has been evaluated on EWA-DB (Slovak) and PC-GITA (Spanish).
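
The card's metadata reports F1, accuracy, sensitivity, and specificity for each dataset. As a reminder of how these relate, the sketch below computes them from binary confusion-matrix counts. The counts are toy values, not results from either dataset, and the F1 shown is the positive-class F1; this card does not state whether its F1 is positive-class or macro-averaged.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary metrics, treating PD as the positive class.

    Assumes every denominator is non-zero (no empty class).
    """
    sensitivity = tp / (tp + fn)            # recall on PD speakers
    specificity = tn / (tn + fp)            # recall on healthy controls
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)        # positive-class F1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }

# Toy counts: 10 PD and 10 healthy recordings.
metrics = binary_metrics(tp=8, fn=2, tn=9, fp=1)
```

When both classes have the same number of examples, accuracy equals the mean of sensitivity and specificity, as in the toy case above (0.85 from 0.8 and 0.9).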

	## Intended Use

	- Research in pathological speech detection
	- Benchmarking bilingual speech-based PD detection models
	- Development of real-world diagnostic support tools in healthcare

	## Training

	Training was performed using:
	- AdamW optimizer
	- Linear learning rate scheduling with warmup
	- Binary cross-entropy loss for classification
	- Contrastive loss via `pytorch-metric-learning`
	- 20 epochs with early stopping
	- Balanced batch sampling for both datasets
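
For readers unfamiliar with linear scheduling with warmup, the following sketch shows the usual shape of such a schedule: the learning rate rises linearly from zero during warmup, then decays linearly to zero. The base learning rate and step counts are hypothetical, since this card does not state the values used in training.

```python
def linear_warmup_decay(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 to 1.
        return step / max(1, warmup_steps)
    # Decay phase: ramp linearly from 1 down to 0 at total_steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

base_lr = 1e-4        # hypothetical value, not taken from the paper
warmup_steps = 10     # hypothetical
total_steps = 100     # hypothetical
schedule = [base_lr * linear_warmup_decay(s, warmup_steps, total_steps)
            for s in range(total_steps + 1)]
```

In PyTorch, a multiplier function of this form can be passed to `torch.optim.lr_scheduler.LambdaLR` together with an AdamW optimizer.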

	## How to Use

	Code and usage instructions are available in the GitHub repository: [BDHPD GitHub](https://github.com/MorenoLaQuatra/BDHPD)

	## Datasets

	- [EWA-DB](https://zenodo.org/records/10952480): Slovak pathological and healthy speech
	- [PC-GITA](https://aclanthology.org/L14-1549/): Spanish pathological speech

	## Limitations

	- The model was trained only on Slovak and Spanish speakers; generalization to other languages is untested.
	- The model is sensitive to audio quality: preprocess audio with voice activity detection (VAD) and dereverberation before inference.
	- Should not be used as a standalone diagnostic tool.
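
As a rough illustration of the VAD step mentioned above, the sketch below implements a simple energy-threshold detector in plain Python. This is a swapped-in baseline technique, not the preprocessing used by the authors; the function name, frame length, and threshold are assumptions, and dereverberation is not covered.

```python
import math

def energy_vad(samples, frame_len=400, threshold_db=-40.0):
    """Return (start, end) sample ranges whose frame RMS exceeds threshold_db.

    samples: list of floats in [-1, 1]. Trailing samples that do not fill
    a whole frame are ignored.
    """
    segments = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        level_db = 20.0 * math.log10(max(rms, 1e-10))  # floor avoids log(0)
        if level_db > threshold_db:
            if segments and segments[-1][1] == start:
                # Extend the previous segment when frames are contiguous.
                segments[-1] = (segments[-1][0], start + frame_len)
            else:
                segments.append((start, start + frame_len))
    return segments

# Synthetic signal at 16 kHz: 0.1 s silence, 0.1 s 200 Hz tone, 0.1 s silence.
sr = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(1600)]
segments = energy_vad(silence + tone + silence)
```

On real recordings a fixed energy threshold is fragile under background noise, so a trained VAD model is usually a better choice.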

	## Citation

	If you use this model or find this research useful, please cite the following paper:

	```bibtex
	@inproceedings{laquatra2025bilingual,
	title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
	author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
	booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
	year={2025}
	}
	```