---
license: mit
datasets:
- CuriousMonkey7/HumSpeechBlend
language:
- en
base_model:
- freddyaboulton/silero-vad
pipeline_tag: voice-activity-detection
tags:
- vad
- speech
- audio
- voice_activity_detection
- silero-vad
---

# HumAware-VAD: Humming-Aware Voice Activity Detection

## 📌 Overview

**HumAware-VAD** is a fine-tuned version of the **[Silero-VAD](https://github.com/snakers4/silero-vad/tree/master)** model, trained to distinguish **humming from actual speech**. Standard Voice Activity Detection (VAD) models, including Silero-VAD, often misclassify humming as speech, which leads to inaccurate speech segmentation. HumAware-VAD addresses this by fine-tuning on a custom dataset (**[HumSpeechBlend](https://huggingface.co/datasets/CuriousMonkey7/HumSpeechBlend)**) to improve speech detection accuracy in the presence of humming.

## 🎯 Purpose

The primary goals of **HumAware-VAD** are to:

- Reduce **false positives** where humming is mistakenly detected as speech.
- Enhance **speech segmentation accuracy** in real-world applications.
- Improve VAD performance for tasks involving **music, background noise, and vocal sounds**.

## 🗂️ Model Details

- **Base Model**: [Silero-VAD](https://github.com/snakers4/silero-vad/tree/master)
- **Fine-tuning Dataset**: [HumSpeechBlend](https://huggingface.co/datasets/CuriousMonkey7/HumSpeechBlend)
- **Format**: JIT (TorchScript)
- **Framework**: PyTorch
- **Inference Speed**: Real-time

## 📥 Download & Usage

### 🔹 Install Dependencies

```bash
pip install torch torchaudio
```

### 🔹 Load the Model

```python
import torch

def load_humaware_vad(model_path="humaware_vad.jit"):
    # Load the TorchScript (JIT) model and switch it to evaluation mode
    model = torch.jit.load(model_path)
    model.eval()
    return model

vad_model = load_humaware_vad()
```

### 🔹 Run Inference

```python
import torchaudio

# Load the audio file as a (channels, samples) tensor plus its sample rate
waveform, sample_rate = torchaudio.load("data/0000.wav")
out = vad_model(waveform)
print("VAD Output:", out)
```

## 📄 Citation

If you use this model, please cite it accordingly.

```
@model{HumAwareVAD2025,
  author = {Sourabh Saini},
  title = {HumAware-VAD: Humming-Aware Voice Activity Detection},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/CuriousMonkey7/HumAware-VAD}
}
```
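
## 🧩 Sketch: From Frame Probabilities to Speech Segments

The inference step in this card prints the raw model output. For segmentation tasks, that output usually has to be turned into time spans. The sketch below shows one simple way to do this, assuming the model yields one speech probability per fixed-size frame, as upstream Silero-VAD does. The function name `probs_to_segments`, the 512-sample frame size, the 16 kHz sample rate, and the 0.5 threshold are all illustrative assumptions, not values confirmed for HumAware-VAD. The function takes a plain list of floats, so it needs neither the model nor PyTorch.

```python
from typing import List, Optional, Tuple


def probs_to_segments(
    probs: List[float],
    frame_samples: int = 512,   # assumption: Silero-style 512-sample frames
    sample_rate: int = 16000,   # assumption: 16 kHz audio
    threshold: float = 0.5,     # assumption: illustrative cut-off, tune on your data
) -> List[Tuple[float, float]]:
    """Merge consecutive frames with probability >= threshold into (start_s, end_s) spans."""
    frame_s = frame_samples / sample_rate  # duration of one frame in seconds
    segments: List[Tuple[float, float]] = []
    start: Optional[int] = None  # index of the first frame of the open segment

    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # speech begins at this frame
        elif p < threshold and start is not None:
            segments.append((start * frame_s, i * frame_s))  # speech ended before frame i
            start = None

    # Close a segment that runs to the end of the audio
    if start is not None:
        segments.append((start * frame_s, len(probs) * frame_s))
    return segments


if __name__ == "__main__":
    # Hypothetical per-frame probabilities, not real model output
    print(probs_to_segments([0.1, 0.9, 0.8, 0.2, 0.7]))
```

A single fixed threshold keeps the sketch short. Production pipelines often add hysteresis or minimum-duration rules on top, and those would need tuning against audio that contains humming.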