
VibeVoice 1.5B - Intel iGPU Optimized

πŸš€ Microsoft VibeVoice Optimized for Intel iGPU

This is an INT8-quantized version of Microsoft's VibeVoice-1.5B model, optimized for Intel integrated GPUs via OpenVINO.

Features

  • Multi-speaker synthesis (up to 4 speakers)
  • Up to 90 minutes of continuous generation
  • 2-3x faster inference than CPU
  • ~55% smaller than the original model
  • Optimized for Intel iGPUs via OpenVINO
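
The multi-speaker script format shown in the Usage section below is plain text with `Speaker N:` prefixes. As an illustration of that format (the exact rules accepted by the package are defined by the package itself, so treat this parser as a sketch), a script can be split into per-speaker turns like this:

```python
import re

def parse_script(script, max_speakers=4):
    """Split a 'Speaker N: text' script into (speaker, text) turns (illustrative)."""
    turns = []
    for line in script.strip().splitlines():
        m = re.match(r"Speaker (\d+):\s*(.*)", line)
        if m:
            speaker = int(m.group(1))
            if speaker > max_speakers:
                raise ValueError(f"at most {max_speakers} speakers supported")
            turns.append((speaker, m.group(2)))
    return turns

print(parse_script("Speaker 1: Hello!\nSpeaker 2: Hi."))
# [(1, 'Hello!'), (2, 'Hi.')]
```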

Model Details

  • Base Model: microsoft/VibeVoice-1.5B
  • Parameters: 2.7B
  • Quantization: INT8 dynamic
  • Size: ~2.3GB (down from 5.4GB)
  • Sample Rate: 24kHz

Usage

import torch
from vibevoice_intel import VibeVoiceIntelOptimized

# Load quantized model
model = VibeVoiceIntelOptimized.from_pretrained(
    "magicunicorn/vibevoice-intel-igpu"
)

# Generate multi-speaker dialogue
script = '''
Speaker 1: Hello, welcome to our podcast!
Speaker 2: Thanks for having me.
'''

audio = model.synthesize(script)
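
Assuming `synthesize` returns mono float samples in [-1, 1] at the model's 24kHz sample rate (an assumption; check the package documentation for the actual return type), the result can be written to disk with the standard-library wave module:

```python
import struct
import wave

def save_wav(samples, path, sample_rate=24_000):
    """Write mono float samples in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)            # mono
        wf.setsampwidth(2)            # 16-bit samples
        wf.setframerate(sample_rate)  # model outputs 24kHz audio
        pcm = struct.pack(
            f"<{len(samples)}h",
            *(int(max(-1.0, min(1.0, s)) * 32767) for s in samples),
        )
        wf.writeframes(pcm)

# save_wav(audio, "dialogue.wav")
```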

Hardware Requirements

  • Intel Iris Xe, Arc iGPU, or UHD Graphics
  • 8GB+ system RAM
  • OpenVINO runtime

Performance

  • Inference: 2-3x faster than CPU
  • Power: ~15W (vs. 35W+ on CPU)
  • Memory: 4GB peak usage
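
One way to interpret the speed figure is as a real-time factor: seconds of audio produced per second of wall-clock time. A small helper with hypothetical timings (the numbers below are made up to illustrate the claimed 2-3x range, not benchmark results):

```python
def realtime_factor(audio_seconds, wall_seconds):
    """Audio duration produced per second of compute; >1 means faster than real time."""
    return audio_seconds / wall_seconds

# Hypothetical run: 60 s of speech in 20 s on the iGPU vs. 50 s on CPU
igpu = realtime_factor(60, 20)  # 3.0
cpu = realtime_factor(60, 50)   # 1.2
print(igpu / cpu)               # 2.5, within the claimed 2-3x speedup
```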

License

MIT

Citation

Original model: Microsoft VibeVoice (microsoft/VibeVoice-1.5B)
Optimization: Magic Unicorn Inc.
