title: SmolVLM2 Video Highlights
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860

🎬 SmolVLM2 HuggingFace Segment-Based Video Highlights API

Generate intelligent video highlights using HuggingFace's segment-based approach

This is a FastAPI service that follows HuggingFace's segment-based classification method: it splits a video into fixed-length segments, classifies each segment with SmolVLM2-256M-Video-Instruct, and assembles the selected segments into a highlight video.

🚀 Features

  • Segment-Based Analysis: Processes videos in fixed 5-second segments for consistent AI classification
  • Dual Criteria Generation: Creates two different sets of highlight criteria and uses the more selective of the two
  • SmolVLM2-256M-Video-Instruct: A compact 256M-parameter video-instruct model, chosen for faster processing
  • Visual Effects: Optional fade transitions between segments for professional-quality output
  • REST API: Upload videos and download processed highlights with job tracking
  • Background Processing: Non-blocking video processing with real-time status updates
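
The segment-based analysis described above can be illustrated with a minimal Python sketch. It only shows how a video duration might be cut into fixed 5-second windows; the function name `segment_bounds` is hypothetical, and shortening the final window when the duration is not a multiple of the segment length is an assumption, not confirmed behavior of this service.

```python
from typing import List, Tuple


def segment_bounds(duration: float, segment_length: float = 5.0) -> List[Tuple[float, float]]:
    """Split [0, duration] into consecutive (start, end) windows of fixed length.

    Assumption: the last window is shortened when the duration is not an
    exact multiple of segment_length.
    """
    if duration <= 0 or segment_length <= 0:
        return []
    bounds = []
    index = 0
    # Multiply by the index instead of accumulating to avoid float drift.
    while index * segment_length < duration:
        start = index * segment_length
        end = min(start + segment_length, duration)
        bounds.append((start, end))
        index += 1
    return bounds


if __name__ == "__main__":
    for start, end in segment_bounds(12.0):
        print(f"{start:.1f}s - {end:.1f}s")
```

Each `(start, end)` pair would then be passed to the model as one clip to classify.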

🔗 API Endpoints

  • POST /upload-video - Upload video for processing
  • GET /job-status/{job_id} - Check processing status
  • GET /download/{filename} - Download generated highlights
  • GET /docs - Interactive API documentation

📱 Usage

Via API

# Upload video with optional parameters
curl -X POST \
  -F "video=@your_video.mp4" \
  -F "segment_length=5.0" \
  -F "model_name=HuggingFaceTB/SmolVLM2-256M-Video-Instruct" \
  -F "with_effects=true" \
  https://your-space-url.hf.space/upload-video

# Check processing status
curl https://your-space-url.hf.space/job-status/YOUR_JOB_ID

# Download highlights and analysis
curl -O https://your-space-url.hf.space/download/HIGHLIGHTS.mp4
curl -O https://your-space-url.hf.space/download/ANALYSIS.json
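
For scripted use, the status check can be wrapped in a polling loop. The following is a minimal sketch using only the Python standard library. The response format is not documented here, so the JSON body, the `status` field, and the `completed` / `failed` values are all assumptions; check them against the `/docs` page of your deployment. The helper names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Sequence

BASE_URL = "https://your-space-url.hf.space"  # placeholder, as in the curl examples


def fetch_status(job_id: str, base_url: str = BASE_URL) -> Dict:
    """GET /job-status/{job_id}. Assumes the endpoint returns a JSON object."""
    with urllib.request.urlopen(f"{base_url}/job-status/{job_id}") as resp:
        return json.load(resp)


def wait_for_job(
    job_id: str,
    fetch: Callable[[str], Dict] = fetch_status,
    interval: float = 2.0,
    timeout: float = 600.0,
    done_states: Sequence[str] = ("completed", "failed"),  # assumed status values
) -> Dict:
    """Poll until the job reports a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        info = fetch(job_id)
        if info.get("status") in done_states:  # "status" is an assumed field name
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        time.sleep(interval)
```

Passing a custom `fetch` callable makes the loop easy to exercise without a running Space.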

Via Android App

Use the provided Android client code to integrate with your mobile app.

βš™οΈ Configuration

Default settings:

  • Segment Length: 5 seconds (fixed segments for consistent classification)
  • Model: SmolVLM2-256M-Video-Instruct (faster processing)
  • Effects: Enabled (fade transitions between segments)
  • Dual Criteria: Two prompt variations for robust selection
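
The dual-criteria step can be sketched as follows. This is an illustration only: it assumes each criteria set yields one keep/skip decision per segment and that "more selective" means "keeps fewer segments", with ties going to the first set. The function name `pick_more_selective` is hypothetical.

```python
from typing import List, Sequence


def pick_more_selective(keep_a: Sequence[bool], keep_b: Sequence[bool]) -> List[int]:
    """Return the indices kept by whichever criteria set keeps fewer segments.

    Assumption: selectivity is measured by the number of kept segments;
    on a tie the first set wins.
    """
    kept_a = [i for i, keep in enumerate(keep_a) if keep]
    kept_b = [i for i, keep in enumerate(keep_b) if keep]
    return kept_a if len(kept_a) <= len(kept_b) else kept_b


if __name__ == "__main__":
    # Per-segment decisions for a four-segment video under two prompt variations.
    first = [True, False, True, True]
    second = [False, True, False, False]
    print(pick_more_selective(first, second))
```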

πŸ› οΈ Technology Stack

  • SmolVLM2-256M-Video-Instruct: Efficient vision-language model optimized for video understanding
  • HuggingFace Transformers: Model loading and inference
  • FastAPI: Modern web framework for APIs
  • FFmpeg: Video processing with advanced filter support
  • PyTorch: Deep learning framework with device optimization
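
As an illustration of the FFmpeg filter support mentioned above, the sketch below builds a video filter string that fades a single segment in and out using FFmpeg's `fade` filter. Whether this service applies fades per segment in this way is an assumption, and the helper name `fade_filter` and the 0.5-second default are hypothetical.

```python
def fade_filter(segment_duration: float, fade_duration: float = 0.5) -> str:
    """Build an FFmpeg -vf string that fades a clip in at the start and out at the end.

    The fade is capped at half the clip length so the two fades never overlap.
    """
    fade_duration = min(fade_duration, segment_duration / 2)
    out_start = segment_duration - fade_duration
    return (
        f"fade=t=in:st=0:d={fade_duration:g},"
        f"fade=t=out:st={out_start:g}:d={fade_duration:g}"
    )


if __name__ == "__main__":
    # Example command line: ffmpeg -i segment.mp4 -vf "<filter>" faded.mp4
    print(fade_filter(5.0))
```

This covers the video stream only; audio would need the separate `afade` filter.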

🎯 Perfect For

  • Social media content creators
  • Educational video processing
  • Meeting/lecture summarization
  • Sports highlight generation
  • Entertainment content curation

License

Apache 2.0 - Free for commercial and personal use

🤝 Contributing

Built with ❤️ using Hugging Face Transformers and open-source AI models.