This demo showcases a version of V-JEPA 2 fine-tuned for real-time video action recognition! The model is optimized for classifying the 174 action classes of the Something-Something-V2 dataset. Watch as it labels what's happening in the video while the stream plays! ⚡ See the instructions below to get started with your webcam. 🚀
To run the demo locally:
git clone https://huggingface.co/spaces/qubvel-hf/vjepa2-streaming-video-classification
cd vjepa2-streaming-video-classification
pip install -r requirements.txt
gradio app.py
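For readers curious how a streaming classifier might turn a webcam feed into clips, here is a minimal sketch. It is not taken from this Space's app.py: the class name StreamingClipBuffer, the clip length of 16 frames, the stride of 8 frames, and the classify callable are all assumptions made for illustration, and classify stands in for whatever model call the real app uses.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Optional


class StreamingClipBuffer:
    """Keeps the most recent frames and classifies a clip every `stride` frames.

    Hypothetical helper: `clip_len` and `stride` are illustrative defaults,
    not values read from the demo.
    """

    def __init__(
        self,
        classify: Callable[[List[Any]], Any],
        clip_len: int = 16,
        stride: int = 8,
    ) -> None:
        if clip_len <= 0 or stride <= 0:
            raise ValueError("clip_len and stride must be positive")
        self.classify = classify
        self.clip_len = clip_len
        self.stride = stride
        # deque with maxlen drops the oldest frame automatically
        self._frames: Deque[Any] = deque(maxlen=clip_len)
        self._since_last = 0

    def push(self, frame: Any) -> Optional[Any]:
        """Add one frame; return a prediction when a new clip is due, else None."""
        self._frames.append(frame)
        self._since_last += 1
        # Wait until the window is full and at least `stride` new frames arrived
        if len(self._frames) < self.clip_len or self._since_last < self.stride:
            return None
        self._since_last = 0
        return self.classify(list(self._frames))


def describe_clip(clip: List[Any]) -> str:
    """Placeholder for a real model call; it only reports the clip boundaries."""
    return f"clip covering frames {clip[0]}..{clip[-1]}"


if __name__ == "__main__":
    buffer = StreamingClipBuffer(describe_clip, clip_len=16, stride=8)
    for frame_index in range(40):  # integers stand in for webcam frames
        prediction = buffer.push(frame_index)
        if prediction is not None:
            print(prediction)
```

Overlapping windows (stride smaller than clip length) trade extra compute for more frequent updates; a real app would replace describe_clip with a function that preprocesses the frames and runs the model.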