Introducing our new work: OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation 🚀
We tackle the core challenges of Subject-to-Video generation (S2V) by building the first complete infrastructure for the task: a fine-grained evaluation benchmark and a million-scale dataset! ✨
🧠 Introducing OpenS2V-Eval, the first fine-grained S2V benchmark: 180 multi-domain prompts paired with both real and synthetic reference subjects. We propose NexusScore, NaturalScore, and GmeScore to precisely quantify model performance on subject consistency, naturalness, and text alignment, respectively ✔
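To make the three axes concrete, here is a minimal sketch of how a per-video result might be stored and aggregated. The `S2VScores` class, score ranges, and weights are hypothetical illustrations only; OpenS2V-Eval computes each metric with its own dedicated models (see the repo for the actual pipeline).

```python
from dataclasses import dataclass

@dataclass
class S2VScores:
    """Hypothetical container for the three OpenS2V-Eval axes (all in [0, 1])."""
    nexus_score: float    # subject consistency
    natural_score: float  # visual naturalness
    gme_score: float      # text-prompt alignment

    def aggregate(self, w_nexus: float = 0.4, w_natural: float = 0.3,
                  w_gme: float = 0.3) -> float:
        # Placeholder weights for illustration; not the paper's weighting.
        return (w_nexus * self.nexus_score
                + w_natural * self.natural_score
                + w_gme * self.gme_score)

sample = S2VScores(nexus_score=0.72, natural_score=0.81, gme_score=0.67)
print(f"aggregate score: {sample.aggregate():.3f}")
```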
📊 Using this framework, we comprehensively evaluate 16 leading S2V models, revealing their strengths and weaknesses in complex scenarios!
🔥 The OpenS2V-5M dataset is now available: 5.4 million 720P subject-text-video triplets, built with cross-video association segmentation and multi-view synthesis to ensure subject diversity and high-quality annotations 🚀
All resources open-sourced: Paper, Code, Data, and Evaluation Tools 📄 Let's advance S2V research together! 💡
🔥 New benchmark & dataset for Subject-to-Video generation
OpenS2V-Nexus by Peking University
✨ Fine-grained evaluation for subject consistency: BestWishYsh/OpenS2V-Eval
✨ 5M-scale dataset: BestWishYsh/OpenS2V-5M
✨ New metrics: automatic scores for identity, realism, and text match
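As a starting point, here is a minimal sketch of streaming a few samples with the Hugging Face `datasets` library. Whether the repo loads directly via `load_dataset`, the `train` split name, and the record fields are all assumptions; consult the dataset card for the actual layout.

```python
from datasets import load_dataset

# Stream instead of downloading: the full corpus holds ~5.4M triplets.
ds = load_dataset("BestWishYsh/OpenS2V-5M", split="train", streaming=True)

# Peek at a few records to inspect the actual schema (field names will
# differ if the dataset ships as raw annotation/video files instead).
for i, record in enumerate(ds):
    print(record.keys())
    if i >= 2:
        break
```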