Unmasked Teacher: Towards Training-Efficient Video Foundation Models Paper • 2303.16058 • Published Mar 28, 2023
Harvest Video Foundation Models via Efficient Post-Pretraining Paper • 2310.19554 • Published Oct 30, 2023
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark Paper • 2311.17005 • Published Nov 28, 2023 • 2
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation Paper • 2307.06942 • Published Jul 13, 2023 • 23
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer Paper • 2211.09552 • Published Nov 17, 2022
InternVideo: General Video Foundation Models via Generative and Discriminative Learning Paper • 2212.03191 • Published Dec 6, 2022
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models Paper • 2412.04446 • Published Dec 5, 2024
AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Paper • 2506.03126 • Published 11 days ago • 22
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published Apr 1, 2025 • 70
GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers Paper • 2503.19480 • Published Mar 25, 2025 • 16
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper • 2412.04432 • Published Dec 5, 2024 • 16