Wan: Open and Advanced Large-Scale Video Generative Models
Select coordinates on an image based on instructions
Generate text by combining an image and a question
Upgraded to v1.0!