arxiv:2503.07314

Automated Movie Generation via Multi-Agent CoT Planning

Published on Mar 10 · Submitted by weijiawu on Mar 11

Abstract

Existing long-form video generation frameworks lack automated planning and require manual input for storylines, scenes, cinematography, and character interactions, which makes them costly and inefficient. To address these challenges, we present MovieAgent, a framework for automated movie generation via multi-agent Chain of Thought (CoT) planning. MovieAgent offers two key advantages: 1) We are the first to explore and define the paradigm of automated movie/long-video generation. Given a script and a character bank, MovieAgent can generate multi-scene, multi-shot long-form videos with a coherent narrative while ensuring character consistency, synchronized subtitles, and stable audio throughout the film. 2) MovieAgent introduces a hierarchical CoT-based reasoning process to automatically structure scenes, camera settings, and cinematography, significantly reducing human effort. By employing multiple LLM agents to simulate the roles of a director, screenwriter, storyboard artist, and location manager, MovieAgent streamlines the production pipeline. Experiments demonstrate that MovieAgent achieves new state-of-the-art results in script faithfulness, character consistency, and narrative coherence. Our hierarchical framework takes a step forward and provides new insights into fully automated movie generation. The code and project website are available at https://github.com/showlab/MovieAgent and https://weijiawu.github.io/MovieAgent.
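
The hierarchical, role-based planning described in the abstract can be illustrated with a small sketch. This is not the MovieAgent implementation: the names `plan_movie`, `Scene`, `Shot`, and `fake_llm`, the prompt wording, and the `description | camera` line format are all assumptions made for illustration. Only two of the four roles are modeled, and the LLM is replaced by a deterministic stub so the example runs without any model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class Shot:
    description: str
    camera: str


@dataclass
class Scene:
    summary: str
    characters: List[str]
    shots: List[Shot] = field(default_factory=list)


def plan_movie(script: str, character_bank: Dict[str, str], llm: LLM) -> List[Scene]:
    """Two-level plan: script -> scenes -> shots (illustrative only)."""
    # Level 1 (assumed "director" role): split the script into scene summaries.
    scene_text = llm(f"ROLE: director\nSplit this script into scenes, one per line:\n{script}")
    scenes: List[Scene] = []
    for summary in filter(None, (s.strip() for s in scene_text.splitlines())):
        # Attach characters from the bank whose names appear in the summary.
        cast = [name for name in character_bank if name in summary]
        scene = Scene(summary, cast)
        # Level 2 (assumed "storyboard artist" role): plan shots for this scene.
        shot_text = llm(
            "ROLE: storyboard artist\n"
            f"List shots as 'description | camera', one per line:\n{summary}"
        )
        for line in shot_text.splitlines():
            if "|" in line:
                desc, camera = (part.strip() for part in line.split("|", 1))
                scene.shots.append(Shot(desc, camera))
        scenes.append(scene)
    return scenes


def fake_llm(prompt: str) -> str:
    """Deterministic stub so the sketch runs without any model."""
    if prompt.startswith("ROLE: director"):
        return "Alice finds a map\nAlice and Bob reach the cave"
    return "wide establishing view | static wide shot\nclose-up on faces | slow push-in"


if __name__ == "__main__":
    bank = {"Alice": "explorer", "Bob": "guide"}
    for scene in plan_movie("Alice and Bob search for treasure.", bank, fake_llm):
        print(scene.summary, scene.characters, [s.camera for s in scene.shots])
```

Each level only sees the output of the level above it, which is the point of a hierarchical plan: scene boundaries are fixed before any shot or camera decision is made. A real system would replace `fake_llm` with an actual model call and pass the resulting shot plan to image and video generators.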

Community

Paper author Paper submitter

MovieAgent: Convert your idea to a Movie!

Fascinating work! I was particularly intrigued by the hierarchical CoT-based approach that MovieAgent employs, clearly addressing many of the limitations present in current video generation frameworks. Automating narrative coherence and character consistency through specialized agents truly resonates with real-world filmmaking processes. I'm excited to see how this evolves, particularly in terms of scaling to more complex narratives and diverse visual styles. Great step toward fully automated film production!

