Papers
arxiv:2507.05675

MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos

Published on Jul 8
· Submitted by wangrongsheng on Jul 9
Authors:
,
,
,
,
,

Abstract

MedGen, a model trained on the large-scale MedVideoCap-55K dataset, achieves top performance in medical video generation by balancing visual quality and medical accuracy.

AI-generated summary

Recent advances in video generation have shown remarkable progress in open-domain settings, yet medical video generation remains largely underexplored. Medical videos are critical for applications such as clinical training, education, and simulation, requiring not only high visual fidelity but also strict medical accuracy. However, current models often produce unrealistic or erroneous content when applied to medical prompts, largely due to the lack of large-scale, high-quality datasets tailored to the medical domain. To address this gap, we introduce MedVideoCap-55K, the first large-scale, diverse, and caption-rich dataset for medical video generation. It comprises over 55,000 curated clips spanning real-world medical scenarios, providing a strong foundation for training generalist medical video generation models. Built upon this dataset, we develop MedGen, which achieves leading performance among open-source models and rivals commercial systems across multiple benchmarks in both visual quality and medical accuracy. We hope our dataset and model can serve as a valuable resource and help catalyze further research in medical video generation. Our code and data is available at https://github.com/FreedomIntelligence/MedGen

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2507.05675 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2507.05675 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.