arXiv:2504.08685

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Published on Apr 11 · Submitted by roadjiang on Apr 14
#1 Paper of the day

Abstract

This technical report presents a cost-efficient strategy for training a video generation foundation model. We introduce Seaweed-7B, a mid-sized research model with approximately 7 billion parameters, trained from scratch using 665,000 H100 GPU hours. Despite being trained with moderate computational resources, Seaweed-7B demonstrates highly competitive performance compared to contemporary video generation models of much larger size. Design choices are especially crucial in a resource-constrained setting. This report highlights the key design decisions that enhance the performance of a medium-sized diffusion model. Empirically, we make two observations: (1) Seaweed-7B achieves performance comparable to, or even surpassing, that of larger models trained with substantially greater GPU resources, and (2) our model exhibits strong generalization ability and can be effectively adapted to a wide range of downstream applications through either lightweight fine-tuning or continued training. See the project page at https://seaweed.video/
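For a sense of scale, the 665,000 H100 GPU hours reported in the abstract can be turned into rough wall-clock and rental-cost figures. This is a back-of-envelope sketch: only the GPU-hour total comes from the paper; the cluster size and hourly rate below are illustrative assumptions.

```python
# Back-of-envelope scale of the reported training budget.
# Only GPU_HOURS comes from the abstract; the cluster size and the
# hourly rental rate are illustrative assumptions, not paper figures.
GPU_HOURS = 665_000          # reported H100 GPU hours
CLUSTER_GPUS = 1_000         # assumed number of GPUs training concurrently
USD_PER_GPU_HOUR = 2.50      # assumed H100 rental rate (USD)

wall_clock_days = GPU_HOURS / CLUSTER_GPUS / 24
approx_cost_usd = GPU_HOURS * USD_PER_GPU_HOUR

print(f"~{wall_clock_days:.0f} days of wall-clock time on {CLUSTER_GPUS} GPUs")
print(f"~${approx_cost_usd / 1e6:.1f}M at ${USD_PER_GPU_HOUR:.2f}/GPU-hour")
```

Under these assumptions the run works out to roughly 28 days on 1,000 GPUs and on the order of $1.7M in rented compute, which is the sense in which the paper frames the budget as moderate for a foundation model.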

Community

Paper submitter

Seaweed-7B: A cost-effective video generation foundation model

Looks great!

Do you plan on releasing the weights? This would be quite something for local inference on consumer GPUs, given the low inference cost.

I've only read the project page, but considering it has the ICL ability to learn from reference images, I imagine it is not advisable to release the weights publicly.
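On the local-inference question above, a quick footprint estimate is useful. Only the roughly 7-billion-parameter count comes from the paper; the precision choices below are assumptions, and activation, text-encoder, and VAE overhead are not counted.

```python
# Rough VRAM needed just to hold ~7B parameters at common inference precisions.
# The parameter count is from the abstract; the precisions are assumed, and
# activations, the text encoder, and the VAE add overhead not counted here.
PARAMS = 7e9  # approximate parameter count

for precision, bytes_per_param in [("fp16/bf16", 2), ("fp8", 1)]:
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{precision}: ~{gib:.0f} GiB for the weights alone")
# fp16/bf16: ~13 GiB; fp8: ~7 GiB -> plausibly within reach of 16-24 GB consumer GPUs.
```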

great!

