lxxiao and nielsr (HF staff) committed
Commit 3b0d97f · verified · 1 parent: a4cc9b5

Improve model card with metadata and project link (#1)


- Improve model card with metadata and project link (e0af9d91a6e6a1d9d4155ac0abe895a639882924)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1): README.md (+18 −3)
README.md CHANGED
@@ -1,7 +1,22 @@
  ---
  license: mit
+ pipeline_tag: text-to-video
  ---
- MotionStreamer
- arXiv: https://arxiv.org/abs/2503.15451
-
- This paper addresses the challenge of text-conditioned streaming motion generation, which requires us to predict the next-step human pose based on variable-length historical motions and incoming texts. Existing methods struggle to achieve streaming motion generation, e.g., diffusion models are constrained by pre-defined motion lengths, while GPT-based methods suffer from delayed response and error accumulation problem due to discretized non-causal tokenization. To solve these problems, we propose MotionStreamer, a novel framework that incorporates a continuous causal latent space into a probabilistic autoregressive model. The continuous latents mitigate information loss caused by discretization and effectively reduce error accumulation during long-term autoregressive generation. In addition, by establishing temporal causal dependencies between current and historical motion latents, our model fully utilizes the available information to achieve accurate online motion decoding. Experiments show that our method outperforms existing approaches while offering more applications, including multi-round generation, long-term generation, and dynamic motion composition. Project Page: https://zju3dv.github.io/MotionStreamer/.
+
+ # MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
+
+ This repository contains the MotionStreamer model as presented in [MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space](https://huggingface.co/papers/2503.15451).
+
+ Project Page: [https://zju3dv.github.io/MotionStreamer/](https://zju3dv.github.io/MotionStreamer/)
+
+ This paper addresses the challenge of text-conditioned streaming motion generation, which requires predicting the next-step human pose based on variable-length historical motions and incoming texts. Existing methods struggle with this; diffusion models are constrained by pre-defined motion lengths, while GPT-based methods suffer from delayed response and error accumulation due to discretized non-causal tokenization. MotionStreamer incorporates a continuous causal latent space into a probabilistic autoregressive model. The continuous latents mitigate information loss caused by discretization and effectively reduce error accumulation during long-term autoregressive generation. By establishing temporal causal dependencies between current and historical motion latents, the model fully utilizes available information for accurate online motion decoding. Experiments show that this method outperforms existing approaches and offers applications including multi-round generation, long-term generation, and dynamic motion composition.
+
+
+ ```bibtex
+ @article{xiao2025motionstreamer,
+   title={MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space},
+   author={Xiao, Lixing and Lu, Shunlin and Pi, Huaijin and Fan, Ke and Pan, Liang and Zhou, Yueer and Feng, Ziyong and Zhou, Xiaowei and Peng, Sida and Wang, Jingbo},
+   journal={arXiv preprint arXiv:2503.15451},
+   year={2025}
+ }
+ ```
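For a concrete picture of the streaming generation the updated card describes, below is a minimal, self-contained PyTorch sketch. It is not the released MotionStreamer code: every class name, shape, and hyperparameter is an illustrative assumption, and a plain regression head stands in for the paper's diffusion head over continuous latents.

```python
# Illustrative sketch only: hypothetical names/shapes, NOT the MotionStreamer API.
import torch
import torch.nn as nn


class CausalLatentGenerator(nn.Module):
    """Toy stand-in: a causal Transformer that, conditioned on a text embedding
    and all previously generated continuous motion latents, predicts the next
    latent (regression head here instead of the paper's diffusion head)."""

    def __init__(self, latent_dim=256, text_dim=512, n_layers=4, n_heads=8):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, latent_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=latent_dim, nhead=n_heads, batch_first=True
        )
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(latent_dim, latent_dim)

    def forward(self, text_emb, latent_history):
        # Prepend the projected text embedding as a conditioning token.
        cond = self.text_proj(text_emb).unsqueeze(1)        # (B, 1, D)
        seq = torch.cat([cond, latent_history], dim=1)      # (B, 1+T, D)
        # Causal mask: each position attends only to earlier positions,
        # which is what makes step-by-step streaming decoding possible.
        T = seq.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.backbone(seq, mask=mask)
        return self.head(h[:, -1])                          # next latent


@torch.no_grad()
def stream(model, text_emb, steps=8, latent_dim=256):
    """Autoregressive streaming loop: each new latent is appended to the
    history; a causal decoder could turn latents into poses online."""
    # Zero start token for illustration; a learned token would be used in practice.
    history = torch.zeros(text_emb.size(0), 1, latent_dim)
    for _ in range(steps):
        nxt = model(text_emb, history)
        history = torch.cat([history, nxt.unsqueeze(1)], dim=1)
    return history[:, 1:]                                   # (B, steps, D)


model = CausalLatentGenerator()
latents = stream(model, torch.randn(1, 512))
print(latents.shape)  # torch.Size([1, 8, 256])
```

The boolean causal mask is the key design point: because each latent depends only on the text condition and latents already emitted, the same backbone can be queried one step at a time on variable-length histories, matching the streaming behavior the abstract describes.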