DragAPart: Learning a Part-Level Motion Prior for Articulated Objects
Abstract
We introduce DragAPart, a method that, given an image and a set of drags as input, generates a new image of the same object in a new state, compatible with the action of the drags. Unlike prior works that focused on repositioning objects, DragAPart predicts part-level interactions, such as opening and closing a drawer. We study this problem as a proxy for learning a generalist motion model, not restricted to a specific kinematic structure or object category. To this end, we start from a pre-trained image generator and fine-tune it on Drag-a-Move, a new synthetic dataset that we introduce. Combined with a new encoding for the drags and dataset randomization, the resulting model generalizes well to real images and different categories. Compared to prior motion-controlled generators, we demonstrate much better part-level motion understanding.
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling (2024)
- Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos (2024)
- Pix2Gif: Motion-Guided Diffusion for GIF Generation (2024)
- HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data (2024)
- Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts (2024)