SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories Paper • 2503.08625 • Published Mar 11 • 26
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published Feb 24 • 53
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 60
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published Jan 7 • 46
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis Paper • 2412.15214 • Published Dec 19, 2024 • 15
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14, 2024 • 76
Running on L4 • 1.78k • MagicQuill 🪶 Edit and enhance images with custom color and edge modifications