StdGEN: Semantic-Decomposed 3D Character Generation from Single Images Paper • 2411.05738 • Published Nov 8, 2024 • 15
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents Paper • 2410.22476 • Published Oct 29, 2024 • 29
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents Paper • 2410.23218 • Published Oct 30, 2024 • 51
Training-free Regional Prompting for Diffusion Transformers Paper • 2411.02395 • Published Nov 4, 2024 • 26
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion Paper • 2411.04928 • Published Nov 7, 2024 • 58
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation Paper • 2411.04999 • Published Nov 7, 2024 • 18
GazeGen: Gaze-Driven User Interaction for Visual Content Generation Paper • 2411.04335 • Published Nov 7, 2024 • 15
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details Paper • 2411.03047 • Published Nov 5, 2024 • 9
CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published Oct 23, 2024 • 210
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning Paper • 2501.06590 • Published Jan 11 • 11
Evaluating Sample Utility for Data Selection by Mimicking Model Weights Paper • 2501.06708 • Published Jan 12 • 5
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 295
Cobra: Efficient Line Art COlorization with BRoAder References Paper • 2504.12240 • Published Apr 16 • 28
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 170
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing Paper • 2504.02826 • Published Apr 3 • 69
Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published Mar 31 • 77
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published Mar 31 • 62
Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image Paper • 2503.17358 • Published Mar 21 • 6
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation Paper • 2503.19777 • Published Mar 25 • 1
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs Paper • 2505.20139 • Published about 1 month ago • 18
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence Paper • 2506.15677 • Published 8 days ago • 23
Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset Paper • 2506.18851 • Published 3 days ago • 26
3D Arena: An Open Platform for Generative 3D Evaluation Paper • 2506.18787 • Published 3 days ago • 11
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models Paper • 2506.19851 • Published 2 days ago • 48