RelightVid: Temporal-Consistent Diffusion Model for Video Relighting Paper • 2501.16330 • Published Jan 27, 2025 • 2
V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results Paper • 2406.11739 • Published Jun 17, 2024
X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models Paper • 2412.01824 • Published Dec 2, 2024 • 66
Bootstrap3D: Improving 3D Content Creation with Synthetic Data Paper • 2406.00093 • Published May 31, 2024 • 1
Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials Paper • 2404.16829 • Published Apr 25, 2024 • 5
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases Paper • 2312.15011 • Published Dec 22, 2023 • 18
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want Paper • 2312.03818 • Published Dec 6, 2023 • 34
GPT4Point: A Unified Framework for Point-Language Understanding and Generation Paper • 2312.02980 • Published Dec 5, 2023 • 10