SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models • Paper • 2412.12693 • Published Dec 17, 2024 • 2
Introducing Visual Perception Token into Multimodal Large Language Model • Paper • 2502.17425 • Published 19 days ago • 14
CoT-Valve: Length-Compressible Chain-of-Thought Tuning • Paper • 2502.09601 • Published 30 days ago • 14
Investigating Copyright Issues of Diffusion Models under Practical Scenarios • Paper • 2311.12803 • Published Sep 15, 2023
Subclass-balancing Contrastive Learning for Long-tailed Recognition • Paper • 2306.15925 • Published Jun 28, 2023
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training • Paper • 2411.13476 • Published Nov 20, 2024 • 16
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions • Paper • 2303.17597 • Published Mar 30, 2023 • 1
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model • Paper • 2304.01116 • Published Apr 3, 2023 • 1
Generative Diffusion Prior for Unified Image Restoration and Enhancement • Paper • 2304.01247 • Published Apr 3, 2023
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models • Paper • 2306.09347 • Published Jun 15, 2023 • 1
InsActor: Instruction-driven Physics-based Characters • Paper • 2312.17135 • Published Dec 28, 2023 • 10
DreamGaussian4D: Generative 4D Gaussian Splatting • Paper • 2312.17142 • Published Dec 28, 2023 • 19
HumanLiff: Layer-wise 3D Human Generation with Diffusion Model • Paper • 2308.09712 • Published Aug 18, 2023 • 1
Benchmarking and Analyzing Point Cloud Classification under Corruptions • Paper • 2202.03377 • Published Feb 7, 2022
LaserMix for Semi-Supervised LiDAR Semantic Segmentation • Paper • 2207.00026 • Published Jun 30, 2022 • 1