Running on Zero • 26 • Chat with Kimi-VL (Image, Agent, Video, PDF) • Chat with Kimi-VL-A3B-Instruct using text, images, and videos
Running on Zero • 71 • Chat with Kimi-VL-A3B-Thinking • Interact with a chatbot that understands text and images
WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments Paper • 2504.03886 • Published 11 days ago • 9
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning Paper • 2504.06958 • Published 6 days ago • 9
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion Paper • 2504.04010 • Published 11 days ago • 8
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting Paper • 2504.05541 • Published 8 days ago • 14
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis Paper • 2504.04842 • Published 9 days ago • 29
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography Paper • 2504.07083 • Published 6 days ago • 21
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published 6 days ago • 66
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing Paper • 2504.02826 • Published 12 days ago • 67
SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning Paper • 2504.00396 • Published 15 days ago • 4
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration Paper • 2504.03536 • Published 11 days ago • 11