IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation Paper • 2508.00823 • Published 4 days ago • 4
SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation Paper • 2508.00782 • Published 4 days ago • 4
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding Paper • 2507.23478 • Published 5 days ago • 9
ByteMorph Collection Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions • 6 items • Updated Jun 3 • 1
Phi-Ground Tech Report: Advancing Perception in GUI Grounding Paper • 2507.23779 • Published 5 days ago • 37
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published May 1 • 45
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 14 items • Updated 15 days ago • 24
Llama3-ChatQA-2 Collection This collection presents ChatQA-2, a suite of 128K long-context models that also have exceptional RAG capabilities • 3 items • Updated 15 days ago • 4
NeMo Curator - Classifier Models Collection Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated 15 days ago • 20
Cosmos-Predict1 Collection World Foundation Model for Future Prediction • 14 items • Updated 15 days ago • 10
Cosmos-Transfer1 Collection Multimodal Conditional World Generation for World2World Transfer • 6 items • Updated 15 days ago • 23
Cosmos-Reason1 Collection Multimodal world understanding through reasoning • 10 items • Updated 14 days ago • 33
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation Paper • 2504.06803 • Published Apr 9 • 1
Pathways on the Image Manifold: Image Editing via Video Generation Paper • 2411.16819 • Published Nov 25, 2024 • 38
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published 5 days ago • 94
SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration Paper • 2312.04803 • Published Dec 8, 2023 • 1