Bidirectional Normalizing Flow: From Data to Noise and Back Paper • 2512.10953 • Published 23 days ago • 5
Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation Paper • 2504.16060 • Published Apr 22, 2025
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time Paper • 2506.18890 • Published Jun 23, 2025 • 6
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 16 days ago • 82
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published 16 days ago • 33