UFM: A Simple Path towards Unified Dense Correspondence with Flow
Abstract
A Unified Flow & Matching model (UFM) improves the accuracy and speed of dense image correspondence by training a transformer architecture on unified data, outperforming specialized methods in both optical flow and wide-baseline scenarios.
Dense image correspondence is central to many applications, such as visual odometry, 3D reconstruction, object association, and re-identification. Historically, dense correspondence has been tackled separately for wide-baseline scenarios and optical flow estimation, despite the common goal of matching content between two images. In this paper, we develop a Unified Flow & Matching model (UFM), which is trained on unified data for pixels that are co-visible in both source and target images. UFM uses a simple, generic transformer architecture that directly regresses the (u,v) flow. It is easier to train and more accurate for large flows than the coarse-to-fine cost volumes typical of prior work. UFM is 28% more accurate than state-of-the-art flow methods (Unimatch), while also having 62% less error than dense wide-baseline matchers (RoMa) and running 6.7x faster. UFM is the first to demonstrate that unified training can outperform specialized approaches across both domains. This result enables fast, general-purpose correspondence and opens new directions for multi-modal, long-range, and real-time correspondence tasks.
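To make the flow representation concrete, here is a minimal sketch of how a dense (u,v) flow field and a per-pixel covisibility score could be turned into pixel correspondences. It is not UFM's code: the function name `flow_to_matches`, the array layouts, the 0.5 threshold, and the assumption that source and target images share the same size are all illustrative choices.

```python
import numpy as np

def flow_to_matches(flow, covis, threshold=0.5):
    """Convert a dense flow field into point correspondences.

    Assumed (hypothetical) layouts:
      flow  : (H, W, 2) array, flow[y, x] = (u, v) displacement in pixels
      covis : (H, W) array of covisibility scores in [0, 1]
    Returns two (N, 2) arrays of (x, y) coordinates: source and target points.
    """
    h, w = covis.shape
    # Pixel grid of the source image, stored as (x, y) per pixel.
    ys, xs = np.mgrid[0:h, 0:w]
    src = np.stack([xs, ys], axis=-1).astype(np.float32)

    # Each source pixel maps to its own position plus its flow vector.
    tgt = src + flow

    # Keep pixels that are scored covisible and land inside the target image
    # (assumed here to have the same size as the source).
    in_bounds = (
        (tgt[..., 0] >= 0) & (tgt[..., 0] <= w - 1)
        & (tgt[..., 1] >= 0) & (tgt[..., 1] <= h - 1)
    )
    keep = (covis > threshold) & in_bounds
    return src[keep], tgt[keep]


# Toy usage: a uniform shift of 2 px to the right, top row marked not covisible.
flow = np.zeros((4, 5, 2), dtype=np.float32)
flow[..., 0] = 2.0
covis = np.ones((4, 5), dtype=np.float32)
covis[0, :] = 0.0
src_pts, tgt_pts = flow_to_matches(flow, covis)
```

The resulting point pairs are the kind of input a downstream step such as pose estimation would consume; the threshold trades match density against reliability.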
Community
UFM is a simple, end-to-end trained transformer model that directly regresses pixel displacement images (flow) and covisibility. It can be applied to both optical flow and wide-baseline matching tasks with high accuracy and efficiency.
Project Page: https://uniflowmatch.github.io/
HF Interactive Demo: https://huggingface.co/spaces/infinity1096/UFM