arxiv:2506.09278

UFM: A Simple Path towards Unified Dense Correspondence with Flow

Published on Jun 10

Abstract

A Unified Flow & Matching model (UFM) improves the accuracy and speed of dense image correspondence by training a simple transformer architecture on unified data, outperforming specialized methods in both optical flow and wide-baseline matching.

AI-generated summary

Dense image correspondence is central to many applications, such as visual odometry, 3D reconstruction, object association, and re-identification. Historically, dense correspondence has been tackled separately for wide-baseline scenarios and optical flow estimation, despite the common goal of matching content between two images. In this paper, we develop a Unified Flow & Matching model (UFM), which is trained on unified data for pixels that are co-visible in both source and target images. UFM uses a simple, generic transformer architecture that directly regresses the (u,v) flow. It is easier to train and more accurate for large flows than the typical coarse-to-fine cost volumes in prior work. UFM is 28% more accurate than state-of-the-art flow methods (Unimatch), while also having 62% less error and being 6.7x faster than dense wide-baseline matchers (RoMa). UFM is the first to demonstrate that unified training can outperform specialized approaches across both domains. This result enables fast, general-purpose correspondence and opens new directions for multi-modal, long-range, and real-time correspondence tasks.
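To make the "(u,v) flow for co-visible pixels" idea concrete, here is a minimal sketch in plain Python. It is not UFM's code: the function names, the toy values, and the convention that u is the horizontal and v the vertical displacement are all assumptions made for illustration. It shows how a dense flow field plus a covisibility mask yields pixel matches, and how a mean end-point error (a common flow metric, not necessarily the exact metric behind the numbers above) can be restricted to co-visible pixels.

```python
import math

def flow_to_matches(flow, covisible):
    """Turn a dense flow field into (source, target) pixel matches.

    flow[y][x] is an assumed (u, v) displacement in pixels;
    covisible[y][x] is True when the source pixel is visible in the target.
    """
    matches = []
    for y, row in enumerate(flow):
        for x, (u, v) in enumerate(row):
            if covisible[y][x]:
                matches.append(((x, y), (x + u, y + v)))
    return matches

def masked_epe(pred, gt, covisible):
    """Mean end-point error (Euclidean distance) over co-visible pixels only."""
    errors = [
        math.hypot(pu - gu, pv - gv)
        for pred_row, gt_row, mask_row in zip(pred, gt, covisible)
        for (pu, pv), (gu, gv), keep in zip(pred_row, gt_row, mask_row)
        if keep
    ]
    return sum(errors) / len(errors) if errors else float("nan")

# Toy 2x2 image with made-up values (hypothetical, for illustration only).
pred = [[(1.0, 0.0), (2.0, 1.0)], [(0.0, 0.0), (5.0, 5.0)]]
gt = [[(1.0, 0.0), (2.0, 4.0)], [(0.0, 4.0), (0.0, 0.0)]]
covisible = [[True, True], [True, False]]

matches = flow_to_matches(pred, covisible)
error = masked_epe(pred, gt, covisible)
```

The bottom-right pixel is masked out, so its large prediction error does not affect the score. This mirrors the abstract's point that supervision applies only where both images see the same content.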

Community

NikV09

Paper submitter 1 day ago

UFM is a simple transformer model, trained end-to-end, that directly regresses pixel displacement images (flow) and covisibility. It can be applied to both optical flow and wide-baseline matching tasks with high accuracy and efficiency.

Project Page: https://uniflowmatch.github.io/
HF Interactive Demo: https://huggingface.co/spaces/infinity1096/UFM