arxiv:2507.03683

On the rankability of visual embeddings

Published on Jul 4

Submitted by Gigglingface on Jul 8

Authors:

Arnas Uselis ,

Abstract

AI-generated summary: Visual embedding models often capture continuous, ordinal attributes along specific axes, enabling effective image ranking with minimal supervision.

We study whether visual embedding models capture continuous, ordinal attributes along linear directions, which we term _rank axes_. We define a model as _rankable_ for an attribute if projecting embeddings onto such an axis preserves the attribute's order. Across 7 popular encoders and 9 datasets with attributes like age, crowd count, head pose, aesthetics, and recency, we find that many embeddings are inherently rankable. Surprisingly, a small number of samples, or even just two extreme examples, often suffice to recover meaningful rank axes, without full-scale supervision. These findings open up new use cases for image ranking in vector databases and motivate further study into the structure and learning of rankable embeddings. Our code is available at https://github.com/aktsonthalia/rankable-vision-embeddings.
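The idea of a rank axis can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it uses synthetic vectors in place of real encoder outputs, and the helper names `rank_axis_from_extremes`, `rank_axis_from_regression`, and `spearman` are hypothetical. It assumes the attribute varies roughly linearly along one direction of the embedding space, recovers an axis either from two extreme examples or from a least-squares fit on all labels, and scores each axis by the Spearman correlation between projections and attribute values.

```python
import numpy as np

def rank_axis_from_extremes(low_emb, high_emb):
    """Unit direction pointing from a low-attribute example to a high one."""
    direction = high_emb - low_emb
    return direction / np.linalg.norm(direction)

def rank_axis_from_regression(embeddings, labels):
    """Unit direction from a least-squares fit on (embedding, label) pairs."""
    centered = embeddings - embeddings.mean(axis=0)
    w, *_ = np.linalg.lstsq(centered, labels - labels.mean(), rcond=None)
    return w / np.linalg.norm(w)

def spearman(a, b):
    """Spearman rank correlation; assumes no ties in either input."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

# Synthetic stand-in for encoder outputs (assumption: one linear
# direction carries the attribute, plus isotropic Gaussian noise).
rng = np.random.default_rng(0)
n, d = 500, 64
attribute = rng.uniform(0.0, 100.0, size=n)  # e.g. an age-like value
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
embeddings = 5.0 * np.outer(attribute / 100.0, true_dir)
embeddings += 0.1 * rng.normal(size=(n, d))

# Axis from just the two extreme examples.
axis_two = rank_axis_from_extremes(
    embeddings[np.argmin(attribute)], embeddings[np.argmax(attribute)]
)
# Axis from all labelled samples.
axis_reg = rank_axis_from_regression(embeddings, attribute)

# Rankability score: how well projections preserve the attribute's order.
rho_two = spearman(embeddings @ axis_two, attribute)
rho_reg = spearman(embeddings @ axis_reg, attribute)
print(f"two extremes: {rho_two:.3f}, regression: {rho_reg:.3f}")
```

With real data, `embeddings` would come from an encoder such as CLIP or DINO and `attribute` from dataset annotations; ranking images then amounts to sorting them by `embeddings @ axis`.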

Community

Gigglingface (paper author, paper submitter), 1 day ago

Our work examines the rankability of features by continuous attributes in models like CLIP and DINO, showing that ordinal attributes often lie along a single ranking axis in the embedding space!