This repository contains the MobileCLIP-B checkpoint.
Highlights
Our smallest variant, MobileCLIP-S0, obtains zero-shot performance similar to OpenAI's ViT-B/16 model while being 4.8x faster and 2.8x smaller.
MobileCLIP-S2 obtains better average zero-shot performance than SigLIP's ViT-B/16 model while being 2.3x faster and 2.1x smaller, and is trained with 3x fewer seen samples.
MobileCLIP-B(LT) attains zero-shot ImageNet performance of 77.2%, which is significantly better than recent works like DFN and SigLIP with similar architectures, or even OpenAI's ViT-L/14@336.
First, download the desired checkpoint: visit one of the links in the table above, click the Files and versions tab, and download the PyTorch checkpoint.
For programmatic downloading, if you have huggingface_hub installed, you can also run:
huggingface-cli download pcuenq/MobileCLIP-B
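The same download can probably be done from Python rather than the shell. The following is a minimal sketch, assuming huggingface_hub is installed; it uses the library's snapshot_download function to fetch the whole repository. The helper name download_checkpoint is hypothetical and only for illustration, and the sketch does not assume which file in the repository is the PyTorch checkpoint.

```python
from pathlib import Path
from typing import Optional

# Repository ID taken from the CLI command above.
REPO_ID = "pcuenq/MobileCLIP-B"


def download_checkpoint(repo_id: str = REPO_ID, local_dir: Optional[str] = None) -> Path:
    """Hypothetical helper: download all files in the repo and return the local folder.

    If local_dir is None, files go to the default Hugging Face cache.
    """
    # Imported lazily so this module can be loaded even if huggingface_hub
    # is not installed; the import only happens when a download is requested.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))


# Example usage (requires network access):
# checkpoint_dir = download_checkpoint()
# print(sorted(p.name for p in checkpoint_dir.iterdir()))
```

Listing the returned folder, as in the commented usage lines, is one way to locate the checkpoint file before passing its path to ml-mobileclip.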
Then, install ml-mobileclip by following the instructions in the repo. It uses an API similar to open_clip's.
You can run inference with a code snippet like the following: