Model Card for CLIP-based Aesthetic Predictor
A simple MLP intended to run on CLIP embeddings to predict the "aesthetic quality" of an image (how much people like it on average).
Trained by Christoph Schuhmann and adapted for the Vision Data Curation project.
For more information see: https://github.com/christophschuhmann/improved-aesthetic-predictor
Model Details
- Model Type: Aesthetic score regression model
- Input: OpenAI CLIP embeddings (vit_l14_pn_quick_gelu_openai-clip)
- Output: A score between 0 and 10, where higher values correspond to more aesthetic images
- Original authorship: Adapted from Christoph Schuhmann's MLP Aesthetic Score Predictor
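The predictor head itself is a small MLP. The sketch below follows the layer layout of the original improved-aesthetic-predictor repository (a 768-dimensional CLIP image embedding in, a scalar score out); the exact architecture of this adapted model may differ, so treat it as illustrative:

```python
import torch
from torch import nn


class AestheticMLP(nn.Module):
    """Small MLP regression head: CLIP image embedding -> scalar aesthetic score."""

    def __init__(self, input_size: int = 768) -> None:
        super().__init__()
        # Layer sizes follow the original improved-aesthetic-predictor repository
        self.layers = nn.Sequential(
            nn.Linear(input_size, 1024),
            nn.Dropout(0.2),
            nn.Linear(1024, 128),
            nn.Dropout(0.2),
            nn.Linear(128, 64),
            nn.Dropout(0.1),
            nn.Linear(64, 16),
            nn.Linear(16, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, input_size) CLIP image embeddings
        return self.layers(x)
```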
Model Usage
This model operates on CLIP image embeddings rather than raw pixels. To run inference with the Birder framework:
```sh
# Download the CLIP backbone
python -m birder.tools download-model vit_l14_pn_quick_gelu_openai-clip

# Run prediction on a dataset
python -m birder.scripts.predict \
    -n vit_l14_pn_quick_gelu \
    -t openai-clip \
    --simple-crop \
    --gpu \
    --parallel \
    --batch-size 256 \
    --chunk-size 50000 \
    --amp \
    --amp-dtype bfloat16 \
    --save-logits \
    --suffix optional-dataset-name \
    path/to/dataset

# The aesthetic predictor can now be run on the saved logits
```
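With the embeddings saved, the aesthetic head can be applied offline. A minimal sketch, reusing the AestheticMLP class from the Model Details section and assuming the embeddings are available as a NumPy array `embeddings.npy` and the head weights as `aesthetic_predictor.pt` (both file names are placeholders; the actual output format of the Birder predict script may differ):

```python
import numpy as np
import torch

# Load the CLIP embeddings produced by the prediction step (placeholder file name)
embeddings = np.load("embeddings.npy")  # shape: (num_images, 768)

# Load the aesthetic head (AestheticMLP as sketched under Model Details)
model = AestheticMLP(input_size=embeddings.shape[1])
model.load_state_dict(torch.load("aesthetic_predictor.pt", map_location="cpu"))
model.eval()

with torch.no_grad():
    x = torch.from_numpy(embeddings).float()
    # L2-normalize the embeddings, as the original predictor does before scoring
    x = x / x.norm(dim=-1, keepdim=True)
    scores = model(x).squeeze(-1)  # shape: (num_images,), roughly in the 0-10 range

print(scores[:10])
```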
Intended Use
- Primary use case: Ranking or filtering images by aesthetic appeal, dataset curation, and training data selection (a filtering sketch follows below).
- Recommended scope: Research, dataset preparation, and large-scale data analysis.
- Not intended for: Measuring artistic merit, cultural value, or the taste preferences of specific individuals.
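For curation, the scores are typically used either with an absolute threshold or to rank a pool and keep the top fraction. A minimal sketch, assuming `scores` is the per-image tensor from the inference sketch above and `paths` is a parallel list of image paths (both hypothetical names):

```python
import torch

threshold = 5.0  # example cut-off; tune per dataset and use case
keep_mask = scores >= threshold
kept_paths = [p for p, keep in zip(paths, keep_mask.tolist()) if keep]

# Alternatively, keep the top 20% of images by predicted score
k = max(1, int(0.2 * scores.numel()))
top_indices = torch.topk(scores, k).indices
top_paths = [paths[i] for i in top_indices.tolist()]
```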
Citation
```bibtex
@misc{christophschuhmann2022improved-aesthetic-predictor,
    author = {Christoph Schuhmann},
    title = {MLP Aesthetic Score Predictor},
    year = {2022},
    url = {https://github.com/christophschuhmann/improved-aesthetic-predictor},
    note = {Accessed: August 22, 2025},
}
```