Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation (ICCV 2025)
Luca Barsellotti*, Lorenzo Bianchi*, Nicola Messina, Fabio Carrara, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Rita Cucchiara
Project Page | Paper | Code

Installation
# Create a new environment with Python 3.10
conda create --name talk2dino python=3.10 -c conda-forge
conda activate talk2dino
# Install compilers for C++/CUDA extensions
conda install -c conda-forge "gxx_linux-64=11.*" "gcc_linux-64=11.*"
# Install CUDA toolkit and cuDNN
conda install -c nvidia/label/cuda-11.7.0 cuda
conda install -c nvidia/label/cuda-11.7.0 cuda-nvcc
conda install -c conda-forge cudnn cudatoolkit=11.7.0
# Install PyTorch 2.1.0 from the CUDA 11.8 (cu118) wheel index
# Note: this exact PyTorch/CUDA pairing is crucial, as it matches the prebuilt mmcv-full 1.7.2 wheel installed below
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install -r requirements.txt
pip install -U openmim
mim install mmengine
# Install a compatible version of mmcv-full (1.7.2) for PyTorch 2.1
pip install mmcv-full==1.7.2 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1.0/index.html
# Install mmsegmentation
pip install mmsegmentation==0.30.0
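Because the steps above depend on exact version pins, it can help to confirm what actually ended up in the environment. The following is a minimal sketch, not part of the repository: the names `EXPECTED`, `installed_version`, and `check_pins` are hypothetical, and the pins are copied from the commands above. It assumes that PyTorch wheels from the cu118 index report a local version tag such as `2.1.0+cu118`, so only the part before `+` is compared.

```python
from importlib import metadata

# Pins copied from the installation commands above (hypothetical helper, not repo code).
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "mmcv-full": "1.7.2",
    "mmsegmentation": "0.30.0",
}


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected, lookup=installed_version):
    """Return a list of (package, wanted, found) tuples, one per mismatch."""
    problems = []
    for package, wanted in expected.items():
        found = lookup(package)
        # Assumption: a local tag like "+cu118" may follow the version; ignore it.
        base = found.split("+")[0] if found else None
        if base != wanted:
            problems.append((package, wanted, found))
    return problems


if __name__ == "__main__":
    for package, wanted, found in check_pins(EXPECTED):
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

Running the script inside the `talk2dino` environment prints one line per package whose version differs from the pin and prints nothing when every pin matches.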
Reference
If you find this code useful, please cite the following paper:
@misc{barsellotti2024talkingdinobridgingselfsupervised,
title={Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation},
author={Luca Barsellotti and Lorenzo Bianchi and Nicola Messina and Fabio Carrara and Marcella Cornia and Lorenzo Baraldi and Fabrizio Falchi and Rita Cucchiara},
year={2024},
eprint={2411.19331},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.19331},
}