# Depth Anything for Semantic Segmentation

We use our Depth Anything pre-trained ViT-L encoder to fine-tune downstream semantic segmentation models.

## Performance

### Cityscapes

Note that our results are obtained *without* Mapillary pre-training.

| Method | Encoder | mIoU (single-scale) | mIoU (multi-scale) |
|:-:|:-:|:-:|:-:|
| SegFormer | MiT-B5 | 82.4 | 84.0 |
| Mask2Former | Swin-L | 83.3 | 84.3 |
| OneFormer | Swin-L | 83.0 | 84.4 |
| OneFormer | ConvNeXt-XL | 83.6 | 84.6 |
| DDP | ConvNeXt-L | 83.2 | 83.9 |
| **Ours** | ViT-L | **84.8** | **86.2** |
### ADE20K

| Method | Encoder | mIoU |
|:-:|:-:|:-:|
| SegFormer | MiT-B5 | 51.0 |
| Mask2Former | Swin-L | 56.4 |
| UperNet | BEiT-L | 56.3 |
| ViT-Adapter | BEiT-L | 58.3 |
| OneFormer | Swin-L | 57.4 |
| OneFormer | ConvNeXt-XL | 57.4 |
| **Ours** | ViT-L | **59.4** |
## Pre-trained models

- [Cityscapes-ViT-L-mIoU-86.4](https://huggingface.co/spaces/LiheYoung/Depth-Anything/blob/main/checkpoints_semseg/cityscapes_vitl_mIoU_86.4.pth)
- [ADE20K-ViT-L-mIoU-59.4](https://huggingface.co/spaces/LiheYoung/Depth-Anything/blob/main/checkpoints_semseg/ade20k_vitl_mIoU_59.4.pth)
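If you prefer to fetch these checkpoints programmatically rather than through the browser, here is a minimal sketch using `huggingface_hub`; the repo id and file paths are taken from the links above, and downloading this way is a convenience assumption, not part of our pipeline:

```python
# Minimal sketch: download the fine-tuned checkpoints from the Hugging Face
# Space that hosts them (pip install huggingface_hub first).
from huggingface_hub import hf_hub_download

cityscapes_ckpt = hf_hub_download(
    repo_id="LiheYoung/Depth-Anything",
    repo_type="space",  # the files live in a Space, not a model repo
    filename="checkpoints_semseg/cityscapes_vitl_mIoU_86.4.pth",
)
ade20k_ckpt = hf_hub_download(
    repo_id="LiheYoung/Depth-Anything",
    repo_type="space",
    filename="checkpoints_semseg/ade20k_vitl_mIoU_59.4.pth",
)
print(cityscapes_ckpt, ade20k_ckpt)
```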
## Installation

Please refer to [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/en/get_started.md#installation) for instructions. *Do not forget to install ``mmdet`` to support ``Mask2Former``:*

```bash
pip install "mmdet>=3.0.0rc4"
```
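A quick sanity check that the two packages can see each other (assuming mmseg 1.x, whose Mask2Former head depends on mmdet):

```python
# Verify that mmseg and mmdet are installed and compatible: importing
# Mask2FormerHead fails if mmdet is missing or too old.
import mmdet
import mmseg
from mmseg.models.decode_heads import Mask2FormerHead

print("mmseg", mmseg.__version__, "| mmdet", mmdet.__version__)
```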
After installation:

- move our [config/depth_anything](./config/depth_anything/) to mmseg's [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs)
- move our [dinov2.py](./dinov2.py) to mmseg's [backbones](https://github.com/open-mmlab/mmsegmentation/tree/main/mmseg/models/backbones)
- add DINOv2 in mmseg's [models/backbones/\_\_init\_\_.py](https://github.com/open-mmlab/mmsegmentation/blob/main/mmseg/models/backbones/__init__.py) (see the sketch after this list)
- download our provided [torchhub](https://github.com/LiheYoung/Depth-Anything/tree/main/torchhub) directory and put it at the root of your working directory
- download the [Depth Anything pre-trained model](https://huggingface.co/spaces/LiheYoung/Depth-Anything/blob/main/checkpoints/depth_anything_vitl14.pth) (to initialize the encoder) and put it under the ``checkpoints`` folder
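For the `__init__.py` step, the edit looks roughly like the following; we assume the backbone class in `dinov2.py` is named `DINOv2` (check the file for the actual class name):

```python
# In mmseg/models/backbones/__init__.py: import the new backbone so that
# mmseg's MODELS registry picks it up, and expose it in __all__.
from .dinov2 import DINOv2  # assumes dinov2.py defines a DINOv2 class

# Keep the existing entries and append ours.
__all__ += ['DINOv2']
```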
For training or inference with our pre-trained models, please refer to MMSegmentation [instructions](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/en/user_guides/4_train_test.md).
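As a concrete example, MMSegmentation's high-level inference API can be used like this; the config filename below is a placeholder, so substitute the actual file you copied from [config/depth_anything](./config/depth_anything/):

```python
# Minimal inference sketch with MMSegmentation's high-level API (mmseg 1.x).
from mmseg.apis import inference_model, init_model

config = 'configs/depth_anything/your_copied_config.py'  # placeholder path
checkpoint = 'checkpoints_semseg/cityscapes_vitl_mIoU_86.4.pth'

model = init_model(config, checkpoint, device='cuda:0')
result = inference_model(model, 'demo.png')  # a single image path or array
print(result.pred_sem_seg.data.shape)        # (1, H, W) predicted label map
```

For training, mmseg's standard launcher (`bash tools/dist_train.sh <config> <num_gpus>` from the MMSegmentation repo root) is the usual route; see the linked instructions for the full set of options.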