---
title: U²-Net MVTec LOCO Foreground Segmentation
tags:
- computer-vision
- image-segmentation
- anomaly-detection
- u2net
- mvtec-loco
- pytorch
license: apache-2.0
language: en
library_name: pytorch
---

# U²-Net MVTec LOCO Foreground Segmentation

This repository contains a complete tool for generating binary foreground masks from the MVTec LOCO anomaly detection dataset using U²-Net.

## Quick Start

### Installation
```bash
# Clone from HuggingFace
git clone https://huggingface.co/zhiqing0205/u2net-mvtec-loco-segmentation
cd u2net-mvtec-loco-segmentation

# Install dependencies
pip install torch torchvision opencv-python scikit-image matplotlib numpy pillow huggingface_hub

# Run segmentation (model auto-downloads)
python mvtec_loco_fg_segmentation.py
```

### Download Options

**Option 1: Auto-download (Recommended)**
```python
from download_from_hf import download_complete_repo, download_u2net_model

# Download the model weights only
download_u2net_model()

# Or download the complete repository
download_complete_repo()
```

**Option 2: Manual download**
```bash
python download_from_hf.py --model-only
# or
python download_from_hf.py --complete-repo
```
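
For readers curious about what an auto-download step of this kind involves, here is a minimal sketch built on `huggingface_hub`. It assumes the weights live at `saved_models/u2net/u2net.pth` inside the Hub repository, as the repository layout suggests. The name `fetch_u2net_weights` is hypothetical and is not the API of `download_from_hf.py`.

```python
from pathlib import Path

# Assumptions: the repo id is taken from the clone URL above, and the weights
# path mirrors the repository layout. Neither has been checked against the Hub.
REPO_ID = "zhiqing0205/u2net-mvtec-loco-segmentation"
WEIGHTS_FILE = "saved_models/u2net/u2net.pth"


def fetch_u2net_weights(local_dir: str = ".") -> Path:
    """Return the local weights path, downloading only if the file is missing."""
    target = Path(local_dir) / WEIGHTS_FILE
    if target.is_file():
        return target  # already present, so skip the network call

    # Imported lazily so the cached path works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    downloaded = hf_hub_download(
        repo_id=REPO_ID,
        filename=WEIGHTS_FILE,
        local_dir=local_dir,
    )
    return Path(downloaded)
```

Checking for the file first keeps repeated runs offline-friendly; the roughly 169MB download then happens at most once per directory.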

## Repository Contents

```
├── mvtec_loco_fg_segmentation.py    # Main segmentation script
├── download_from_hf.py              # HuggingFace download utility
├── model/                           # U2NET model architecture
├── data_loader.py                   # Data loading utilities
├── saved_models/
│   └── u2net/
│       └── u2net.pth                # Pre-trained U2NET weights (169MB)
├── README.md                        # English documentation
├── README_CN.md                     # Chinese documentation
└── ...
```

## Features

- **Complete Dataset Processing**: All MVTec LOCO categories
- **Binary Mask Output**: Standard 0/255 masks in grayscale
- **GPU/CPU Support**: Automatic hardware detection
- **Configurable Parameters**: Threshold, categories, splits
- **Auto-download**: No manual model download needed
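
To make the binary mask output and the threshold parameter concrete, the sketch below turns a saliency map into a 0/255 mask. It assumes the network output is a 2-D float array and applies min-max normalization, a common step for saliency maps. The helper name `saliency_to_mask` and the default threshold of 0.5 are assumptions; the script's actual post-processing may differ.

```python
import numpy as np


def saliency_to_mask(saliency: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Convert a float saliency map into a uint8 mask with values 0 and 255."""
    saliency = saliency.astype(np.float32)
    lo, hi = float(saliency.min()), float(saliency.max())
    if hi > lo:
        # Min-max normalize to [0, 1] so the threshold has a fixed meaning.
        saliency = (saliency - lo) / (hi - lo)
    else:
        # A constant map carries no foreground information.
        saliency = np.zeros_like(saliency)
    return np.where(saliency > threshold, 255, 0).astype(np.uint8)


# Tiny 2x2 example: values above the threshold become foreground (255).
pred = np.array([[0.0, 0.2], [0.6, 1.0]])
mask = saliency_to_mask(pred, threshold=0.3)
```

A lower threshold keeps more of the uncertain border pixels as foreground, while a higher one yields tighter masks.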

## Usage

### Basic Usage
```bash
python mvtec_loco_fg_segmentation.py
```

### Advanced Usage
```bash
# Custom parameters
python mvtec_loco_fg_segmentation.py \
    --threshold 0.3 \
    --categories breakfast_box juice_bottle \
    --splits test

# Show all options
python mvtec_loco_fg_segmentation.py -h
```
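
The flags in the advanced example map naturally onto `argparse`. The following is a sketch of how such a command-line interface could be declared, not the script's actual parser. Only `--threshold`, `--categories`, and `--splits` come from the example above; the default values, the split names, and the `build_parser` name are assumptions.

```python
import argparse

CATEGORIES = ["breakfast_box", "screw_bag", "juice_bottle",
              "splicing_connectors", "pushpins"]
SPLITS = ["train", "validation", "test"]  # assumed MVTec LOCO split folders


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Generate foreground masks for MVTec LOCO with U2-Net")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="binarization threshold in [0, 1] (assumed default)")
    parser.add_argument("--categories", nargs="+", choices=CATEGORIES,
                        default=CATEGORIES, help="categories to process")
    parser.add_argument("--splits", nargs="+", choices=SPLITS,
                        default=SPLITS, help="dataset splits to process")
    return parser


# Parse the same arguments as the advanced usage example.
args = build_parser().parse_args(
    ["--threshold", "0.3", "--categories", "breakfast_box", "juice_bottle",
     "--splits", "test"])
```

Declaring `choices` lets the parser reject a misspelled category name before any images are processed.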

## Model Information

- **Architecture**: U²-Net (U Square Net)
- **Model Size**: 169MB
- **Input Size**: 320×320 (auto-resized)
- **Output**: Binary masks (0/255)
- **Task**: Salient object detection → Foreground segmentation

## Supported Categories

- `breakfast_box`
- `screw_bag`
- `juice_bottle`
- `splicing_connectors`
- `pushpins`
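
Each category is a top-level folder in the dataset. The sketch below enumerates the images for chosen categories and splits, assuming the standard MVTec LOCO layout `<root>/<category>/<split>/<subfolder>/*.png`. `iter_images` is an illustrative helper, not a function from this repository.

```python
from pathlib import Path
from typing import Iterator, Sequence


def iter_images(root: str, categories: Sequence[str],
                splits: Sequence[str]) -> Iterator[Path]:
    """Yield image paths under <root>/<category>/<split>/<subfolder>/*.png."""
    for category in categories:
        for split in splits:
            split_dir = Path(root) / category / split
            if not split_dir.is_dir():
                continue  # skip categories or splits that are not on disk
            # Subfolders are e.g. "good" or "logical_anomalies".
            yield from sorted(split_dir.glob("*/*.png"))
```

Sorting the paths makes the processing order deterministic, which helps when a long run has to be resumed.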

## Performance

- **GPU Processing**: ~2-3 seconds per image
- **CPU Processing**: ~10-15 seconds per image
- **Memory Usage**: ~200MB GPU memory per image
- **Total Dataset**: 5000+ images
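
These per-image figures imply a multi-hour run over the full dataset. A quick back-of-the-envelope estimate, taking the 5000-image count and the timing ranges above at face value and assuming images are processed one at a time:

```python
NUM_IMAGES = 5000  # approximate dataset size quoted above


def total_hours(seconds_per_image: float, num_images: int = NUM_IMAGES) -> float:
    """Total wall-clock time in hours for a sequential run."""
    return seconds_per_image * num_images / 3600


gpu_range = (total_hours(2), total_hours(3))    # about 2.8 to 4.2 hours
cpu_range = (total_hours(10), total_hours(15))  # about 13.9 to 20.8 hours
print(f"GPU: {gpu_range[0]:.1f}-{gpu_range[1]:.1f} h")
print(f"CPU: {cpu_range[0]:.1f}-{cpu_range[1]:.1f} h")
```

Restricting a run with `--categories` or `--splits` shortens it proportionally to the number of images selected.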

## Citation

```bibtex
@article{Qin_2020_PR,
  title = {U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection},
  author = {Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin},
  journal = {Pattern Recognition},
  volume = {106},
  pages = {107404},
  year = {2020}
}
```

## License

Apache-2.0 License

## Links

- [Original U²-Net Paper](https://arxiv.org/pdf/2005.09007.pdf)
- [MVTec LOCO Dataset](https://www.mvtec.com/company/research/datasets/mvtec-loco)
- [GitHub Repository](https://github.com/NathanUA/U-2-Net)