---
title: U²-Net MVTec LOCO Foreground Segmentation
tags:
- computer-vision
- image-segmentation
- anomaly-detection
- u2net
- mvtec-loco
- pytorch
license: apache-2.0
language: en
library_name: pytorch
---
# U²-Net MVTec LOCO Foreground Segmentation
This repository contains a complete tool for generating binary foreground masks from the MVTec LOCO anomaly detection dataset using U²-Net.
## 🚀 Quick Start
### Installation
```bash
# Clone from HuggingFace
git clone https://huggingface.co/zhiqing0205/u2net-mvtec-loco-segmentation
cd u2net-mvtec-loco-segmentation
# Install dependencies
pip install torch torchvision opencv-python scikit-image matplotlib numpy pillow huggingface_hub
# Run segmentation (model auto-downloads)
python mvtec_loco_fg_segmentation.py
```
### Download Options
**Option 1: Auto-download (Recommended)**
```python
from download_from_hf import download_complete_repo, download_u2net_model

# Download the model weights only
download_u2net_model()

# Or download the complete repository
download_complete_repo()
```
**Option 2: Manual download**
```bash
python download_from_hf.py --model-only
# or
python download_from_hf.py --complete-repo
```
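If the helper script is not at hand, the weights can probably also be fetched directly with `huggingface_hub`. The following is a minimal sketch, not the project's own download code: it assumes the checkpoint sits at `saved_models/u2net/u2net.pth` in the Hub repository (the path listed under Repository Contents), and `fetch_weights` is a hypothetical helper name.

```python
from pathlib import Path

REPO_ID = "zhiqing0205/u2net-mvtec-loco-segmentation"  # taken from the clone URL
WEIGHTS_FILE = "saved_models/u2net/u2net.pth"          # assumed path inside the Hub repo


def fetch_weights(local_dir: str = ".") -> Path:
    """Download the U2NET checkpoint and return its local path (hypothetical helper)."""
    # Imported lazily so that defining this function needs no extra packages.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=REPO_ID, filename=WEIGHTS_FILE, local_dir=local_dir)
    return Path(path)
```

Calling `fetch_weights()` from the repository root should place the file under `saved_models/u2net/`, which is where the repository layout expects it.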
## 📁 Repository Contents
```
├── mvtec_loco_fg_segmentation.py    # Main segmentation script
├── download_from_hf.py              # HuggingFace download utility
├── model/                           # U2NET model architecture
├── data_loader.py                   # Data loading utilities
├── saved_models/
│   └── u2net/
│       └── u2net.pth                # Pre-trained U2NET weights (169MB)
├── README.md                        # English documentation
├── README_CN.md                     # Chinese documentation
└── ...
```
## 🎯 Features
- **Complete Dataset Processing**: All MVTec LOCO categories
- **Binary Mask Output**: Standard 0/255 masks in grayscale
- **GPU/CPU Support**: Automatic hardware detection
- **Configurable Parameters**: Threshold, categories, splits
- **Auto-download**: No manual model download needed
## 💻 Usage
### Basic Usage
```bash
python mvtec_loco_fg_segmentation.py
```
### Advanced Usage
```bash
# Custom parameters
python mvtec_loco_fg_segmentation.py \
--threshold 0.3 \
--categories breakfast_box juice_bottle \
--splits test
# Show all options
python mvtec_loco_fg_segmentation.py -h
```
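For readers who want to wrap or reuse these options, the flags above could be parsed roughly as follows. This is a sketch only, not the script's actual parser: the flag names come from the command above, while the default threshold, the split names, and `build_parser` itself are assumptions.

```python
import argparse

# Category names are the ones listed in this README; defaults are assumptions.
CATEGORIES = ["breakfast_box", "screw_bag", "juice_bottle", "splicing_connectors", "pushpins"]
SPLITS = ["train", "validation", "test"]  # assumed MVTec LOCO split names


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Foreground segmentation for MVTec LOCO")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="saliency cut-off in [0, 1] (default value is assumed)")
    parser.add_argument("--categories", nargs="+", choices=CATEGORIES, default=CATEGORIES,
                        help="categories to process")
    parser.add_argument("--splits", nargs="+", choices=SPLITS, default=SPLITS,
                        help="dataset splits to process")
    return parser


# Parse the same arguments as the "Custom parameters" example.
args = build_parser().parse_args(
    ["--threshold", "0.3", "--categories", "breakfast_box", "juice_bottle", "--splits", "test"]
)
print(args.threshold, args.categories, args.splits)
```

Using `choices` makes a misspelled category fail immediately with a usage message instead of silently processing nothing.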
## 📊 Model Information
- **Architecture**: U²-Net (U Square Net)
- **Model Size**: 169MB
- **Input Size**: 320×320 (auto-resized)
- **Output**: Binary masks (0/255)
- **Task**: Salient object detection → Foreground segmentation
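The step from a saliency map to a 0/255 mask can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions, not the script's actual post-processing: it assumes the network output is min-max normalised to [0, 1] and then cut at the threshold, and `saliency_to_mask` is a hypothetical name.

```python
import numpy as np


def saliency_to_mask(saliency: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn a float saliency map into a 0/255 uint8 mask (illustrative only)."""
    lo, hi = float(saliency.min()), float(saliency.max())
    # A constant map has no foreground to separate, so return an empty mask.
    if hi - lo < 1e-8:
        return np.zeros(saliency.shape, dtype=np.uint8)
    # Min-max normalise to [0, 1] before applying the threshold.
    normalised = (saliency - lo) / (hi - lo)
    return np.where(normalised > threshold, 255, 0).astype(np.uint8)


demo = np.array([[0.1, 0.2], [0.6, 0.9]], dtype=np.float32)
mask = saliency_to_mask(demo, threshold=0.3)
print(mask)
```

Under this scheme a lower threshold keeps more pixels as foreground, and a higher one keeps only the most salient regions.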
## 🏷️ Supported Categories
- `breakfast_box`
- `screw_bag`
- `juice_bottle`
- `splicing_connectors`
- `pushpins`
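To see which images a given selection of categories would cover, the dataset can be walked with `pathlib`. The sketch below assumes the usual MVTec LOCO layout of `<root>/<category>/<split>/<subfolder>/*.png` with splits named `train`, `validation`, and `test`; `iter_images` is a hypothetical helper, not a function from this repository.

```python
import tempfile
from pathlib import Path
from typing import Iterator, Sequence

CATEGORIES = ("breakfast_box", "screw_bag", "juice_bottle", "splicing_connectors", "pushpins")


def iter_images(root: Path, categories: Sequence[str] = CATEGORIES,
                splits: Sequence[str] = ("train", "validation", "test")) -> Iterator[Path]:
    """Yield PNG paths under <root>/<category>/<split>/<subfolder>/ (assumed layout)."""
    for category in categories:
        for split in splits:
            split_dir = root / category / split
            if split_dir.is_dir():  # skip missing folders instead of failing
                yield from sorted(split_dir.glob("*/*.png"))


# Tiny self-contained demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "pushpins" / "test" / "good" / "000.png"
    sample.parent.mkdir(parents=True)
    sample.touch()
    found = [p.relative_to(tmp).as_posix() for p in iter_images(Path(tmp))]
    print(found)
```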
## 📈 Performance
- **GPU Processing**: ~2-3 seconds per image
- **CPU Processing**: ~10-15 seconds per image
- **Memory Usage**: ~200MB GPU memory per image
- **Total Dataset**: ~5000+ images
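Taking the per-image timings and the 5000-image figure above at face value, a full sequential run can be estimated with simple arithmetic. The snippet is only a back-of-the-envelope sketch; `hours` is a hypothetical helper and real throughput depends on hardware and image size.

```python
def hours(num_images: int, seconds_per_image: float) -> float:
    """Wall-clock hours for processing images one after another."""
    return num_images * seconds_per_image / 3600


NUM_IMAGES = 5000  # lower bound quoted in the list above
for device, (fast, slow) in {"GPU": (2, 3), "CPU": (10, 15)}.items():
    print(f"{device}: {hours(NUM_IMAGES, fast):.1f}-{hours(NUM_IMAGES, slow):.1f} h")
```

This works out to roughly 2.8 to 4.2 hours on GPU and 13.9 to 20.8 hours on CPU, so restricting a run with `--categories` or `--splits` is worthwhile when only part of the dataset is needed.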
## 📖 Citation
```bibtex
@article{Qin_2020_PR,
title = {U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection},
author = {Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin},
journal = {Pattern Recognition},
volume = {106},
pages = {107404},
year = {2020}
}
```
## 📜 License
Apache-2.0 License
## 🔗 Links
- [Original U²-Net Paper](https://arxiv.org/pdf/2005.09007.pdf)
- [MVTec LOCO Dataset](https://www.mvtec.com/company/research/datasets/mvtec-loco)
- [Original U²-Net GitHub Repository](https://github.com/NathanUA/U-2-Net)