metadata
title: U²-Net MVTec LOCO Foreground Segmentation
tags:
- computer-vision
- image-segmentation
- anomaly-detection
- u2net
- mvtec-loco
- pytorch
license: apache-2.0
language: en
library_name: pytorch
U²-Net MVTec LOCO Foreground Segmentation
This repository contains a complete tool for generating binary foreground masks from the MVTec LOCO anomaly detection dataset using U²-Net.
Quick Start
Installation
# Clone from HuggingFace
git clone https://huggingface.co/zhiqing0205/u2net-mvtec-loco-segmentation
cd u2net-mvtec-loco-segmentation
# Install dependencies
pip install torch torchvision opencv-python scikit-image matplotlib numpy pillow huggingface_hub
# Run segmentation (model auto-downloads)
python mvtec_loco_fg_segmentation.py
Download Options
Option 1: Auto-download (Recommended)
from download_from_hf import download_complete_repo, download_u2net_model
# Download the model weights only
download_u2net_model()
# Or download the complete repository
download_complete_repo()
Option 2: Manual download
python download_from_hf.py --model-only
# or
python download_from_hf.py --complete-repo
Repository Contents
├── mvtec_loco_fg_segmentation.py   # Main segmentation script
├── download_from_hf.py             # HuggingFace download utility
├── model/                          # U2NET model architecture
├── data_loader.py                  # Data loading utilities
├── saved_models/
│   └── u2net/
│       └── u2net.pth               # Pre-trained U2NET weights (169MB)
├── README.md                       # English documentation
├── README_CN.md                    # Chinese documentation
└── ...
Features
- Complete Dataset Processing: All MVTec LOCO categories
- Binary Mask Output: Standard 0/255 masks in grayscale
- GPU/CPU Support: Automatic hardware detection
- Configurable Parameters: Threshold, categories, splits
- Auto-download: No manual model download needed
Usage
Basic Usage
python mvtec_loco_fg_segmentation.py
Advanced Usage
# Custom parameters
python mvtec_loco_fg_segmentation.py \
--threshold 0.3 \
--categories breakfast_box juice_bottle \
--splits test
# Show all options
python mvtec_loco_fg_segmentation.py -h
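The three flags used above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the script's actual parser: the function name `build_parser`, the default threshold of 0.5, and the split names are assumptions made for illustration. Run the script with `-h` for the authoritative option list.

```python
import argparse

# Category names as listed in this README.
CATEGORIES = ["breakfast_box", "screw_bag", "juice_bottle",
              "splicing_connectors", "pushpins"]
# Assumed split names; check the dataset folder or `-h` for the real ones.
SPLITS = ["train", "validation", "test"]


def build_parser():
    """Hypothetical parser mirroring --threshold, --categories and --splits."""
    parser = argparse.ArgumentParser(description="Foreground segmentation sketch")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="cut-off applied to the saliency map (default is assumed)")
    parser.add_argument("--categories", nargs="+", choices=CATEGORIES,
                        default=CATEGORIES, help="categories to process")
    parser.add_argument("--splits", nargs="+", choices=SPLITS,
                        default=SPLITS, help="dataset splits to process")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--threshold", "0.3", "--splits", "test"])
print(args.threshold, args.splits)
```

Restricting `--categories` with `choices` means a misspelled category name is rejected up front rather than producing an empty run.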
Model Information
- Architecture: U²-Net (U Square Net)
- Model Size: 169MB
- Input Size: 320×320 (auto-resized)
- Output: Binary masks (0/255)
- Task: Salient object detection → Foreground segmentation
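The step from a saliency map to a 0/255 mask can be sketched as below. This is a minimal illustration that assumes the network output is available as a 2-D float array. The function name `saliency_to_mask`, the min-max normalization, and the default threshold of 0.5 are assumptions and are not taken from the script. Resizing the mask back to the original image size is omitted.

```python
import numpy as np


def saliency_to_mask(saliency, threshold=0.5):
    """Turn a float saliency map into a uint8 mask with values 0 and 255."""
    saliency = np.asarray(saliency, dtype=np.float32)
    lo, hi = float(saliency.min()), float(saliency.max())
    if hi > lo:
        # Min-max normalize to [0, 1] so the threshold is scale independent.
        saliency = (saliency - lo) / (hi - lo)
    else:
        # Constant map: nothing stands out, so treat everything as background.
        saliency = np.zeros_like(saliency)
    return np.where(saliency > threshold, 255, 0).astype(np.uint8)


demo = np.array([[0.1, 0.2], [0.7, 0.9]])
mask = saliency_to_mask(demo, threshold=0.3)
print(mask)
```

A lower threshold keeps more uncertain pixels as foreground, while a higher one yields tighter masks.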
Supported Categories
breakfast_box
screw_bag
juice_bottle
splicing_connectors
pushpins
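Selecting images by category and split could look like the sketch below. It assumes the dataset is laid out as `<root>/<category>/<split>/<subfolder>/*.png`. The helper `iter_images` and the split names are hypothetical and are not part of this repository.

```python
from pathlib import Path

CATEGORIES = ("breakfast_box", "screw_bag", "juice_bottle",
              "splicing_connectors", "pushpins")


def iter_images(root, categories=CATEGORIES, splits=("train", "validation", "test")):
    """Yield (category, split, path) for every PNG under the assumed layout."""
    root = Path(root)
    for category in categories:
        for split in splits:
            split_dir = root / category / split
            if not split_dir.is_dir():
                continue  # Skip missing folders instead of failing.
            # Images are assumed to sit one level down, e.g. test/good/000.png.
            for path in sorted(split_dir.glob("*/*.png")):
                yield category, split, path
```

For example, `iter_images("/path/to/mvtec_loco", categories=["pushpins"], splits=["test"])` would yield only the test images for `pushpins`, where the root path is a placeholder.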
Performance
- GPU Processing: ~2-3 seconds per image
- CPU Processing: ~10-15 seconds per image
- Memory Usage: ~200MB GPU memory per image
- Total Dataset: ~5000+ images
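As a rough worked example, the per-image timings above translate into total runtime as follows. The sketch takes the figure of about 5,000 images at face value and assumes images are processed one at a time with no batching. The helper name `total_hours` is illustrative.

```python
def total_hours(num_images, seconds_per_image):
    """Total wall-clock hours for sequential, one-image-at-a-time processing."""
    return num_images * seconds_per_image / 3600


NUM_IMAGES = 5000  # approximate dataset size quoted above

gpu_range = (total_hours(NUM_IMAGES, 2), total_hours(NUM_IMAGES, 3))
cpu_range = (total_hours(NUM_IMAGES, 10), total_hours(NUM_IMAGES, 15))

print(f"GPU: {gpu_range[0]:.1f}-{gpu_range[1]:.1f} h")  # GPU: 2.8-4.2 h
print(f"CPU: {cpu_range[0]:.1f}-{cpu_range[1]:.1f} h")  # CPU: 13.9-20.8 h
```

Under these assumptions, a full pass takes a few hours on a GPU and most of a day on a CPU, so restricting `--categories` or `--splits` is worthwhile for quick experiments.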
Citation
@article{Qin_2020_PR,
title = {U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection},
author = {Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin},
journal = {Pattern Recognition},
volume = {106},
pages = {107404},
year = {2020}
}
License
Apache-2.0 License