title: U²-Net MVTec LOCO Foreground Segmentation
tags:
  - computer-vision
  - image-segmentation
  - anomaly-detection
  - u2net
  - mvtec-loco
  - pytorch
license: apache-2.0
language: en
library_name: pytorch

U²-Net MVTec LOCO Foreground Segmentation

This repository contains a tool that generates binary foreground masks for the images of the MVTec LOCO anomaly detection dataset using a pre-trained U²-Net.

🚀 Quick Start

Installation

# Clone from HuggingFace
git clone https://huggingface.co/zhiqing0205/u2net-mvtec-loco-segmentation
cd u2net-mvtec-loco-segmentation

# Install dependencies
pip install torch torchvision opencv-python scikit-image matplotlib numpy pillow huggingface_hub

# Run segmentation (model auto-downloads)
python mvtec_loco_fg_segmentation.py

Download Options

Option 1: Auto-download (Recommended)

from download_from_hf import download_complete_repo, download_u2net_model

# Download the model weights only
download_u2net_model()

# Or download the complete repository
download_complete_repo()

Option 2: Manual download

python download_from_hf.py --model-only
# or
python download_from_hf.py --complete-repo
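
As an alternative to the bundled helper, the weights can presumably be fetched directly with `huggingface_hub`. This is a minimal sketch, not the code in `download_from_hf.py`: it assumes the checkpoint is stored at `saved_models/u2net/u2net.pth` inside this repository, as the contents listing suggests, and `fetch_u2net_weights` is a hypothetical name chosen for illustration.

```python
REPO_ID = "zhiqing0205/u2net-mvtec-loco-segmentation"
# Assumed location of the checkpoint inside the repo (taken from the contents listing).
WEIGHTS_FILE = "saved_models/u2net/u2net.pth"


def fetch_u2net_weights(local_dir="."):
    """Download the checkpoint below local_dir and return its local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=WEIGHTS_FILE, local_dir=local_dir)
```

Calling `fetch_u2net_weights()` from the repository root should place the file under `saved_models/u2net/`, matching the layout shown in the repository contents.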

📁 Repository Contents

├── mvtec_loco_fg_segmentation.py  # Main segmentation script
├── download_from_hf.py            # HuggingFace download utility
├── model/                         # U2NET model architecture
├── data_loader.py                 # Data loading utilities
├── saved_models/
│   └── u2net/
│       └── u2net.pth             # Pre-trained U2NET weights (169MB)
├── README.md                      # English documentation
├── README_CN.md                   # Chinese documentation
└── ...

🎯 Features

Complete Dataset Processing: All MVTec LOCO categories
Binary Mask Output: Standard 0/255 masks in grayscale
GPU/CPU Support: Automatic hardware detection
Configurable Parameters: Threshold, categories, splits
Auto-download: No manual model download needed

💻 Usage

Basic Usage

python mvtec_loco_fg_segmentation.py

Advanced Usage

# Custom parameters
python mvtec_loco_fg_segmentation.py \
    --threshold 0.3 \
    --categories breakfast_box juice_bottle \
    --splits test

# Show all options
python mvtec_loco_fg_segmentation.py -h
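
To make the shape of these options concrete, here is a hedged sketch of how such a command line could be declared with `argparse`. The flag names come from the example above; the defaults, the `choices` lists, and the `build_parser` helper are assumptions for illustration and may differ from what `mvtec_loco_fg_segmentation.py` actually does.

```python
import argparse

# Category names are taken from the supported-categories list; the split names
# follow the MVTec LOCO folder layout. All defaults below are assumptions.
CATEGORIES = ["breakfast_box", "screw_bag", "juice_bottle", "splicing_connectors", "pushpins"]
SPLITS = ["train", "validation", "test"]


def build_parser():
    """Build a parser mirroring the flags shown in the usage example."""
    parser = argparse.ArgumentParser(description="U2-Net foreground masks for MVTec LOCO")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="cut-off applied to the saliency map, in [0, 1]")
    parser.add_argument("--categories", nargs="+", choices=CATEGORIES, default=CATEGORIES,
                        help="categories to process (default: all)")
    parser.add_argument("--splits", nargs="+", choices=SPLITS, default=SPLITS,
                        help="dataset splits to process (default: all)")
    return parser


# Parse the same arguments as the advanced-usage command above.
args = build_parser().parse_args(
    ["--threshold", "0.3", "--categories", "breakfast_box", "juice_bottle", "--splits", "test"]
)
print(args.threshold, args.categories, args.splits)
```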

📊 Model Information

Architecture: U²-Net (U Square Net)
Model Size: 169MB
Input Size: 320×320 (auto-resized)
Output: Binary masks (0/255)
Task: Salient object detection → Foreground segmentation
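
The step from salient object detection to foreground segmentation amounts to binarizing the predicted saliency map. The sketch below illustrates one plausible version of that step, assuming the map is min-max normalized to [0, 1] and then cut at the `--threshold` value; `saliency_to_mask` is a hypothetical helper, not a function from this repository.

```python
import numpy as np


def saliency_to_mask(saliency, threshold=0.5):
    """Turn a float saliency map into a uint8 mask holding only 0 and 255."""
    saliency = np.asarray(saliency, dtype=np.float32)
    lo, hi = float(saliency.min()), float(saliency.max())
    # Min-max normalize to [0, 1]; a constant map carries no foreground signal.
    if hi > lo:
        normalized = (saliency - lo) / (hi - lo)
    else:
        normalized = np.zeros_like(saliency)
    # Pixels above the threshold become foreground (255), the rest background (0).
    return np.where(normalized > threshold, 255, 0).astype(np.uint8)


demo = np.array([[0.1, 0.2], [0.6, 0.9]])
mask = saliency_to_mask(demo, threshold=0.3)
print(mask)
```

A lower threshold keeps more low-confidence pixels as foreground, which is presumably why the advanced-usage example passes `--threshold 0.3`.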

🏷️ Supported Categories

breakfast_box
screw_bag
juice_bottle
splicing_connectors
pushpins
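
Selecting categories and splits comes down to walking the dataset folders. The following sketch assumes the standard MVTec LOCO layout of `<root>/<category>/<split>/...` with PNG images; `list_images` is a hypothetical helper for illustration and is not taken from `data_loader.py`.

```python
from pathlib import Path

CATEGORIES = ("breakfast_box", "screw_bag", "juice_bottle", "splicing_connectors", "pushpins")


def list_images(dataset_root, categories=CATEGORIES, splits=("train", "validation", "test")):
    """Yield (category, split, path) for every PNG under the selected folders."""
    root = Path(dataset_root)
    for category in categories:
        for split in splits:
            split_dir = root / category / split
            if not split_dir.is_dir():
                continue  # skip categories or splits that are not present
            # rglob also covers nested folders such as test/logical_anomalies.
            for path in sorted(split_dir.rglob("*.png")):
                yield category, split, path
```

For example, `list(list_images("mvtec_loco", categories=["juice_bottle"], splits=["test"]))` would collect only the test images of one category, where `mvtec_loco` stands for wherever the dataset was unpacked.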

📈 Performance

GPU Processing: ~2-3 seconds per image
CPU Processing: ~10-15 seconds per image
Memory Usage: ~200MB GPU memory per image
Total Dataset: ~5000+ images
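
Taken at face value, these figures imply a multi-hour run over the whole dataset. The short calculation below works that out under the assumption of exactly 5000 images processed one after another; `estimate_hours` is a hypothetical helper, and real timings will depend on hardware and image size.

```python
def estimate_hours(num_images, seconds_per_image):
    """Total sequential processing time in hours."""
    return num_images * seconds_per_image / 3600


NUM_IMAGES = 5000  # rounded figure quoted in the performance list
PER_IMAGE_SECONDS = {"GPU": (2, 3), "CPU": (10, 15)}  # ranges quoted above

for device, (fast, slow) in PER_IMAGE_SECONDS.items():
    low = estimate_hours(NUM_IMAGES, fast)
    high = estimate_hours(NUM_IMAGES, slow)
    print(f"{device}: {low:.1f}-{high:.1f} h")
# GPU: 2.8-4.2 h
# CPU: 13.9-20.8 h
```

Restricting a run with `--categories` or `--splits` shortens it proportionally to the number of images selected.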

📖 Citation

@article{Qin_2020_PR,
  title = {U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection},
  author = {Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin},
  journal = {Pattern Recognition},
  volume = {106},
  pages = {107404},
  year = {2020}
}

📜 License

Apache-2.0 License
