	---
	title: U²-Net MVTec LOCO Foreground Segmentation
	tags:
	- computer-vision
	- image-segmentation
	- anomaly-detection
	- u2net
	- mvtec-loco
	- pytorch
	license: apache-2.0
	language: en
	library_name: pytorch
	---

	# U²-Net MVTec LOCO Foreground Segmentation

	This repository provides a complete tool that uses U²-Net to generate binary foreground masks for images in the MVTec LOCO anomaly detection dataset.

	## 🚀 Quick Start

	### Installation
	```bash
	# Clone from HuggingFace
	git clone https://huggingface.co/zhiqing0205/u2net-mvtec-loco-segmentation
	cd u2net-mvtec-loco-segmentation

	# Install dependencies
	pip install torch torchvision opencv-python scikit-image matplotlib numpy pillow huggingface_hub

	# Run segmentation (model auto-downloads)
	python mvtec_loco_fg_segmentation.py
	```

	### Download Options

	Option 1: Auto-download from Python (Recommended)
	```python
	from download_from_hf import download_complete_repo, download_u2net_model

	# Download the model weights only
	download_u2net_model()

	# Or download the complete repository
	download_complete_repo()
	```

	Option 2: Manual download from the command line
	```bash
	python download_from_hf.py --model-only
	# or
	python download_from_hf.py --complete-repo
	```
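
For readers who prefer to call `huggingface_hub` directly, the following is a minimal sketch of what a "download only if missing" step could look like. The `ensure_weights` helper is hypothetical and is not part of `download_from_hf.py`, whose internals are not shown here; the weights path is taken from the repository tree in this README, and `hf_hub_download` is the standard `huggingface_hub` call.

```python
from pathlib import Path

REPO_ID = "zhiqing0205/u2net-mvtec-loco-segmentation"
# Path of the weights inside the repo, per the repository tree in this README.
WEIGHTS_FILE = "saved_models/u2net/u2net.pth"


def ensure_weights(root: str = ".") -> Path:
    """Return the local weights path, downloading the file only if it is missing.

    Hypothetical helper for illustration; download_from_hf.py may work differently.
    """
    target = Path(root) / WEIGHTS_FILE
    if target.is_file():
        return target  # already present, no network access needed
    # Imported lazily so the existence check works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    # With local_dir set, the file keeps its repo-relative path under root.
    downloaded = hf_hub_download(repo_id=REPO_ID, filename=WEIGHTS_FILE, local_dir=root)
    return Path(downloaded)
```

Calling `ensure_weights()` from the repository root should leave the weights at `saved_models/u2net/u2net.pth`, where the segmentation script presumably expects them.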

	## 📁 Repository Contents

	```
	├── mvtec_loco_fg_segmentation.py   # Main segmentation script
	├── download_from_hf.py             # HuggingFace download utility
	├── model/                          # U²-Net model architecture
	├── data_loader.py                  # Data loading utilities
	├── saved_models/
	│   └── u2net/
	│       └── u2net.pth               # Pre-trained U²-Net weights (169MB)
	├── README.md                       # English documentation
	├── README_CN.md                    # Chinese documentation
	└── ...
	```

	## 🎯 Features

	- Complete Dataset Processing: All MVTec LOCO categories
	- Binary Mask Output: Standard 0/255 masks in grayscale
	- GPU/CPU Support: Automatic hardware detection
	- Configurable Parameters: Threshold, categories, splits
	- Auto-download: No manual model download needed

	## 💻 Usage

	### Basic Usage
	```bash
	python mvtec_loco_fg_segmentation.py
	```

	### Advanced Usage
	```bash
	# Custom parameters
	python mvtec_loco_fg_segmentation.py \
	--threshold 0.3 \
	--categories breakfast_box juice_bottle \
	--splits test

	# Show all options
	python mvtec_loco_fg_segmentation.py -h
	```
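
To make the flags above concrete, here is a hedged sketch of an `argparse` parser that would accept them. It is a reconstruction for illustration only: the flag names come from the commands above, while `build_parser`, the default threshold of 0.5, and the default split list are assumptions. Run the script with `-h` for the authoritative options.

```python
import argparse

CATEGORIES = ["breakfast_box", "screw_bag", "juice_bottle", "splicing_connectors", "pushpins"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown above; defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Foreground masks for MVTec LOCO")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="saliency cutoff in [0, 1] (assumed default)")
    parser.add_argument("--categories", nargs="+", choices=CATEGORIES, default=CATEGORIES,
                        help="categories to process")
    parser.add_argument("--splits", nargs="+", default=["train", "validation", "test"],
                        help="dataset splits to process (assumed default)")
    return parser


# Parse the same arguments as the advanced usage example.
args = build_parser().parse_args(
    ["--threshold", "0.3", "--categories", "breakfast_box", "juice_bottle", "--splits", "test"]
)
print(args.threshold, args.categories, args.splits)
```

Using `nargs="+"` lets each flag take one or more space-separated values, which matches how `--categories breakfast_box juice_bottle` is written above.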

	## 📊 Model Information

	- Architecture: U²-Net (U Square Net)
	- Model Size: 169MB
	- Input Size: 320×320 (auto-resized)
	- Output: Binary masks (0/255)
	- Task: Salient object detection → Foreground segmentation
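
The step from saliency map to binary mask can be sketched as follows. This is a minimal illustration, not the script's actual code: `saliency_to_mask` is a hypothetical name, and both the min-max normalization and the default threshold of 0.5 are assumptions about how the prediction is post-processed.

```python
import numpy as np


def saliency_to_mask(saliency: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn a float saliency map into a 0/255 uint8 mask.

    Assumes min-max normalization to [0, 1] before thresholding.
    """
    lo, hi = float(saliency.min()), float(saliency.max())
    if hi > lo:
        norm = (saliency - lo) / (hi - lo)
    else:
        # A flat map carries no foreground signal, so return an empty mask.
        norm = np.zeros_like(saliency, dtype=np.float32)
    return np.where(norm > threshold, 255, 0).astype(np.uint8)


saliency = np.array([[0.1, 0.2], [0.6, 0.9]], dtype=np.float32)
mask = saliency_to_mask(saliency, threshold=0.3)
print(mask)
```

A lower threshold keeps more pixels as foreground, which is why the threshold is exposed as a parameter.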

	## 🏷️ Supported Categories

	- `breakfast_box`
	- `screw_bag`
	- `juice_bottle`
	- `splicing_connectors`
	- `pushpins`
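
As a rough illustration of how these categories might be traversed, the sketch below walks a local copy of the dataset. It assumes the layout `<root>/<category>/<split>/<subfolder>/<name>.png` with splits `train`, `validation`, and `test`; `list_images` is a hypothetical helper, and the actual script may discover files differently.

```python
from pathlib import Path

CATEGORIES = ["breakfast_box", "screw_bag", "juice_bottle", "splicing_connectors", "pushpins"]
SPLITS = ["train", "validation", "test"]


def list_images(dataset_root, categories=CATEGORIES, splits=SPLITS):
    """Yield (category, split, image_path) for every PNG under the assumed layout.

    Assumed layout: <root>/<category>/<split>/<subfolder>/<name>.png
    """
    root = Path(dataset_root)
    for category in categories:
        for split in splits:
            split_dir = root / category / split
            if not split_dir.is_dir():
                continue  # tolerate partial downloads
            # Sorted for a deterministic processing order.
            for image_path in sorted(split_dir.glob("*/*.png")):
                yield category, split, image_path
```

Because only the listed splits are visited, sibling folders such as ground-truth annotations would be skipped under this layout.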

	## 📈 Performance

	- GPU Processing: ~2-3 seconds per image
	- CPU Processing: ~10-15 seconds per image
	- Memory Usage: ~200MB GPU memory per image
	- Total Dataset: ~5000 images
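
Taken at face value, the figures above imply a full-dataset run of roughly 3 to 4 hours on GPU and 14 to 21 hours on CPU. The short calculation below makes that arithmetic explicit; the image count of 5000 is the rough figure quoted above, and `total_hours` is a hypothetical helper.

```python
def total_hours(num_images: int, seconds_per_image: float) -> float:
    """Total wall-clock time in hours for sequential processing."""
    return num_images * seconds_per_image / 3600


NUM_IMAGES = 5000  # rough dataset size quoted above
PER_IMAGE_SECONDS = {"GPU": (2, 3), "CPU": (10, 15)}  # ranges quoted above

for device, (fast, slow) in PER_IMAGE_SECONDS.items():
    low, high = total_hours(NUM_IMAGES, fast), total_hours(NUM_IMAGES, slow)
    print(f"{device}: {low:.1f} to {high:.1f} hours")
```

Restricting the run with `--categories` or `--splits` shortens the total proportionally to the number of images selected.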

	## 📖 Citation

	```bibtex
	@article{Qin_2020_PR,
	title = {U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection},
	author = {Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin},
	journal = {Pattern Recognition},
	volume = {106},
	pages = {107404},
	year = {2020}
	}
	```

	## 📜 License

	Apache-2.0 License

	## 🔗 Links

	- [Original U²-Net Paper](https://arxiv.org/pdf/2005.09007.pdf)
	- [MVTec LOCO Dataset](https://www.mvtec.com/company/research/datasets/mvtec-loco)
	- [Original U²-Net GitHub Repository](https://github.com/NathanUA/U-2-Net)