	# Magic Crop ✨🪄

	An AI-powered image cropping tool that removes unwanted elements such as watermarks while preserving as many pixels as possible. It uses the Florence-2-large vision-language model to detect objects and decide where to crop.

	## 🌟 Features

	- AI-Powered Detection: Uses Florence-2-large model for accurate object detection
	- Smart Cropping: Automatically determines the best crop direction (above, below, left, or right)
	- Object-Aware Cropping: Optional feature to avoid cropping other important objects
	- Batch Processing: Process multiple images efficiently with configurable batch sizes
	- Recursive Folder Processing: Handle entire directory structures
	- Customizable Prompts: Detect any object type, not just watermarks
	- Crop Threshold Protection: Skip images that would require excessive cropping
	- Error Handling: Failed or skipped images do not stop a run, and can optionally be copied into `_Errored_` and `_Skipped_` folders
	- GPU Acceleration: Automatic CUDA detection for faster processing


	## 📋 Requirements

	- Python 3.7+
	- PyTorch
	- Transformers
	- PIL (Pillow)
	- tqdm
	- Florence-2-large model (local model provided)


	## 🛠️ Installation

	1. Install Python dependencies:
	```bash
	pip install -r requirements.txt
	```

	2. Download Florence-2-large model:
	- Place the Florence-2-large model files in a `./Florence-2-large/` directory next to the script (for example, downloaded with `huggingface-cli`)
	- The script expects local model files (`local_files_only=True`)
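
A minimal sketch of what a local-only load could look like, assuming the Hugging Face `transformers` Auto classes. The helper names `model_dir_ready` and `load_florence` are hypothetical and `crop.py` may do this differently. Florence-2 ships custom model code, so `trust_remote_code=True` is needed when loading it.

```python
from pathlib import Path

MODEL_DIR = Path("./Florence-2-large")  # directory named in this README


def model_dir_ready(model_dir: Path = MODEL_DIR) -> bool:
    """Return True if the directory looks like a downloaded model (has config.json)."""
    return (model_dir / "config.json").is_file()


def load_florence(model_dir: Path = MODEL_DIR):
    """Load model and processor strictly from local files (hypothetical helper)."""
    # Imported lazily so the directory check works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    if not model_dir_ready(model_dir):
        raise FileNotFoundError(f"No model files found in {model_dir}")

    # Use CUDA when available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(
        str(model_dir), local_files_only=True, trust_remote_code=True
    ).to(device)
    processor = AutoProcessor.from_pretrained(
        str(model_dir), local_files_only=True, trust_remote_code=True
    )
    return model, processor, device
```

Checking the directory first gives a clearer error than the one raised when `local_files_only=True` cannot find the files.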

	## 🚀 Usage

	### Basic Usage

	```bash
	python crop.py input_image.jpg --prompt "watermark"
	```


	### Process Multiple Images

	```bash
	python crop.py image1.jpg image2.jpg image3.jpg -o output_folder
	```


	### Process Folders Recursively

	```bash
	python crop.py /path/to/images/ -r -o /path/to/output/
	```


	### Advanced Usage with Object-Aware Cropping

	```bash
	python crop.py /path/to/images/ -r -o output/ --object-aware --crop-threshold 15 --prompt "logo"
	```


	## 📝 Command-Line Arguments

	| Argument | Type | Default | Description |
	| :-- | :-- | :-- | :-- |
	| `input_paths` | str+ | Required | Paths to input images or folders |
	| `-r, --recursive` | flag | False | Process folders recursively |
	| `-o, --output_folder` | str | Input path | Output folder path |
	| `--bs` | int | 1 | Batch size for processing |
	| `--prompt` | str | "Watermark" | Object detection prompt |
	| `--object-aware` | flag | False | Enable object-aware cropping |
	| `--crop-threshold` | float | 20.0 | Max crop area percentage threshold |
	| `--move-skipped` | flag | False | Copy skipped files to '_Skipped_' folder |
	| `--move-errored` | flag | False | Copy errored files to '_Errored_' folder |
	| `--debug` | flag | False | Enable debug output |
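
For reference, the table above maps onto an `argparse` parser roughly as follows. This is an illustrative reconstruction from the table, not the parser in `crop.py`; the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(description="Magic Crop (illustrative parser)")
    parser.add_argument("input_paths", nargs="+", help="Paths to input images or folders")
    parser.add_argument("-r", "--recursive", action="store_true", help="Process folders recursively")
    parser.add_argument("-o", "--output_folder", default=None, help="Output folder (falls back to the input path)")
    parser.add_argument("--bs", type=int, default=1, help="Batch size for processing")
    parser.add_argument("--prompt", default="Watermark", help="Object detection prompt")
    parser.add_argument("--object-aware", action="store_true", help="Enable object-aware cropping")
    parser.add_argument("--crop-threshold", type=float, default=20.0, help="Max crop area percentage")
    parser.add_argument("--move-skipped", action="store_true", help="Copy skipped files to '_Skipped_'")
    parser.add_argument("--move-errored", action="store_true", help="Copy errored files to '_Errored_'")
    parser.add_argument("--debug", action="store_true", help="Enable debug output")
    return parser


args = build_parser().parse_args(["images/", "-r", "--crop-threshold", "15"])
print(args.recursive, args.crop_threshold, args.prompt)  # True 15.0 Watermark
```

Note that `argparse` exposes dashed flags with underscores, so `--crop-threshold` is read as `args.crop_threshold`.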

	## 🎯 How It Works

	### 1. Object Detection

	The script uses Florence-2-large to detect objects matching your prompt in the image.
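
As a rough sketch, a single detection query against Florence-2 could look like the code below. It follows the usage pattern from the Florence-2 model card; the choice of the `<CAPTION_TO_PHRASE_GROUNDING>` task and the helper names `detect` and `extract_boxes` are assumptions, and `crop.py` may use a different task or post-processing.

```python
def extract_boxes(parsed: dict, task: str) -> list:
    """Pull (x1, y1, x2, y2) boxes out of a post-processed Florence-2 result."""
    return [tuple(box) for box in parsed.get(task, {}).get("bboxes", [])]


def detect(model, processor, image, prompt: str, device: str = "cpu",
           task: str = "<CAPTION_TO_PHRASE_GROUNDING>") -> list:
    """Run one grounding query and return pixel-space boxes (illustrative)."""
    # The task token and the user prompt are concatenated into one text input.
    inputs = processor(text=task + prompt, images=image, return_tensors="pt").to(device)
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    # Special tokens carry the box coordinates, so they must be kept when decoding.
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed = processor.post_process_generation(
        text, task=task, image_size=(image.width, image.height)
    )
    return extract_boxes(parsed, task)
```

Here `image` is expected to be a `PIL.Image` and the returned boxes are in pixel coordinates of that image.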

	### 2. Smart Cropping Logic

	- Corner Distance Priority: Prefers objects closer to image corners
	- Size Consideration: Among corner objects, prefers smaller ones
	- Crop Direction: Calculates crop area for all four directions (above, below, left, right)
	- Maximum Preservation: Chooses the direction that preserves the most pixels


	### 3. Object-Aware Mode

	When enabled, the script:

	- Detects all objects in the image
	- Calculates how many object pixels would be lost with each crop direction
	- Chooses the crop that minimizes object pixel loss
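
A minimal sketch of the object-aware choice, under the same assumptions as before (pixel boxes as `(x1, y1, x2, y2)`, and "crop left" meaning the region left of the target is kept). Object loss is summed per box, so overlapping objects are counted more than once. The function names are hypothetical.

```python
def area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is empty."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def overlap(a, b):
    """Intersection box of a and b (may be empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def object_aware_crop(target, object_boxes, width, height):
    """Pick the direction that loses the fewest object pixels."""
    x1, y1, x2, y2 = target
    candidates = {
        "above": (0, 0, width, y1),
        "below": (0, y2, width, height),
        "left": (0, 0, x1, height),
        "right": (x2, 0, width, height),
    }

    def lost(keep):
        # Pixels of each object that fall outside the kept region.
        return sum(area(b) - area(overlap(keep, b)) for b in object_boxes)

    # Ties on object loss are broken by keeping the larger region.
    direction = min(candidates, key=lambda d: (lost(candidates[d]), -area(candidates[d])))
    return direction, candidates[direction]


watermark = (800, 720, 980, 780)
other_objects = [(100, 700, 400, 790)]
print(object_aware_crop(watermark, other_objects, 1000, 800))  # ('left', (0, 0, 800, 800))
```

In this example, cropping above the watermark would keep more pixels overall but would cut through the second object, so the sketch chooses "left" instead.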


	### 4. Safety Thresholds

	- Skips images where the chosen crop would remove more than `--crop-threshold` percent of the total image area
	- Default threshold: 20% of total image area
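
The threshold check amounts to simple area arithmetic. The sketch below assumes the percentage is measured as removed area over total image area, as the list above suggests; `crop_percentage` and `should_skip` are hypothetical names.

```python
def crop_percentage(keep_box, width, height):
    """Percentage of the original image area removed when only keep_box remains."""
    left, top, right, bottom = keep_box
    kept = max(0, right - left) * max(0, bottom - top)
    total = width * height
    return 100.0 * (total - kept) / total


def should_skip(keep_box, width, height, threshold=20.0):
    """True if the crop removes more than `threshold` percent of the image."""
    return crop_percentage(keep_box, width, height) > threshold


print(crop_percentage((0, 0, 1000, 720), 1000, 800))  # 10.0
```

With the default threshold of 20, this 10% crop would be applied; with `--crop-threshold 5` the image would be skipped.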


	## 📁 Output Structure

	```
	output_folder/
	├── original_image_crop_above.jpg
	├── another_image_crop_left.jpg
	├── _Skipped_/    # (if --move-skipped enabled)
	│   └── skipped_files...
	└── _Errored_/    # (if --move-errored enabled)
	    └── errored_files...
	```
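
The naming pattern in the tree above can be reproduced with `pathlib`. This sketch assumes outputs are written flat into the output folder with a `.jpg` extension; `output_path` and `save_crop` are hypothetical helpers, and `save_crop` expects a Pillow `Image` object.

```python
from pathlib import Path


def output_path(src, output_folder, direction):
    """Build '<stem>_crop_<direction>.jpg' inside the output folder."""
    return Path(output_folder) / f"{Path(src).stem}_crop_{direction}.jpg"


def save_crop(image, keep_box, path, quality=98):
    """Crop a Pillow image to keep_box and save it as JPEG."""
    # JPEG has no alpha channel, so convert (for example from RGBA PNGs) first.
    image.crop(keep_box).convert("RGB").save(path, format="JPEG", quality=quality)


print(output_path("photos/original_image.png", "output_folder", "above"))
```

On POSIX systems the example prints `output_folder/original_image_crop_above.jpg`.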


	## 💡 Examples

	### Remove Watermarks

	```bash
	python crop.py watermarked_images/ -r -o clean_images/ --prompt "watermark"
	```


	### Remove Logos with Object Protection

	```bash
	python crop.py branded_images/ -r --object-aware --prompt "logo" --crop-threshold 10
	```


	### Process with Error Organization

	```bash
	python crop.py images/ -r --move-skipped --move-errored --debug
	```


	### High-Performance Batch Processing

	```bash
	python crop.py large_dataset/ -r --bs 4 -o processed/ --crop-threshold 25
	```


	## ⚠️ Important Notes

	- Model Requirement: Ensure Florence-2-large model is properly installed in `./Florence-2-large/`
	- Memory Usage: Larger batch sizes require more GPU/CPU memory
	- Quality: Output images are saved as JPEG at quality 98
	- File Formats: Supports PNG, JPG, JPEG, GIF, BMP input formats
	- GPU Recommended: CUDA-capable GPU significantly speeds up processing
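
To illustrate how the supported formats and the `-r` flag could interact, here is a small sketch of input collection. The extension set is taken from the list above; the function name `collect_images` and the exact matching rules (case-insensitive suffixes, sorted results) are assumptions.

```python
import tempfile
from pathlib import Path

SUPPORTED = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}


def collect_images(input_paths, recursive=False):
    """Expand files and folders into a list of supported image paths."""
    found = []
    for raw in input_paths:
        path = Path(raw)
        if path.is_dir():
            # rglob descends into subfolders, glob stays at the top level.
            walker = path.rglob("*") if recursive else path.glob("*")
            found.extend(sorted(p for p in walker
                                if p.is_file() and p.suffix.lower() in SUPPORTED))
        elif path.is_file() and path.suffix.lower() in SUPPORTED:
            found.append(path)
    return found


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ("a.jpg", "notes.txt", "sub/b.PNG"):
        (root / name).touch()
    flat = [p.name for p in collect_images([root])]
    deep = [p.name for p in collect_images([root], recursive=True)]

print(flat, deep)  # ['a.jpg'] ['a.jpg', 'b.PNG']
```

Without `recursive=True` only the top-level image is found; the text file is ignored in both cases.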


	## 🐛 Troubleshooting

	### Common Issues

	1. Model Loading Error: Ensure Florence-2-large is in the correct directory
	2. CUDA Out of Memory: Reduce batch size (`--bs 1`)
	3. No Objects Detected: Try different prompts or check image quality
	4. Large Crop Areas: Adjust `--crop-threshold` value

	### Debug Mode

	Use `--debug` flag to see:

	- Detailed object detection results
	- Crop calculations
	- Processing decisions


	## 📊 Performance Tips

	- Use GPU for faster processing
	- Increase batch size for bulk processing (if memory allows)
	- Enable `--object-aware` only when necessary (slower but more accurate)
	- Use appropriate crop thresholds to avoid processing unsuitable images
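
Batching with `--bs` comes down to splitting the list of inputs into chunks, each of which is sent through the model together. A minimal sketch, with the hypothetical helper name `batched`:

```python
def batched(items, batch_size):
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
print(list(batched(paths, 2)))  # [['a.jpg', 'b.jpg'], ['c.jpg', 'd.jpg'], ['e.jpg']]
```

Memory use grows with the chunk size, which is why dropping back to `--bs 1` is the usual fix for out-of-memory errors.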


	## 🤝 Contributing

	Feel free to submit issues, feature requests, or pull requests to improve this tool!

	## 📄 License

	This project uses the Florence-2-large model. Please check the model's license terms for usage restrictions.
