Magic Crop ✨🪄

An AI-powered image cropping tool that removes unwanted elements such as watermarks while preserving as many pixels as possible. It uses the Florence-2-large vision-language model to detect the target object and then picks the crop that keeps the largest part of the image.

🌟 Features

AI-Powered Detection: Uses Florence-2-large model for accurate object detection
Smart Cropping: Automatically determines the best crop direction (above, below, left, or right)
Object-Aware Cropping: Optional feature to avoid cropping other important objects
Batch Processing: Process multiple images efficiently with configurable batch sizes
Recursive Folder Processing: Handle entire directory structures
Customizable Prompts: Detect any object type, not just watermarks
Crop Threshold Protection: Skip images that would require excessive cropping
Error Handling: Images that fail or are skipped can optionally be copied into separate folders (--move-errored, --move-skipped)
GPU Acceleration: Automatic CUDA detection for faster processing

📋 Requirements

Python 3.7+
PyTorch
Transformers
PIL (Pillow)
tqdm
Florence-2-large model (loaded from a local directory, see Installation)

🛠️ Installation

Install Python dependencies:

pip install -r requirements.txt

Download Florence-2-large model:
- Place the Florence-2-large model files in a ./Florence-2-large/ directory relative to the script (the files can be downloaded with huggingface-cli)
- The script expects local model files (local_files_only=True)
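
As a rough illustration of what local-only loading looks like with Transformers, the sketch below checks that the model directory is populated and then loads from it without touching the network. The helper names (`model_dir_ready`, `load_florence`) are hypothetical and not taken from crop.py, and checking for `config.json` is only a simple heuristic for "the files are there".

```python
from pathlib import Path

MODEL_DIR = Path("./Florence-2-large")  # directory named in this README


def model_dir_ready(model_dir: Path = MODEL_DIR) -> bool:
    """Heuristic check: a Hugging Face model folder contains a config.json."""
    return (model_dir / "config.json").is_file()


def load_florence(model_dir: Path = MODEL_DIR):
    """Load model and processor from local files only (no download attempt)."""
    # Imported lazily so the readiness check works without PyTorch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    if not model_dir_ready(model_dir):
        raise FileNotFoundError(f"No Florence-2 files found in {model_dir}")

    # Use CUDA when available, otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(
        str(model_dir), local_files_only=True, trust_remote_code=True
    ).to(device).eval()
    processor = AutoProcessor.from_pretrained(
        str(model_dir), local_files_only=True, trust_remote_code=True
    )
    return model, processor, device
```

Calling `model_dir_ready()` before a long batch run gives a clearer failure than a loading traceback if the folder is missing or empty.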

🚀 Usage

Basic Usage

python crop.py input_image.jpg --prompt "watermark"

Process Multiple Images

python crop.py image1.jpg image2.jpg image3.jpg -o output_folder

Process Folders Recursively

python crop.py /path/to/images/ -r -o /path/to/output/

Advanced Usage with Object-Aware Cropping

python crop.py /path/to/images/ -r -o output/ --object-aware --crop-threshold 15 --prompt "logo"

📝 Command-Line Arguments

Argument	Type	Default	Description
`input_paths`	str+	Required	Paths to input images or folders
`-r, --recursive`	flag	False	Process folders recursively
`-o, --output_folder`	str	Input path	Output folder path
`--bs`	int	1	Batch size for processing
`--prompt`	str	"Watermark"	Object detection prompt
`--object-aware`	flag	False	Enable object-aware cropping
`--crop-threshold`	float	20.0	Max crop area percentage threshold
`--move-skipped`	flag	False	Copy skipped files to 'Skipped' folder
`--move-errored`	flag	False	Copy errored files to 'Errored' folder
`--debug`	flag	False	Enable debug output
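
For readers who want to see how the table maps onto code, here is a minimal argparse sketch that mirrors the arguments and defaults listed above. It is an illustration built from the table, not a copy of the parser in crop.py; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the argument table in this README."""
    parser = argparse.ArgumentParser(description="Magic Crop (illustrative parser)")
    parser.add_argument("input_paths", nargs="+", help="Paths to input images or folders")
    parser.add_argument("-r", "--recursive", action="store_true", help="Process folders recursively")
    # None stands in for "use the input path" as the default output location.
    parser.add_argument("-o", "--output_folder", default=None, help="Output folder path")
    parser.add_argument("--bs", type=int, default=1, help="Batch size for processing")
    parser.add_argument("--prompt", default="Watermark", help="Object detection prompt")
    parser.add_argument("--object-aware", action="store_true", help="Enable object-aware cropping")
    parser.add_argument("--crop-threshold", type=float, default=20.0, help="Max crop area percentage")
    parser.add_argument("--move-skipped", action="store_true", help="Copy skipped files to a folder")
    parser.add_argument("--move-errored", action="store_true", help="Copy errored files to a folder")
    parser.add_argument("--debug", action="store_true", help="Enable debug output")
    return parser


args = build_parser().parse_args(["images/", "-r", "--object-aware", "--crop-threshold", "15"])
print(args.recursive, args.object_aware, args.crop_threshold, args.bs)
```

Note that argparse exposes hyphenated flags with underscores, so `--crop-threshold` is read as `args.crop_threshold`.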

🎯 How It Works

1. Object Detection

The script uses Florence-2-large to detect objects matching your prompt in the image.
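
The detection step can be sketched roughly as follows. The call pattern follows the usage published with the Florence-2 model, but the task token is an assumption: this README does not say which Florence-2 task crop.py uses, so `<CAPTION_TO_PHRASE_GROUNDING>` is picked here only because it accepts a free-text prompt. The function names are hypothetical, and `model`, `processor`, and `device` are assumed to be already loaded.

```python
# Assumption: the task token actually used by crop.py is not documented here.
TASK = "<CAPTION_TO_PHRASE_GROUNDING>"


def extract_boxes(parsed: dict, task: str = TASK) -> list:
    """Pull (x1, y1, x2, y2) tuples out of Florence-2's post-processed output."""
    result = parsed.get(task, {})
    return [tuple(box) for box in result.get("bboxes", [])]


def detect(image, prompt: str, model, processor, device, task: str = TASK) -> list:
    """Run one PIL image through Florence-2 and return boxes for `prompt`."""
    # Florence-2 expects the task token followed by the text prompt.
    inputs = processor(text=task + prompt, images=image, return_tensors="pt").to(device)
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
        do_sample=False,
    )
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # post_process_generation converts location tokens into pixel coordinates.
    parsed = processor.post_process_generation(
        text, task=task, image_size=(image.width, image.height)
    )
    return extract_boxes(parsed, task)
```

An empty list from `detect` corresponds to the "No Objects Detected" case in Troubleshooting.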

2. Smart Cropping Logic

Corner Distance Priority: Prefers objects closer to image corners
Size Consideration: Among corner objects, prefers smaller ones
Crop Direction: Calculates crop area for all four directions (above, below, left, right)
Maximum Preservation: Chooses the direction that preserves the most pixels
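
One plausible reading of the four rules above is sketched below. Two things are assumptions rather than facts about crop.py: "corner distance" is measured from the box center to the nearest image corner, and a direction names the side of the object that is kept (so "above" keeps everything above the object's top edge). All function names are hypothetical.

```python
import math


def box_area(box):
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def corner_distance(box, width, height):
    """Distance from the box center to the nearest image corner."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    return min(math.hypot(cx - px, cy - py) for px, py in corners)


def pick_target(boxes, width, height):
    """Prefer the box nearest a corner; break ties with the smaller area."""
    return min(boxes, key=lambda b: (corner_distance(b, width, height), box_area(b)))


def kept_areas(width, height, box):
    """Pixels that survive when keeping only one side of the box."""
    x1, y1, x2, y2 = box
    return {
        "above": width * y1,
        "below": width * (height - y2),
        "left": x1 * height,
        "right": (width - x2) * height,
    }


def best_direction(width, height, box):
    """Return the direction preserving the most pixels and the percent removed."""
    areas = kept_areas(width, height, box)
    direction = max(areas, key=areas.get)
    total = width * height
    removed_pct = 100.0 * (total - areas[direction]) / total
    return direction, removed_pct


# A 1000x800 image with a watermark in the bottom-right corner.
boxes = [(400, 300, 600, 500), (820, 740, 990, 790)]
target = pick_target(boxes, 1000, 800)
print(best_direction(1000, 800, target))  # → ('above', 7.5)
```

In this example, keeping the region above the watermark preserves 740,000 of 800,000 pixels, which beats keeping the region to its left (656,000).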

3. Object-Aware Mode

When enabled, the script:

Detects all objects in the image
Calculates how many object pixels would be lost with each crop direction
Chooses the crop that minimizes object pixel loss
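
The three steps above might look like the following sketch. It assumes the other objects are available as bounding boxes, that a direction names the side of the target that is kept, and that ties in object loss are broken by the larger kept area; none of these details are confirmed by this README, and the function names are hypothetical. Overlapping object boxes are counted once per box.

```python
def area(box):
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def kept_region(width, height, target, direction):
    """Rectangle that remains after cropping the target out in one direction."""
    x1, y1, x2, y2 = target
    return {
        "above": (0, 0, width, y1),
        "below": (0, y2, width, height),
        "left": (0, 0, x1, height),
        "right": (x2, 0, width, height),
    }[direction]


def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def object_loss(width, height, target, others, direction):
    """Object pixels that fall outside the kept region for this direction."""
    region = kept_region(width, height, target, direction)
    return sum(area(o) - overlap_area(o, region) for o in others)


def object_aware_direction(width, height, target, others):
    """Minimize lost object pixels; on ties, keep the larger region."""
    def key(direction):
        region = kept_region(width, height, target, direction)
        return (object_loss(width, height, target, others, direction), -area(region))
    return min(("above", "below", "left", "right"), key=key)


# Watermark bottom-right; another object extends below the watermark's top edge.
target = (820, 740, 990, 790)
others = [(300, 600, 700, 780)]
print(object_aware_direction(1000, 800, target, others))  # → left
```

Here keeping the region above the watermark would cut 16,000 pixels off the other object, so the sketch keeps the region to the left instead, even though it preserves fewer pixels overall.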

4. Safety Thresholds

Skips images where the chosen crop would remove more than the threshold percentage of the image area (--crop-threshold)
Default threshold: 20% of total image area
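
The threshold check itself is simple arithmetic. The sketch below assumes the comparison is strict ("more than"), as the wording above suggests; the function names are hypothetical.

```python
def removed_percentage(original_size, cropped_size):
    """Percent of the original image area removed by the crop."""
    ow, oh = original_size
    cw, ch = cropped_size
    return 100.0 * (ow * oh - cw * ch) / (ow * oh)


def should_skip(removed_pct, crop_threshold=20.0):
    """Skip only when the crop removes more than the threshold."""
    return removed_pct > crop_threshold


pct = removed_percentage((1000, 800), (1000, 740))
print(pct, should_skip(pct))  # → 7.5 False
```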

📁 Output Structure

output_folder/
├── original_image_crop_above.jpg
├── another_image_crop_left.jpg
├── _Skipped_/           # (if --move-skipped enabled)
│   └── skipped_files...
└── _Errored_/           # (if --move-errored enabled)
    └── errored_files...
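
The file names in the tree above follow a `<stem>_crop_<direction>.jpg` pattern. A small sketch of building such paths is shown below; the helper names are hypothetical, and the sketch ignores how subfolders are handled in recursive mode, which this README does not describe.

```python
from pathlib import Path


def output_path(src, output_folder, direction):
    """Build e.g. output_folder/original_image_crop_above.jpg from a source path."""
    return Path(output_folder) / f"{Path(src).stem}_crop_{direction}.jpg"


def skipped_path(src, output_folder):
    """Destination for a skipped file when --move-skipped is enabled."""
    return Path(output_folder) / "_Skipped_" / Path(src).name


print(output_path("photos/original_image.png", "output_folder", "above"))
```

The output extension is always `.jpg` in this sketch because the tool saves its results as JPEG regardless of the input format.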

💡 Examples

Remove Watermarks

python crop.py watermarked_images/ -r -o clean_images/ --prompt "watermark"

Remove Logos with Object Protection

python crop.py branded_images/ -r --object-aware --prompt "logo" --crop-threshold 10

Process with Error Organization

python crop.py images/ -r --move-skipped --move-errored --debug

High-Performance Batch Processing

python crop.py large_dataset/ -r --bs 4 -o processed/ --crop-threshold 25

⚠️ Important Notes

Model Requirement: Ensure Florence-2-large model is properly installed in ./Florence-2-large/
Memory Usage: Larger batch sizes require more GPU/CPU memory
Quality: Output images are saved as JPEG at quality 98
File Formats: Supports PNG, JPG, JPEG, GIF, BMP input formats
GPU Recommended: CUDA-capable GPU significantly speeds up processing
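
To make the supported input formats and the recursive option concrete, here is a sketch of how input paths could be expanded into a list of image files. It uses only the extensions listed above; the function name `collect_images` is hypothetical and crop.py may gather files differently.

```python
from pathlib import Path

# Extensions listed under "File Formats" in this README.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}


def is_image(path: Path) -> bool:
    return path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS


def collect_images(input_paths, recursive=False):
    """Expand files and folders into a list of supported image paths."""
    found = []
    for raw in input_paths:
        path = Path(raw)
        if path.is_dir():
            # rglob descends into subfolders; glob stays at the top level.
            candidates = path.rglob("*") if recursive else path.glob("*")
            found.extend(sorted(p for p in candidates if is_image(p)))
        elif is_image(path):
            found.append(path)
    return found
```

Matching on the lowercased suffix means files such as `PHOTO.JPG` are picked up as well.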

🐛 Troubleshooting

Common Issues

Model Loading Error: Ensure Florence-2-large is in the correct directory
CUDA Out of Memory: Reduce batch size (--bs 1)
No Objects Detected: Try different prompts or check image quality
Large Crop Areas: Adjust --crop-threshold value

Debug Mode

Use --debug flag to see:

Detailed object detection results
Crop calculations
Processing decisions

📊 Performance Tips

Use GPU for faster processing
Increase batch size for bulk processing (if memory allows)
Enable --object-aware only when necessary (slower but more accurate)
Use appropriate crop thresholds to avoid processing unsuitable images

🤝 Contributing

Feel free to submit issues, feature requests, or pull requests to improve this tool!

📄 License

This project uses the Florence-2-large model. Please check the model's license terms for usage restrictions.
