# Magic Crop
An AI-powered image cropping tool that removes unwanted elements such as watermarks while preserving as many pixels as possible. It uses the Florence-2-large vision-language model to detect objects and decide where to crop.
## Features
- **AI-Powered Detection**: Uses Florence-2-large model for accurate object detection
- **Smart Cropping**: Automatically determines the best crop direction (above, below, left, or right)
- **Object-Aware Cropping**: Optional feature to avoid cropping other important objects
- **Batch Processing**: Process multiple images efficiently with configurable batch sizes
- **Recursive Folder Processing**: Handle entire directory structures
- **Customizable Prompts**: Detect any object type, not just watermarks
- **Crop Threshold Protection**: Skip images that would require excessive cropping
- **Error Handling**: Robust error handling with optional file organization
- **GPU Acceleration**: Automatic CUDA detection for faster processing
## Requirements
- Python 3.7+
- PyTorch
- Transformers
- PIL (Pillow)
- tqdm
- Florence-2-large model (local model provided)
## Installation
1. **Install Python dependencies**:
```bash
pip install -r requirements.txt
```
2. **Download Florence-2-large model**:
- Place the Florence-2-large model files in a `./Florence-2-large/` directory relative to the script (the files can be downloaded with `huggingface-cli`)
- The script loads the model with `local_files_only=True`, so the files must already be present locally
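
For orientation, here is a minimal sketch of what loading from the local directory could look like. It assumes the script uses the Hugging Face `transformers` auto classes. The helper name `load_florence` is hypothetical, and the actual loading code in `crop.py` may differ.

```python
from pathlib import Path


def load_florence(model_dir="./Florence-2-large"):
    """Load Florence-2 from a local folder without contacting the Hub.

    Hypothetical helper: crop.py may load the model differently.
    """
    model_path = Path(model_dir)
    if not model_path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_path}")

    # Imported lazily so the path check above works without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(
        model_path, trust_remote_code=True, local_files_only=True
    ).to(device).eval()
    processor = AutoProcessor.from_pretrained(
        model_path, trust_remote_code=True, local_files_only=True
    )
    return model, processor, device
```

With `local_files_only=True`, a missing or incomplete model folder fails immediately instead of triggering a download.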
## Usage
### Basic Usage
```bash
python crop.py input_image.jpg --prompt "watermark"
```
### Process Multiple Images
```bash
python crop.py image1.jpg image2.jpg image3.jpg -o output_folder
```
### Process Folders Recursively
```bash
python crop.py /path/to/images/ -r -o /path/to/output/
```
### Advanced Usage with Object-Aware Cropping
```bash
python crop.py /path/to/images/ -r -o output/ --object-aware --crop-threshold 15 --prompt "logo"
```
## Command-Line Arguments
| Argument | Type | Default | Description |
| :-- | :-- | :-- | :-- |
| `input_paths` | str+ | Required | Paths to input images or folders |
| `-r, --recursive` | flag | False | Process folders recursively |
| `-o, --output_folder` | str | Input path | Output folder path |
| `--bs` | int | 1 | Batch size for processing |
| `--prompt` | str | "Watermark" | Object detection prompt |
| `--object-aware` | flag | False | Enable object-aware cropping |
| `--crop-threshold` | float | 20.0 | Max crop area percentage threshold |
| `--move-skipped` | flag | False | Copy skipped files to '_Skipped_' folder |
| `--move-errored` | flag | False | Copy errored files to '_Errored_' folder |
| `--debug` | flag | False | Enable debug output |
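
The table above maps directly onto an `argparse` definition. The sketch below is reconstructed from the table only, not copied from `crop.py`. The help strings and the use of `None` to mean "same location as the input" are assumptions.

```python
import argparse


def build_parser():
    """Parser mirroring the documented arguments (reconstructed, not the original)."""
    parser = argparse.ArgumentParser(description="Crop images to remove detected objects.")
    parser.add_argument("input_paths", nargs="+", help="Paths to input images or folders")
    parser.add_argument("-r", "--recursive", action="store_true", help="Process folders recursively")
    # Assumption: None stands for "write next to the input".
    parser.add_argument("-o", "--output_folder", default=None, help="Output folder path")
    parser.add_argument("--bs", type=int, default=1, help="Batch size for processing")
    parser.add_argument("--prompt", default="Watermark", help="Object detection prompt")
    parser.add_argument("--object-aware", action="store_true", help="Enable object-aware cropping")
    parser.add_argument("--crop-threshold", type=float, default=20.0, help="Max crop area percentage")
    parser.add_argument("--move-skipped", action="store_true")
    parser.add_argument("--move-errored", action="store_true")
    parser.add_argument("--debug", action="store_true", help="Enable debug output")
    return parser


args = build_parser().parse_args(["images/", "-r", "--object-aware", "--crop-threshold", "15"])
print(args.recursive, args.object_aware, args.crop_threshold, args.prompt)
```

Note that `argparse` turns hyphens in option names into underscores, so `--object-aware` is read as `args.object_aware`.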
## How It Works
### 1. Object Detection
The script uses Florence-2-large to detect objects matching your prompt in the image.
### 2. Smart Cropping Logic
- **Corner Distance Priority**: Prefers objects closer to image corners
- **Size Consideration**: Among corner objects, prefers smaller ones
- **Crop Direction**: Calculates crop area for all four directions (above, below, left, right)
- **Maximum Preservation**: Chooses the direction that preserves the most pixels
### 3. Object-Aware Mode
When enabled, the script:
- Detects all objects in the image
- Calculates how many object pixels would be lost with each crop direction
- Chooses the crop that minimizes object pixel loss
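
A rough sketch of the object-aware choice, under the same assumptions as before: boxes are `(x1, y1, x2, y2)` tuples and a direction names the kept region. The tie-break by larger kept area is an assumption added for illustration, and the function names are hypothetical.

```python
def intersection_area(a, b):
    """Overlap area of two (x1, y1, x2, y2) rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def object_aware_direction(target, others, width, height):
    """Pick the crop direction that loses the fewest pixels of other objects."""
    x1, y1, x2, y2 = target
    regions = {
        "above": (0, 0, width, y1),
        "below": (0, y2, width, height),
        "left": (0, 0, x1, height),
        "right": (x2, 0, width, height),
    }

    def lost(region):
        # Object pixels that fall outside the kept region.
        return sum(
            (b[2] - b[0]) * (b[3] - b[1]) - intersection_area(b, region)
            for b in others
        )

    def kept(region):
        return max(0, region[2] - region[0]) * max(0, region[3] - region[1])

    # Minimise lost object pixels; break ties by keeping the larger region.
    return min(regions, key=lambda d: (lost(regions[d]), -kept(regions[d])))


watermark = (850, 740, 990, 790)
other_objects = [(100, 700, 400, 800)]  # e.g. a subject touching the bottom edge
print(object_aware_direction(watermark, other_objects, 1000, 800))  # left
```

Without the second object, `above` keeps the most pixels. With it, `above` would cut 18,000 object pixels while `left` cuts none, so `left` is chosen.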
### 4. Safety Thresholds
- Skips images where cropping would remove more than the threshold percentage
- Default threshold: 20% of total image area
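
The threshold check amounts to a single percentage comparison. A minimal sketch, assuming the percentage is measured as removed area over total image area (the function names are hypothetical):

```python
def crop_percentage(kept_region, width, height):
    """Percentage of the image area removed when only kept_region remains."""
    left, top, right, bottom = kept_region
    kept = max(0, right - left) * max(0, bottom - top)
    return 100.0 * (1 - kept / (width * height))


def should_skip(kept_region, width, height, crop_threshold=20.0):
    """True when the crop would remove more than crop_threshold percent."""
    return crop_percentage(kept_region, width, height) > crop_threshold


# Keeping the top 740 rows of a 1000x800 image removes 7.5% of the area.
print(should_skip((0, 0, 1000, 740), 1000, 800))
# Keeping only the top 600 rows removes 25%, which exceeds the default 20%.
print(should_skip((0, 0, 1000, 600), 1000, 800))
```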
## Output Structure
```
output_folder/
├── original_image_crop_above.jpg
├── another_image_crop_left.jpg
├── _Skipped_/ # (if --move-skipped enabled)
│   └── skipped_files...
└── _Errored_/ # (if --move-errored enabled)
    └── errored_files...
```
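
The naming pattern in the tree above (`<name>_crop_<direction>.jpg`) can be reproduced with `pathlib`. This sketch assumes that when no output folder is given the result is written next to the input file. `output_path` is a hypothetical helper, not a function from `crop.py`.

```python
from pathlib import Path


def output_path(input_path, direction, output_folder=None):
    """Build '<stem>_crop_<direction>.jpg' in the output folder."""
    src = Path(input_path)
    # Assumption: default to the input file's own folder.
    folder = Path(output_folder) if output_folder else src.parent
    return folder / f"{src.stem}_crop_{direction}.jpg"


print(output_path("photos/original_image.png", "above", "output_folder"))
```

The extension is always `.jpg` because the script saves its output as JPEG regardless of the input format.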
## Examples
### Remove Watermarks
```bash
python crop.py watermarked_images/ -r -o clean_images/ --prompt "watermark"
```
### Remove Logos with Object Protection
```bash
python crop.py branded_images/ -r --object-aware --prompt "logo" --crop-threshold 10
```
### Process with Error Organization
```bash
python crop.py images/ -r --move-skipped --move-errored --debug
```
### High-Performance Batch Processing
```bash
python crop.py large_dataset/ -r --bs 4 -o processed/ --crop-threshold 25
```
## Important Notes
- **Model Requirement**: Ensure Florence-2-large model is properly installed in `./Florence-2-large/`
- **Memory Usage**: Larger batch sizes require more GPU/CPU memory
- **Quality**: Output images are saved as JPEG at quality 98
- **File Formats**: Supports PNG, JPG, JPEG, GIF, BMP input formats
- **GPU Recommended**: CUDA-capable GPU significantly speeds up processing
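
To illustrate how the supported formats and the `-r` flag interact, here is a small sketch of input collection. It assumes files are filtered by extension, case-insensitively. `collect_images` and `SUPPORTED` are hypothetical names.

```python
from pathlib import Path

# Extensions for the input formats listed above.
SUPPORTED = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}


def collect_images(input_paths, recursive=False):
    """Expand files and folders into a sorted list of supported image paths."""
    found = []
    for raw in input_paths:
        path = Path(raw)
        if path.is_dir():
            # rglob walks subfolders; iterdir stays at the top level.
            candidates = path.rglob("*") if recursive else path.iterdir()
        else:
            candidates = [path]
        found.extend(
            p for p in candidates
            if p.is_file() and p.suffix.lower() in SUPPORTED
        )
    return sorted(found)
```

Calling `collect_images(["images/"], recursive=True)` would mirror `python crop.py images/ -r`, while paths that do not exist or have other extensions are silently ignored in this sketch.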
## Troubleshooting
### Common Issues
1. **Model Loading Error**: Ensure Florence-2-large is in the correct directory
2. **CUDA Out of Memory**: Reduce batch size (`--bs 1`)
3. **No Objects Detected**: Try different prompts or check image quality
4. **Large Crop Areas**: Adjust `--crop-threshold` value
### Debug Mode
Use `--debug` flag to see:
- Detailed object detection results
- Crop calculations
- Processing decisions
## Performance Tips
- Use GPU for faster processing
- Increase batch size for bulk processing (if memory allows)
- Enable `--object-aware` only when necessary (slower but more accurate)
- Use appropriate crop thresholds to avoid processing unsuitable images
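
As a rough picture of what `--bs` controls, the sketch below splits a list of image paths into fixed-size groups, each of which would be sent to the model in one forward pass. The `batched` helper is hypothetical, and the script may batch its inputs differently.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
for batch in batched(paths, 2):
    print(batch)
```

Larger batches mean fewer forward passes but more memory per pass, which is why an out-of-memory error is usually fixed by lowering `--bs`.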
## Contributing
Feel free to submit issues, feature requests, or pull requests to improve this tool!
## License
This project uses the Florence-2-large model. Please check the model's license terms for usage restrictions.