ControlNet for Manga Colorization
Model Name: SubMaroon/ControlNet-manga-recolor
Base model: John6666/nsfw-anime-xl-v1-sdxl
Task: Conditional image generation — Colorization
Conditioning: Grayscale manga panel (lineart or filled)
Trained with: Hugging Face diffusers ControlNet training pipeline
Description
This is a custom-trained ControlNet model designed to perform automatic colorization of grayscale anime-styled images.
The model takes a black-and-white anime-styled picture (converted to RGB) as conditioning input and generates a colorized version with Stable Diffusion XL.
It acts as a ControlNet module and requires a compatible SDXL base model, such as nsfw-anime-xl-v1-sdxl or another anime/manga-focused SDXL checkpoint.
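If your source scan is a single-channel grayscale file, convert it to three-channel RGB before passing it as conditioning, and match the 768x768 training resolution listed below. A minimal preparation sketch with Pillow (the file names are placeholders):

from PIL import Image

# Open a grayscale manga scan and replicate it into three channels,
# since the ControlNet expects an RGB conditioning image.
panel = Image.open("bw_manga_panel.png").convert("L")        # force a single channel
conditioning_image = panel.convert("RGB")                    # replicate into 3 identical channels
conditioning_image = conditioning_image.resize((768, 768))   # match the training resolution
conditioning_image.save("bw_manga_panel_rgb.png")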
Training details
- Base model: John6666/nsfw-anime-xl-v1-sdxl
- Dataset: custom dataset of ~6,000 image pairs from Danbooru-based manga scans, manually cleaned and resized to 768x768
- Inputs (see the sketch after this list):
  - conditioning_image: black-and-white manga scan (RGB)
  - text prompt: optional (e.g. "1girl, blue_eyes, blue_hair")
- Loss: MSE, mixed-precision FP16, trained on 1× RTX 3090 for 4 epochs
- Resolution: 768x768
- Scheduler: default diffusers setup
- Learning rate: 1.4e-4
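The training pairs follow the column layout that the diffusers ControlNet script expects (see the CLI flags in the alternative run below): a colored target image, a grayscale conditioning image stored as RGB, and an optional tag prompt. A hedged sketch of how one such pair can be assembled from a colored source; the file name and tag string are placeholders, not values from the actual dataset:

from PIL import Image

# Build one (image, conditioning_image, prompt) record from a colored source image.
color = Image.open("danbooru_colored.png").convert("RGB").resize((768, 768))
gray = color.convert("L").convert("RGB")     # grayscale conditioning, kept as 3-channel RGB

record = {
    "image": color,                          # colorization target
    "conditioning_image": gray,              # black-and-white input
    "prompt": "1girl, blue_eyes, blue_hair", # optional caption/tags
}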
Usage (Diffusers)
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
import torch
# Load ControlNet
controlnet = ControlNetModel.from_pretrained("SubMaroon/ControlNet-manga-recolor", torch_dtype=torch.float16)
# Load base pipeline
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
"John6666/nsfw-anime-xl-v1-sdxl", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.to("cuda")
# Load grayscale manga panel
conditioning_image = load_image("bw_manga_panel.png").convert("RGB")
# Generate
image = pipe("manga colorization", image=conditioning_image, num_inference_steps=30).images[0]
image.save("colorized.png")
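The call above uses the pipeline defaults. In practice you will usually steer the result with character tags and adjust how strictly the output follows the scan; controlnet_conditioning_scale and negative_prompt are standard arguments of the diffusers ControlNet pipelines, and the values below are only starting points, not tuned recommendations:

# Same pipeline as above, with a tag-style prompt and an explicit conditioning scale.
image = pipe(
    prompt="1girl, blue_eyes, blue_hair, vivid colors",
    negative_prompt="monochrome, greyscale",
    image=conditioning_image,
    num_inference_steps=30,
    controlnet_conditioning_scale=0.8,  # lower = more color freedom, higher = stricter adherence to the lineart
).images[0]
image.save("colorized_tuned.png")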
Usage in ComfyUI / WebUI
- Place diffusion_pytorch_model.safetensors into your ComfyUI/models/controlnet/ folder
- Make sure to also include the config.json
- Select this ControlNet in your workflow
- Use grayscale images as conditioning inputs
Alternative training run (SDXL version)
This version was trained using the SDXL-compatible ControlNet pipeline with the following CLI command:
accelerate launch train_controlnet_sdxl.py \
--pretrained_model_name_or_path="John6666/nsfw-anime-xl-v1-sdxl" \
--dataset_name="SubMaroon/danbooru-colored" \
--image_column="image" \
--conditioning_image_column="conditioning_image" \
--caption_column="prompt" \
--output_dir="./controlnet-colorization" \
--resolution=768 \
--train_batch_size=4 \
--gradient_accumulation_steps=4 \
--learning_rate=1.4e-4 \
--num_train_epochs=12 \
--mixed_precision="fp16" \
--gradient_checkpointing \
--checkpointing_steps=1000 \
--validation_steps=1000 \
--report_to="tensorboard" \
--tracker_project_name="controlnet-colorization" \
--seed=42
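To reproduce the run, the dataset columns must match the --image_column, --conditioning_image_column and --caption_column flags above. A quick sanity check, assuming the SubMaroon/danbooru-colored dataset is loadable with the datasets library:

from datasets import load_dataset

# Verify that the dataset exposes the columns referenced by the CLI flags.
ds = load_dataset("SubMaroon/danbooru-colored", split="train")
assert {"image", "conditioning_image", "prompt"} <= set(ds.column_names)
print(ds)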
License
The model is released under the CreativeML Open RAIL-M license.
You are free to use it for non-commercial and research purposes. Commercial use may require additional permission.
Credits
Created by SubMaroon
Trained with compute generously provided by Flanayt Pulsar
Based on the Hugging Face diffusers ControlNet training example
Inspired by lllyasviel's original ControlNet