A Wan 2.2 14B i2v LoRA that emulates the movement style of City the Animation.

Note:

  • I expected the training run to produce separate high-noise and low-noise .safetensors files, but it produced only one. I tried it as the high-noise LoRA alone, and it worked fine that way.
  • The dataset focuses on action scenes.
  • The dataset was captioned with a custom-written video captioning tool.

Trained using musubi-tuner with the following settings for Wan 2.2 i2v:

accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py \
    --task i2v-A14B \
    --dit_high_noise /home/anon/Documents/ComfyUI/models/diffusion_models/wan2.2/wan2.2_i2v_high_noise_14B_fp16.safetensors \
    --dit /home/anon/Documents/ComfyUI/models/diffusion_models/wan2.2/wan2.2_i2v_low_noise_14B_fp16.safetensors \
    --dataset_config /home/anon/Documents/musubi-tuner/data/city-video-cfg/city-video-dataset.toml --sdpa --mixed_precision fp16 --fp8_base \
    --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --gradient_accumulation_steps 1  \
    --max_data_loader_n_workers 2 --persistent_data_loader_workers --offload_inactive_dit \
    --network_module networks.lora_wan --network_dim 32 \
    --timestep_sampling shift --timestep_boundary 900 --min_timestep 0 --max_timestep 1000 --discrete_flow_shift 3.0 \
    --max_train_epochs 16 --save_every_n_epochs 1 --seed 23571113 \
    --save_state \
    --output_dir /home/anon/Documents/musubi-tuner/data/city-video-output/ --output_name wan2.2-14b-i2v-city.safetensors \
    --logging_dir /home/anon/Documents/musubi-tuner/data/city-video-logs

The dataset consists of 101 15-second videos at 640x360, plus annotations. Training took 5.5 days on a 4090D 48GB GPU. I wanted to train at 836x480, but it would always run out of memory (OOM) before completing an epoch, so I scaled the videos down to 640x360, which is still a common SD resolution.
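
For anyone preparing a similar dataset, one way to downscale source clips to 640x360 is a small ffmpeg loop. This is only a sketch: it assumes the clips are re-encoded ahead of time rather than resized by the trainer, and SRC_DIR, DST_DIR, and the .mp4 extension are hypothetical placeholders, not the paths or tools actually used for this LoRA.

```shell
#!/usr/bin/env bash
# Sketch only: SRC_DIR and DST_DIR are hypothetical, not the author's paths.
SRC_DIR="${SRC_DIR:-raw_clips}"
DST_DIR="${DST_DIR:-clips_640x360}"

# Build the output path for a source clip (same file name, new directory).
out_path() {
    printf '%s/%s\n' "$DST_DIR" "$(basename "$1")"
}

# Re-encode one clip at 640x360; audio, if present, is copied unchanged.
downscale() {
    ffmpeg -y -i "$1" -vf "scale=640:360" -c:a copy "$(out_path "$1")"
}

# Process every .mp4 in SRC_DIR, but only if the directory exists.
if [ -d "$SRC_DIR" ]; then
    mkdir -p "$DST_DIR"
    for clip in "$SRC_DIR"/*.mp4; do
        [ -e "$clip" ] || continue   # skip the literal pattern when nothing matches
        downscale "$clip"
    done
fi
```

Run it with the two directories set in the environment, for example `SRC_DIR=/path/to/clips DST_DIR=/path/to/output bash downscale.sh`, then point the dataset config at the output directory.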

