Share your test here

#2
by YarvixPA - opened
QuantStack org

We have also uploaded the workflow with the GGUF loaders to this repo here

Hello, thanks for the quants.

workflow s2v no cloning.png

I have enhanced the workflow by adding a Scale Image To Pixels node and a custom Math Expression. The latter computes the video length from the audio length, so the generated video covers the full audio clip. This modification may be useful in some setups.
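For reference, the math behind such an expression is just audio duration times frame rate, rounded up to a valid frame count. A minimal Python sketch; the 16 fps default and the 4k+1 frame-count constraint are assumptions based on typical Wan workflows, so match them to what your graph actually uses:

```python
import math

def frames_for_audio(audio_seconds: float, fps: int = 16) -> int:
    """Frame count so the video spans the whole audio clip.

    Wan-style models usually expect a 4k+1 frame count, so we round
    the raw count up to the next value of that form.
    """
    raw = math.ceil(audio_seconds * fps)
    # round up to the nearest 4k+1 (e.g. 73 -> 73, 74 -> 77)
    k = math.ceil((raw - 1) / 4)
    return 4 * k + 1

print(frames_for_audio(4.75))  # 4.75 s of audio at 16 fps -> 77 frames
```

Plugging the audio node's duration output into an expression like this gives the length input for the video nodes automatically.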

hope it can help

workflow with cloning.png

Same as above, with voice cloning integrated via the ChatterBox SRT Voice node.

Also, for fast generation it seems to work with 4 steps:

lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors

I also tested the I2V 14B 4-step LoRA (high and low noise); it seems to work fine:

https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1


I don't understand. Did you use both the high and low I2V LoRAs in this workflow?

QuantStack org

yes


image.png

I didn't generate the full video length for the audio.

Interesting. Thanks

QuantStack org

@PiquantSalt is that with native nodes? Good job… Does the video have the workflow embedded?

@YarvixPA Yup, wf should be included. q8 is obviously superior. Sticking close to 16:9 seems to help. Native nodes + gguf (obviously) and some convenience nodes, but it should work with just native and gguf, too. Adjusted the settings a bit, euler beta57 is quick and pretty accurate.

lightx2v_t2v_14b_cfg_step_distill_v2_lora_rank256_bf16 lora for wan2.1 has given me the best results so far, others didn’t really work. It's still not 100% amazing, but at 1.5 strength it's "OK". Truncating first 1 or 2 frames (best 2 imo) improves the overall result.

Oh, and torch compile sometimes freaks out, but just clearing the cache seems to solve it.
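The head-truncation trick above is just slicing the first frames off the decoded batch. A minimal sketch, assuming frames arrive as a (frames, height, width, channels) array the way typical ComfyUI image batches do; the array shape here is a tiny stand-in, not real output:

```python
import numpy as np

# Tiny stand-in for a decoded video batch: (frames, height, width, channels).
video = np.zeros((77, 16, 16, 3), dtype=np.float32)

def truncate_head(frames: np.ndarray, n: int = 2) -> np.ndarray:
    """Drop the first n frames, which tend to carry the worst artifacts."""
    return frames[n:]

print(truncate_head(video).shape)  # (75, 16, 16, 3)
```

In a node graph the same thing is usually done with an image-batch slice/skip node rather than code; the point is only that the first frame or two are discarded before saving.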


Have you also tested the adaptive lightx2v?

@Sikaworld1990
Enjoy!

S2V GGUF Q8, cfg 1, 4 steps, 25 s/it, 77 frames, seed 19, euler beta57, inductor, sage attention. LIMITED testing. The sync might be better on slower text (I used relatively fast speech).

Because it's the best so far, I ran additional tests:
lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank256_bf16.safetensors
1.0: mid (slight color shift, ~75% synced, very slight ghosting)
1.5: good (slight color shift, ~90% synced, great movement)
1.6-1.9: good (color shift increases, lips ~90% synced, movement ranges from great to good - probably solved with different seed)
2.0: mid (color fried, ~95% synced, great movement)

Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors
1.0: fail (face ghosting, synced ~75%)
1.5: mid/good (slight color shift, ~80% synced)
I expect similar results from 1.6 upward as above

lightx2v_14B_T2V_cfg_step_distill_lora_adaptive_rank_quantile_0.15_bf16.safetensors
1.0: mid (color shift, synced ~70%)
1.5: good (slight color shift, synced ~80%)
2.0: good (color shift, synced ~90%)
Similar performance to rank256_bf16

Wan2.2-Lightning_T2V-v1.1-A14B-4steps-lora_HIGH_fp16.safetensors
1.0: mid (color shift, synced ~60%)
1.5: fail (color fried, synced ~60%)

Wan2.2-Lightning_T2V-v1.1-A14B-4steps-lora_LOW_fp16.safetensors
1.0: fail (total ghosting)
1.5: fail (weirdly enough the motion and lip sync are great, it's just a ghost talking)
2.0: mid (slight ghosting, good movement) There might be something here! res_2s beta57 - fail, euler simple - mid, dpmpp_3m_sde beta57 - mid/fail.
2.5: fail (less ghosting, bad movement)

Wan2.2-low-T2V-A14B-4steps-lora-rank64-Seko-V1.1.safetensors
1.0: fail (total ghosting)
1.5: fail (total ghosting)

Wan21_PusaV1_Lora_14B_rank512_bf16.safetensors (as addon lora)
fail (no noticeable improvement when combined with rank256_bf16)

@PiquantSalt why are you using the T2V-lora, when S2V should be more like I2V? I am using the low+high noise lightx2v-I2V-loras with acceptable results.


I agree, in theory. The reason is that my generation results were worse with the I2V LoRAs, and S2V loads both I2V and T2V LoRAs normally. See the T2V LoRA result above with the woman-in-armor video and upscale (no drift on face movement, no color drift at all). Let me run the full test on all my I2V variants and I'll come back with the exact results.


I am using the Lightning high-noise I2V LoRA and one of the "old" lightx2v T2V LoRAs, mostly the rank-64 one. From my testing, that's the best combo to keep the movement behaving properly.

One thing I noticed with Wan is that it struggles to be seamless even when using start- and end-frame images, since it bakes in lighting and/or color changes. That drastically has to improve if we are to get beyond the memes; not many people are running around with 160-200 GB of VRAM.
