Share your test here

#2
by YarvixPA - opened
QuantStack org

We have also uploaded the workflow with the GGUF loaders to this repo here

Hello, thanks for the quants.

workflow s2v no cloning.png

I have enhanced the workflow by adding a Scale Image To Pixels node and a custom Math Expression. The latter computes the video length from the audio length, so the generated video covers the full audio clip. This modification may be useful in some setups.
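For reference, the math behind such an expression is just audio duration times frame rate, rounded up to a valid frame count. A minimal Python sketch; the 16 fps default and the 4k+1 frame-count constraint are assumptions based on typical Wan workflows, so match them to what your graph actually uses:

```python
import math

def frames_for_audio(audio_seconds: float, fps: int = 16) -> int:
    """Frame count so the video spans the whole audio clip.

    Wan-style models usually expect a 4k+1 frame count, so we round
    the raw count up to the next value of that form.
    """
    raw = math.ceil(audio_seconds * fps)
    # round up to the nearest 4k+1 (e.g. 73 -> 73, 74 -> 77)
    k = math.ceil((raw - 1) / 4)
    return 4 * k + 1

print(frames_for_audio(4.75))  # 4.75 s of audio at 16 fps -> 77 frames
```

Plugging the audio node's duration output into an expression like this gives the length input for the video nodes automatically.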

hope it can help

workflow with cloning.png

Same as above, with voice cloning integrated via the ChatterBox SRT Voice node.

Also, for fast generation it seems to work with 4 steps:

lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors

I also tested the I2V 14B 4-step LoRA (high and low noise); it seems to work fine:

https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1


I don't understand. Did you use both the high and low I2V LoRAs in this workflow?

QuantStack org

yes


image.png

I didn't generate the full video length for the audio.

Interesting. Thanks

QuantStack org

@PiquantSalt is that with native nodes? Good job… Does the video have the workflow embedded?

@YarvixPA Yup, wf should be included. q8 is obviously superior. Sticking close to 16:9 seems to help. Native nodes + gguf (obviously) and some convenience nodes, but it should work with just native and gguf, too. Adjusted the settings a bit, euler beta57 is quick and pretty accurate.

lightx2v_t2v_14b_cfg_step_distill_v2_lora_rank256_bf16 lora for wan2.1 has given me the best results so far, others didn’t really work. It's still not 100% amazing, but at 1.5 strength it's "OK". Truncating first 1 or 2 frames (best 2 imo) improves the overall result.

Oh, and torch compile sometimes freaks out, but just clearing the cache seems to solve it.
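The head-truncation trick above is just slicing the first frames off the decoded batch. A minimal sketch, assuming frames arrive as a (frames, height, width, channels) array the way typical ComfyUI image batches do; the array shape here is a tiny stand-in, not real output:

```python
import numpy as np

# Tiny stand-in for a decoded video batch: (frames, height, width, channels).
video = np.zeros((77, 16, 16, 3), dtype=np.float32)

def truncate_head(frames: np.ndarray, n: int = 2) -> np.ndarray:
    """Drop the first n frames, which tend to carry the worst artifacts."""
    return frames[n:]

print(truncate_head(video).shape)  # (75, 16, 16, 3)
```

In a node graph the same thing is usually done with an image-batch slice/skip node rather than code; the point is only that the first frame or two are discarded before saving.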


Have you also tested the adaptive lightx2v?

@Sikaworld1990
Enjoy!

S2V GGUF Q8, cfg 1, 4 steps, 25 s/it, 77 frames, seed 19, euler beta57, inductor, sage attention. LIMITED testing. The sync might be better on slower text (I used relatively fast speech).

Because it's the best so far, I ran additional tests:
lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank256_bf16.safetensors
1.0: mid (slight color shift, ~75% synced, very slight ghosting)
1.5: good (slight color shift, ~90% synced, great movement)
1.6-1.9: good (color shift increases, lips ~90% synced, movement ranges from great to good - probably solved with different seed)
2.0: mid (color fried, ~95% synced, great movement)

Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors
1.0: fail (face ghosting, synced ~75%)
1.5: mid/good (slight color shift, ~80% synced)
I expect similar results from 1.6 upward as above

lightx2v_14B_T2V_cfg_step_distill_lora_adaptive_rank_quantile_0.15_bf16.safetensors
1.0: mid (color shift, synced ~70%)
1.5: good (slight color shift, synced ~80%)
2.0: good (color shift, synced ~90%)
Similar performance to rank256_bf16

Wan2.2-Lightning_T2V-v1.1-A14B-4steps-lora_HIGH_fp16.safetensors
1.0: mid (color shift, synced ~60%)
1.5: fail (color fried, synced ~60%)

Wan2.2-Lightning_T2V-v1.1-A14B-4steps-lora_LOW_fp16.safetensors
1.0: fail (total ghosting)
1.5: fail (weirdly enough the motion and lip sync are great, it's just a ghost talking)
2.0: mid (slight ghosting, good movement) There might be something here! res_2s beta57 - fail, euler simple - mid, dpmpp_3m_sde beta57 - mid/fail.
2.5: fail (less ghosting, bad movement)

Wan2.2-low-T2V-A14B-4steps-lora-rank64-Seko-V1.1.safetensors
1.0: fail (total ghosting)
1.5: fail (total ghosting)

Wan21_PusaV1_Lora_14B_rank512_bf16.safetensors (as addon lora)
fail (no noticeable improvement when combined with rank256_bf16)

@PiquantSalt why are you using the T2V-lora, when S2V should be more like I2V? I am using the low+high noise lightx2v-I2V-loras with acceptable results.


I agree, in theory. The reason is that my generation results were worse with the I2V LoRAs, and S2V loads both I2V and T2V LoRAs normally. See the T2V LoRA result above with the woman-in-armor video and upscale (no drift on face movement, no color drift at all). Let me run the full test on all my I2V variants and I'll come back with the exact results.


I am using the Lightning high-noise I2V LoRA and one of the "old" lightx2v T2V LoRAs, mostly the rank-64 one. From my testing, that's the best combo to keep the movement behaving properly.

One thing I noticed with Wan is that it struggles to be seamless even when using start- and end-frame images, since it bakes in lighting and/or color changes. That drastically has to improve if we are to get beyond the memes; not many people are running around with 160-200 GB of VRAM.
