Request to add more GGUF quantization size options

#1 opened by makisekurisu-jp

I’d like to request the addition of more GGUF quantization size options to better support different hardware setups. Thank you!

@QuantStack Q5_0, Q5_1, Q4_0, Q4_1 please!

FP16 and BF16 please, thank you!

QuantStack org

I'll do the Q5_0 one next; FP16 will come tonight (;
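
For anyone trying to match a quant to their hardware: the `gguf` Python package from the llama.cpp project can report which quantization a file actually uses and how large the weights are. A minimal sketch, with a hypothetical file name:

```python
# Minimal sketch: list the quantization type of each tensor in a GGUF file,
# using the `gguf` package from llama.cpp (pip install gguf).
# The file name below is just a placeholder.
from collections import Counter

from gguf import GGUFReader

reader = GGUFReader("Wan2.1-VACE-14B-Q5_0.gguf")  # hypothetical file name

counts = Counter()
total_bytes = 0
for tensor in reader.tensors:
    counts[tensor.tensor_type.name] += 1  # e.g. "Q5_0", "F16", "F32"
    total_bytes += int(tensor.n_bytes)

for qtype, n in counts.most_common():
    print(f"{qtype}: {n} tensors")
print(f"total weight size: {total_bytes / 2**30:.2f} GiB")
```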

QuantStack org

I might have a different solution, though: my org partner and I are currently testing a new node that lets you patch the VACE addon onto native UNet models, the way Kijai does. That means you can load a GGUF of, for example, Wan2.1 T2V or SkyReels-V2 or whatever, apply the VACE patch to it, and it works like VACE πŸ₯°
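
Conceptually, VACE is the base T2V model plus extra blocks, so "patching" means loading those extra tensors on top of whatever base checkpoint you choose. A rough sketch of that idea on raw state dicts (all file names are hypothetical, and it assumes the patch keys don't overlap the base; the actual node patches ComfyUI's loaded model objects instead):

```python
# Rough sketch of the "patch VACE onto a base model" idea: take the base
# T2V weights and add the extra VACE tensors on top. All paths are
# hypothetical; the real node works inside ComfyUI, not on raw files.
from safetensors.torch import load_file, save_file

base = load_file("wan2.1_t2v_14B.safetensors")       # hypothetical path
patch = load_file("vace_addon_weights.safetensors")  # hypothetical path

merged = dict(base)
for key, tensor in patch.items():
    if key in merged:
        raise ValueError(f"unexpected overlap on {key}")
    merged[key] = tensor  # VACE adds new blocks; base weights stay untouched

save_file(merged, "wan2.1_vace_merged.safetensors")
print(f"added {len(merged) - len(base)} VACE tensors to the base model")
```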

That's great news. I hope it also fixes TeaCache being unusable with the native nodes; currently, enabling it makes the reference image get ignored.

QuantStack org

There are still some bugs in it, but either I'll fix them or Kijai will upload a node next week anyway.

QuantStack org

Okay, I've made progress: it works! But not on GGUFs yet /:

I'd love a version that runs on 4 GB of VRAM!

QuantStack org

That's probably not a good idea in itself; you should use DisTorch to offload to system RAM instead, otherwise the generation looks like dogshit. Even Q3 is pretty rough /:
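
For reference, the offload idea behind nodes like DisTorch is to keep the weights in system RAM and stream each block to the GPU only while it runs. A minimal illustration of that pattern, not DisTorch's actual implementation:

```python
# Minimal sketch of block-wise offload: weights live in system RAM and each
# block visits the GPU only for its own forward pass. Illustration only,
# not DisTorch's actual implementation.
import torch.nn as nn

class OffloadedBlock(nn.Module):
    def __init__(self, block: nn.Module, device: str = "cuda"):
        super().__init__()
        self.block = block.to("cpu")  # weights stay in system RAM
        self.device = device

    def forward(self, x):
        self.block.to(self.device)          # stream weights in for this block
        out = self.block(x.to(self.device))
        self.block.to("cpu")                # free the VRAM again
        return out

# Usage: wrap each transformer block of a model too big for your VRAM, e.g.
# model.blocks = nn.ModuleList(OffloadedBlock(b) for b in model.blocks)
```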

QuantStack org

But if you want, I can try to go lower; I don't know if Q2 is even possible, though.
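
To see why quality collapses at very low bit widths, here is a simplified round trip through a Q4_0-style block quantizer (32 values per block, one scale per block), loosely following ggml's layout rather than its exact code:

```python
# Simplified Q4_0-style block quantization (32 floats per block, one scale)
# to show how round-trip error grows as the bit width shrinks.
# Loosely follows ggml's layout, not its exact code paths.
import numpy as np

def quant_roundtrip(x: np.ndarray, bits: int) -> np.ndarray:
    levels = 2 ** (bits - 1)  # e.g. 8 quant steps per sign at 4 bits
    blocks = x.reshape(-1, 32)
    scale = np.abs(blocks).max(axis=1, keepdims=True) / levels
    scale[scale == 0] = 1.0
    q = np.clip(np.round(blocks / scale), -levels, levels - 1)
    return (q * scale).reshape(x.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal(32 * 1024).astype(np.float32)
for bits in (8, 5, 4, 3, 2):
    err = np.abs(w - quant_roundtrip(w, bits)).mean()
    print(f"{bits}-bit: mean abs error {err:.4f}")
```

The mean error roughly doubles with each bit removed, which lines up with the visible degradation at Q3 and below.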
