Request to add more GGUF quantization size options

#1 opened by makisekurisu-jp

I’d like to request the addition of more GGUF quantization size options to better support different hardware setups. Thank you!

@QuantStack Q5_0, Q5_1, Q4_0, Q4_1 please!

FP16 and BF16 please, thank you!

QuantStack org

I'll do the Q5_0 one next; FP16 will come tonight (;
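
For anyone trying to match a quant to their hardware: the `gguf` Python package from the llama.cpp project can report which quantization a file actually uses and how large the weights are. A minimal sketch, with a hypothetical file name:

```python
# Minimal sketch: list the quantization type of each tensor in a GGUF file,
# using the `gguf` package from llama.cpp (pip install gguf).
# The file name below is just a placeholder.
from collections import Counter

from gguf import GGUFReader

reader = GGUFReader("Wan2.1-VACE-14B-Q5_0.gguf")  # hypothetical file name

counts = Counter()
total_bytes = 0
for tensor in reader.tensors:
    counts[tensor.tensor_type.name] += 1  # e.g. "Q5_0", "F16", "F32"
    total_bytes += int(tensor.n_bytes)

for qtype, n in counts.most_common():
    print(f"{qtype}: {n} tensors")
print(f"total weight size: {total_bytes / 2**30:.2f} GiB")
```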

QuantStack org

I might have a different solution, though: my org partner and I are currently testing a new node that lets you patch the VACE addon onto native UNet models, the way Kijai does. That means you can load a GGUF of, for example, Wan2.1 T2V or SkyReels-V2 or whatever, apply the VACE patch to it, and it works like VACE πŸ₯°
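
Conceptually, VACE is the base T2V model plus extra blocks, so "patching" means loading those extra tensors on top of whatever base checkpoint you choose. A rough sketch of that idea on raw state dicts (all file names are hypothetical, and it assumes the patch keys don't overlap the base; the actual node patches ComfyUI's loaded model objects instead):

```python
# Rough sketch of the "patch VACE onto a base model" idea: take the base
# T2V weights and add the extra VACE tensors on top. All paths are
# hypothetical; the real node works inside ComfyUI, not on raw files.
from safetensors.torch import load_file, save_file

base = load_file("wan2.1_t2v_14B.safetensors")       # hypothetical path
patch = load_file("vace_addon_weights.safetensors")  # hypothetical path

merged = dict(base)
for key, tensor in patch.items():
    if key in merged:
        raise ValueError(f"unexpected overlap on {key}")
    merged[key] = tensor  # VACE adds new blocks; base weights stay untouched

save_file(merged, "wan2.1_vace_merged.safetensors")
print(f"added {len(merged) - len(base)} VACE tensors to the base model")
```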

That's great news. I hope it also fixes TeaCache being unusable with the native nodes; currently, enabling it makes the reference image get ignored.

QuantStack org

There are still some bugs in it, but either I'll fix them or Kijai will upload a node next week anyway.

QuantStack org

Okay, I've made progress: it works! But not on GGUFs yet /:

I'd love a version that runs on 4 GB of VRAM!

QuantStack org

That's probably not a good idea in itself; you should use DisTorch to offload to system RAM instead, otherwise the generation looks like dogshit. Even Q3 is pretty rough /:
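
For reference, the offload idea behind nodes like DisTorch is to keep the weights in system RAM and stream each block to the GPU only while it runs. A minimal illustration of that pattern, not DisTorch's actual implementation:

```python
# Minimal sketch of block-wise offload: weights live in system RAM and each
# block visits the GPU only for its own forward pass. Illustration only,
# not DisTorch's actual implementation.
import torch.nn as nn

class OffloadedBlock(nn.Module):
    def __init__(self, block: nn.Module, device: str = "cuda"):
        super().__init__()
        self.block = block.to("cpu")  # weights stay in system RAM
        self.device = device

    def forward(self, x):
        self.block.to(self.device)          # stream weights in for this block
        out = self.block(x.to(self.device))
        self.block.to("cpu")                # free the VRAM again
        return out

# Usage: wrap each transformer block of a model too big for your VRAM, e.g.
# model.blocks = nn.ModuleList(OffloadedBlock(b) for b in model.blocks)
```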

QuantStack org

But if you want, I can try to go lower; I don't know if Q2 is even possible, though.
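
To see why quality collapses at very low bit widths, here is a simplified round trip through a Q4_0-style block quantizer (32 values per block, one scale per block), loosely following ggml's layout rather than its exact code:

```python
# Simplified Q4_0-style block quantization (32 floats per block, one scale)
# to show how round-trip error grows as the bit width shrinks.
# Loosely follows ggml's layout, not its exact code paths.
import numpy as np

def quant_roundtrip(x: np.ndarray, bits: int) -> np.ndarray:
    levels = 2 ** (bits - 1)  # e.g. 8 quant steps per sign at 4 bits
    blocks = x.reshape(-1, 32)
    scale = np.abs(blocks).max(axis=1, keepdims=True) / levels
    scale[scale == 0] = 1.0
    q = np.clip(np.round(blocks / scale), -levels, levels - 1)
    return (q * scale).reshape(x.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal(32 * 1024).astype(np.float32)
for bits in (8, 5, 4, 3, 2):
    err = np.abs(w - quant_roundtrip(w, bits)).mean()
    print(f"{bits}-bit: mean abs error {err:.4f}")
```

The mean error roughly doubles with each bit removed, which lines up with the visible degradation at Q3 and below.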
