EXL2 Quants
Awesome! Can't wait to try them out, EXL2 quants when? O(∩_∩)O
I'll ask my friend about making them. My GPU is booked right now.
Greetings! I see that the exl2 quants are still not up. Are they on the way? If not, I may quantize the model myself. Would it be okay to upload them if I do?
What environment do you use to quantize? Would the default exllamav2 0.2.9 be okay? What about the calibration dataset, do you use anything other than the default?
A lot of the volunteers doing EXL2 quants stopped because EXL3 is emerging. You're more than welcome to pick up the slack. Default is fine. With 8 head bits, I usually do 8bpw for Runpod, 5.5bpw and 5bpw for 24GB, 4bpw and 3.5bpw for 16GB, and 2.5bpw for 12GB.
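To make the bpw-to-VRAM tiers above a bit more concrete, here is a rough back-of-the-envelope sketch. It assumes a 24B-parameter model (an assumption, the thread does not state the size of the model being discussed) and counts only the quantized weights at `params * bpw / 8` bytes. It ignores the head layer, KV cache, and activations, so real usage will be higher. The names `estimate_weight_gb` and `TIERS` are made up for illustration.

```python
# Rough estimate of quantized weight size per bpw tier.
# Assumption: weights only, no KV cache / activations / head overhead.

def estimate_weight_gb(n_params: float, bpw: float) -> float:
    """Approximate weight footprint in decimal GB for a given bits-per-weight."""
    return n_params * bpw / 8 / 1e9

# Tiers as listed in the post above.
TIERS = {
    "Runpod": [8.0],
    "24GB": [5.5, 5.0],
    "16GB": [4.0, 3.5],
    "12GB": [2.5],
}

if __name__ == "__main__":
    n_params = 24e9  # hypothetical 24B model
    for tier, bpws in TIERS.items():
        for bpw in bpws:
            size = estimate_weight_gb(n_params, bpw)
            print(f"{tier:>6}: {bpw}bpw ~ {size:.1f} GB of weights")
```

The leftover headroom in each tier is what remains for context (KV cache), which is why each card size gets a lower bpw than the raw weight size alone would suggest.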
I'm also excited for Exllamav3, of course, but for most users, including me, v2 is still the way to go. I'll quantize the model, then. Expect 4, 4.5, 5, 5.5, 6, and 8bpw quants. Should I create my own repos or commit to yours?
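For anyone following along, a batch like that could be scripted roughly as below. This is only a sketch: it builds the command lines without running them, and it assumes exllamav2's `convert.py` flags as I recall them (`-i` input model, `-o` working directory, `-cf` compiled output directory, `-b` bits per weight, `-hb` head bits), so check them against your installed version. The paths and the helper name `build_convert_commands` are hypothetical.

```python
from pathlib import Path

def build_convert_commands(model_dir, work_root, out_root, bpws, head_bits=8):
    """Build one convert.py command (as an argument list) per target bpw."""
    commands = []
    for bpw in bpws:
        tag = f"{bpw}bpw"
        commands.append([
            "python", "convert.py",
            "-i", str(model_dir),               # unquantized source model
            "-o", str(Path(work_root) / tag),   # separate scratch dir per job
            "-cf", str(Path(out_root) / tag),   # finished quant ends up here
            "-b", str(bpw),                     # target bits per weight
            "-hb", str(head_bits),              # head layer bits (assumed flag)
        ])
    return commands

if __name__ == "__main__":
    # Hypothetical paths; the bpw list is the one from the post above.
    for cmd in build_convert_commands("model", "work", "out",
                                      [4.0, 4.5, 5.0, 5.5, 6.0, 8.0]):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to eyeball them before handing each list to `subprocess.run`.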
I'm personally okay with it either way. Would have to ask FrenzyBiscuit but unless he complains I don't think it should be a problem.
If you've got the spare compute, this new one really needs quants too: https://huggingface.co/ReadyArt/Broken-Tutu-24B
Alright, I'll start quantizing then. Meanwhile, you can sort out how it should be done. There's no hurry.
Apparently, someone has already quantized Broken-Tutu-24B: https://huggingface.co/models?search=Broken-Tutu-24b-exl2
Though I'll be glad to help quantize other models to exl2, if needed. I've got enough hardware to quantize models up to 24B (perhaps more, I haven't really tested it yet).
Awesome. I guess I asked for quants that weren't needed before my coffee, lol. I'll make a collection for those.
Yeah, I pinged Frenzy about it. Why don't you come hang out with us in the Discord thread? https://discord.com/channels/1238219753324281886/1332443910559105146