AQLM version please
Hello,
You have done a great job! Could you post an AQLM-quantized version of this model, please? (https://arxiv.org/abs/2401.06118)
Thank you!
Thanks! I don't think it's feasible for me, unfortunately. It would take several days on an A100 with 80 GB of VRAM (see https://github.com/Vahe1994/AQLM?tab=readme-ov-file#quantization-time). I'd be happy if anyone has the compute to do it, though.
Thank you for your answer. That's okay :( I understand; I don't have a supercomputer at my disposal either. By the way, I only found out about the AQLM quantization method a few days ago, and now I've found one that is supposed to be even better: QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models (https://huggingface.co/papers/2310.16795). Could you try this one on your great model? The goal is to make it more affordable to run on lower-performance hardware. Thank you again!