30B A3B GGUFs for Ollama and LM Studio?
#1
by
StatusQuo209
- opened
I know this is not a 4B question specifically and I didn't know where else to ask, but I'm wondering if there are plans to release a 30B MoE model as well? It would be amazing for those of us using CPU inference.
If you do, just be sure to implement the chat template correctly. Your other GGUFs don't work in LM Studio; your Ollama models are fine, however.
Thank you. Your models are amazing.
Hey, the 14B and a small MoE model will definitely be released too. The 14B will come later this week; the MoE will take longer since it's a new architecture that needs some code adjustments.