30B A3B GGUFs for Ollama and LM Studio?
#1
by
StatusQuo209
- opened
I know this is not a 4B question specifically and I didn't know where else to ask, but I'm wondering if there are plans to release a 30B MoE model as well? It would be amazing for those of us using CPU inference.
If you do, just be sure to implement the chat template correctly. Your other GGUFs don't work in LM Studio; your Ollama models are fine, however.
Thank you. Your models are amazing.
Hey, the 14B and a small MoE model will definitely be released too. The 14B will come later this week; the MoE will take longer since it's a new architecture that needs some code adjustments.