Can't get it working in ollama or llama.cpp but it works in Backyard AI on Apple Silicon

#1
by DopeSwag - opened

load_tensors: loading model tensors, this can take a while... (mmap = true)
llama_model_load: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/user/ollama-models/mixtral.8x7b.sensualize-mixtral.gguf_v2.q4_k_m.gguf'
main: error: unable to load model


This quant is more than a year old, and some months ago there was a breaking change in how llama.cpp handles MoE models. See this post for more details:
https://huggingface.co/posts/bartowski/894091265291588
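The breaking change merged the per-expert FFN tensors of MoE models into single stacked tensors, which is why current llama.cpp looks for `blk.0.ffn_down_exps.weight` and fails on older quants. As a rough sketch (not part of llama.cpp; the helper function and the per-expert name pattern are illustrative, based on the error message above), you can tell the two layouts apart just from the tensor names:

```python
# Heuristic check for the llama.cpp MoE tensor-naming change that breaks
# old Mixtral GGUF quants. Older quants store one tensor per expert
# (e.g. "blk.0.ffn_down.0.weight"); current llama.cpp expects the experts
# merged into a single tensor (e.g. "blk.0.ffn_down_exps.weight").
import re

def moe_naming_style(tensor_names):
    """Return 'merged', 'per-expert', or 'unknown' for a list of GGUF tensor names."""
    if any(name.endswith("_exps.weight") for name in tensor_names):
        return "merged"       # layout current llama.cpp can load
    if any(re.match(r"blk\.\d+\.ffn_(down|gate|up)\.\d+\.weight", name)
           for name in tensor_names):
        return "per-expert"   # pre-change layout: re-download or re-quantize
    return "unknown"

# Old-style quant, as in the failing file above:
print(moe_naming_style(["blk.0.ffn_down.0.weight", "blk.0.ffn_down.1.weight"]))
# -> per-expert
# New-style quant:
print(moe_naming_style(["blk.0.ffn_down_exps.weight"]))
# -> merged
```

In practice the fix is to grab a quant made with a recent llama.cpp build (or re-quantize from the original weights) rather than patching the file; Backyard AI presumably still works because it ships an older llama.cpp that understands the per-expert layout.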
