Unknown model arch H1

#1
by supercharge19 - opened

I got this error:

llama_model_loader: - type  f32:  217 tensors
llama_model_loader: - type q5_0:  217 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q5_0
print_info: file size   = 1.01 GiB (5.60 BPW)
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'falcon-h1'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model './models/Falcon-H1-1B-Instruct-Q5_0.gguf'
srv    load_model: failed to load model, './models/Falcon-H1-1B-Instruct-Q5_0.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error

I built llama.cpp from the fork (the link was in the model card), so it should be working. However, I realized that the model was uploaded recently (9 days ago as of now) and the latest change in the repo was a week ago, so I doubt that support has been properly added. Kindly add support for these (H1) models.

By the way, what does H1 mean?

Technology Innovation Institute org

@ybelkada (from the Falcon team) and I have reworked the FalconH1 implementation for llama.cpp, with revisions from the core contributors. Our updated implementation has now been successfully merged into the main/master branch of llama.cpp.

Feel free to pull the latest changes, rebuild llama.cpp, and enjoy experimenting with FalconH1! We have uploaded the new GGUFs; at the moment, only the 34B version is still uploading.
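
For anyone hitting the same error, here is a minimal sketch of the pull-and-rebuild steps. It assumes a standard CPU CMake build of upstream llama.cpp; the model path is the one from the error log above, and the backend flag shown in the comment is optional.

# get the current llama.cpp master (Falcon-H1 support is merged there)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# standard CMake build; add your usual backend flags (e.g. -DGGML_CUDA=ON) if you use a GPU
cmake -B build
cmake --build build --config Release -j

# point llama-server at the GGUF that previously failed to load
./build/bin/llama-server -m ./models/Falcon-H1-1B-Instruct-Q5_0.gguf

If you already have a checkout of the old fork, pulling the latest master and rebuilding in the same way should be enough.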

JingweiZuo changed discussion status to closed
