How to use your split GGUF models

by tamburin - opened

I was able to use your model "Llama-3.3-70B-Instruct-Q4_K_M.gguf" without problems: huggingface_hub finds and downloads it just fine.
Now I would like to use a larger model that is split into multiple files, namely "Llama-3.3-70B-Instruct-Q6_K_L", but I have no idea how to load it. Using
model_path = hf_hub_download(model_name, filename=gguf_f)
I was able to download and use the first model, but with the second I get an error.
What is the right way to download and use split models? Should I merge the shards in some way?
Thanks

To download, you'll want to run:

huggingface-cli download bartowski/Llama-3.3-70B-Instruct-GGUF --include "Llama-3.3-70B-Instruct-Q6_K_L/*" --local-dir ./

You can then just point your tool at the first part that gets downloaded, and it should pick up the remaining parts on its own.
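If you'd rather stay in Python than use the CLI, the same thing can be done with `snapshot_download` and an `allow_patterns` filter. This is a sketch, not an official recipe: the helper names (`download_split_gguf`, `first_shard`) are made up here, and it assumes the shard naming convention llama.cpp uses (`...-00001-of-00002.gguf`), where you load the first part and the library finds the rest.

```python
def first_shard(gguf_paths):
    """Pick the shard to load. Split GGUFs are named like
    'model-00001-of-00002.gguf', so a lexicographic sort puts
    the first part first; that is the path you hand to your tool."""
    if not gguf_paths:
        raise FileNotFoundError("no .gguf shards found")
    return sorted(gguf_paths)[0]


def download_split_gguf(repo_id, subfolder, local_dir="."):
    """Download all shards in one repo subfolder, return the first shard's path.
    (Hypothetical helper; requires the huggingface_hub package.)"""
    import glob
    import os

    from huggingface_hub import snapshot_download

    # Equivalent of:
    #   huggingface-cli download <repo_id> --include "<subfolder>/*" --local-dir <local_dir>
    snapshot_download(
        repo_id=repo_id,
        allow_patterns=[f"{subfolder}/*"],
        local_dir=local_dir,
    )
    return first_shard(glob.glob(os.path.join(local_dir, subfolder, "*.gguf")))


if __name__ == "__main__":
    path = download_split_gguf(
        "bartowski/Llama-3.3-70B-Instruct-GGUF",
        "Llama-3.3-70B-Instruct-Q6_K_L",
    )
    # Point llama.cpp / llama-cpp-python at `path`; it loads the other
    # parts automatically as long as they sit in the same directory.
    print(path)
```

Do not merge the shards by concatenating them yourself; tools that understand split GGUF only need the first file, and llama.cpp ships a `llama-gguf-split --merge` utility if you genuinely need a single file.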

It works!
Thanks
