How to use your split GGUF models
I was able to use your model "Llama-3.3-70B-Instruct-Q4_K_M.gguf" without problems: huggingface_hub finds and downloads it just fine.
Now I would like to use a larger model that is split into shards, namely "Llama-3.3-70B-Instruct-Q6_K_L", but I have no idea how to load it. Using
model_path = hf_hub_download(model_name, filename=gguf_f)
I was able to download and use the first model, but with the second I get an error.
What is the right way to download and use split models? Should I merge the shards in some way?
Thanks
To download you'll want to run:
huggingface-cli download bartowski/Llama-3.3-70B-Instruct-GGUF --include "Llama-3.3-70B-Instruct-Q6_K_L/*" --local-dir ./
You can then point your tool at the first part that was downloaded, and it should pick up the remaining shards on its own.
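For anyone doing this from Python rather than the CLI: a minimal sketch of finding the first shard after download, assuming the shards follow llama.cpp's split naming convention (`<name>-00001-of-NNNNN.gguf`). The demo uses empty placeholder files in place of real downloaded shards, and the helper function name is my own.

```python
from pathlib import Path
import tempfile

def first_shard(directory: str) -> Path:
    """Return the first shard of a split GGUF model in `directory`.

    Assumes llama.cpp's split naming convention, e.g.
    Llama-3.3-70B-Instruct-Q6_K_L-00001-of-00002.gguf
    """
    shards = sorted(Path(directory).glob("*-00001-of-*.gguf"))
    if not shards:
        raise FileNotFoundError(f"no first GGUF shard found in {directory}")
    return shards[0]

# Demo: placeholder files standing in for the real downloaded shards.
with tempfile.TemporaryDirectory() as d:
    for part in ("00001", "00002"):
        (Path(d) / f"Llama-3.3-70B-Instruct-Q6_K_L-{part}-of-00002.gguf").touch()
    print(first_shard(d).name)
    # → Llama-3.3-70B-Instruct-Q6_K_L-00001-of-00002.gguf
```

To fetch the shards from Python you could use `huggingface_hub.snapshot_download` with `allow_patterns="Llama-3.3-70B-Instruct-Q6_K_L/*"` (the Python counterpart of the `--include` flag above), then pass the path returned by `first_shard` as `model_path` to your loader; llama.cpp-based tools detect and load the remaining parts automatically, so no merging is needed.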
It works!
Thanks