gguf

#17

by daisr - opened Jul 20, 2024

Discussion

daisr

Jul 20, 2024

gguf, pls

IndrasMirror

Jul 20, 2024

Ollama, pls :)

deleted

Jul 20, 2024

It's out there if you search.

LeroyDyer

Jul 20, 2024

It's out there if you search.

OK, so it's not available as GGUF yet because it can't be converted that easily?

deleted

Jul 20, 2024

edited Jul 20, 2024

Did you even bother searching? I see more than one in a simple search.

EDIT: There's even one for Ollama (and Ollama will import GGUF, at least it does for me).

turentado

Jul 20, 2024

llama.cpp doesn't support this model yet

deleted

Jul 20, 2024

llama.cpp doesn't support this model yet

It's in one of the branches now. Personally, I'm waiting until it's released, but it's there.

sm54

Jul 21, 2024

Try the exl2 quants; they work. I'm using Turboderp's 8.0bpw version and can run it in Text generation webui with 128k context (with an 8-bit cache) within 24 GB of GPU memory. It's a good model.

LeroyDyer

Jul 21, 2024

It should be fine now!

It's in Unsloth and in llama.cpp (they had to update the embeddings).

bartowski

Jul 22, 2024

Llama.cpp PR got merged

deleted

Jul 22, 2024

Llama.cpp PR got merged

Cool, time to look then :)

ZeroWw

Jul 22, 2024

Mine works:

ZeroWw/Mistral-Nemo-Instruct-2407-GGUF
