GGUF?

#2
by Alastar-Smith - opened

Is there a way that we can use it in LMStudio as GGUF Qs?

I will try to make it work; maybe the recent changes in llama.cpp make it possible (;

Bad news: currently there is more work needed.

I have managed to create a Q8_0 GGUF and an mmproj GGUF; now I need to test inference.
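For reference, a conversion along these lines can be sketched with llama.cpp's `convert_hf_to_gguf.py`. This is only a sketch: the model path and output filenames are placeholders, not the exact commands used here, and `--mmproj` support depends on having a recent llama.cpp checkout and a supported multimodal architecture.

```shell
# Convert the HF model to a Q8_0 GGUF (paths/filenames are placeholders).
python convert_hf_to_gguf.py /path/to/model --outtype q8_0 --outfile model-Q8_0.gguf

# Export the multimodal projector as its own GGUF
# (recent llama.cpp builds accept --mmproj for supported vision models).
python convert_hf_to_gguf.py /path/to/model --mmproj --outfile mmproj-model.gguf
```

Inference would then load both files, e.g. the main GGUF plus the mmproj GGUF in a multimodal-capable runtime.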

Is there a way that we can use it in LMStudio as GGUF Qs?

are you currently online?

Sorry, I was sleeping.
Ready to test stuff!

I created a FP8 version for vLLM inference, should work on 16GiB VRAM cards.
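As a sketch, an FP8 checkpoint like that is typically served with vLLM along these lines (the model ID and context length are placeholder assumptions, not the actual repo):

```shell
# Serve a pre-quantized FP8 checkpoint with vLLM (model ID is a placeholder).
# On supported GPUs, vLLM can also apply on-the-fly FP8 via --quantization fp8.
vllm serve your-namespace/model-FP8 --max-model-len 8192
```

FP8 roughly halves the weight memory versus FP16, which is why a model of this size can fit on 16 GiB VRAM cards.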

Edit: Misread your post, never mind.

Didn't get it working yet; I'll need to implement support for that in llama.cpp. Whether that will be successful, idk xD
