Will this method of quantization appear in Ollama?
Hello!
I would really like to run models quantized with this method on my (very weak) computer.
Can you tell me if this method will be available in Ollama?
Hi Regin,
Thanks for the suggestion!
I am very interested in adapting this method for use with Ollama.
When using VPTQ or similar methods, what are your most important requirements?
What device are you using? That would help us understand the use case and the motivation for integrating it into Ollama.
Ollama's backend is llama.cpp, so we could support VPTQ in llama.cpp as a first step.
You see, I'm using a relatively powerful laptop. That said, I only have 8 GB of memory. I might build myself a server for LLMs, but that's a question for another day.
I need a very small model at high quality.
I guess my priorities are RAG and programming.
In addition, I would like to train micro-models for my tasks. Is there any way to further train your quantized models, something like QLoRA?
As the VPTQ maintainer mentioned, they will release the quantization code (https://github.com/microsoft/VPTQ/issues/29), so I expect you will be able to quantize your own pre-trained model, or integrate QLoRA after quantization (see the sketch below).
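Not an official recipe, just a minimal sketch of what "QLoRA after quantization" could look like, using Hugging Face peft to attach trainable LoRA adapters to a frozen quantized base model. The model id and target module names are assumptions (Llama-style attention projections), and whether peft composes cleanly with VPTQ layers is untested:

```python
# Minimal QLoRA-style sketch (not official VPTQ/Ollama code):
# train small LoRA adapters on top of a frozen, quantized base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Hypothetical model id; substitute a real quantized checkpoint.
model_id = "your-org/your-quantized-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
base = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

lora_cfg = LoraConfig(
    r=8,                                   # low-rank adapter dimension
    lora_alpha=16,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # assumption: Llama-style layer names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# get_peft_model freezes the base weights; only the adapters train.
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
# From here, fine-tune with your usual Trainer / training loop.
```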
Thanks!
So it will be possible to run and pre-train quantized models?
Yes, please wait a few weeks.
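In the meantime, the VPTQ repository already demonstrates running the community quantized checkpoints directly in Python. A minimal sketch, assuming the vptq package (pip install vptq) and the AutoModelForCausalLM wrapper its README shows; the model id is a placeholder for a checkpoint from the VPTQ-community collection on Hugging Face:

```python
# Minimal inference sketch, assuming the vptq package's
# AutoModelForCausalLM wrapper as shown in the VPTQ README.
import transformers
import vptq

# Placeholder id; pick a real checkpoint from the VPTQ-community collection.
model_id = "VPTQ-community/<quantized-model>"

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = vptq.AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, VPTQ!", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```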