Ling-V2
Collection
11 items
โข
Updated
โข
26
Use https://github.com/im0qianqian/llama.cpp to quantize.
For model inference, please download our release package from this url https://github.com/im0qianqian/llama.cpp/releases .
# Use a local model file
llama-cli -m my_model.gguf
# Launch OpenAI-compatible API server
llama-server -m my_model.gguf
Let's look forward to the following PR being merged:
2-bit
4-bit
6-bit
8-bit
16-bit
Base model
inclusionAI/Ling-mini-base-2.0