Kodify-Nano-GGUF 🤖

Kodify-Nano-GGUF is a 4-bit quantized GGUF version of MTSAIR/Kodify-Nano, optimized for CPU inference. It is a lightweight LLM for code development tasks with minimal resource requirements.

Download Models

Available quantization variants:

  • Kodify_Nano_q4_k_s.gguf (4-bit, smallest file, balanced quality)
  • Kodify_Nano_q8_0.gguf (8-bit, high quality)
  • Kodify_Nano.gguf (unquantized, best quality)

Download using huggingface_hub:

pip install huggingface-hub
python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='MTSAIR/Kodify-Nano-GGUF', filename='Kodify_Nano_q4_k_s.gguf', local_dir='./models')"
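
If you prefer a script to the one-liner, the same download can be written out in a few lines (a minimal sketch; pick whichever filename from the list above you need):

from huggingface_hub import hf_hub_download

# Download one of the GGUF variants listed above into ./models
model_path = hf_hub_download(
    repo_id="MTSAIR/Kodify-Nano-GGUF",
    filename="Kodify_Nano_q4_k_s.gguf",  # or Kodify_Nano_q8_0.gguf / Kodify_Nano.gguf
    local_dir="./models",
)
print(f"Model saved to {model_path}")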

Using with Ollama

  1. Install Ollama: https://ollama.com/download

  2. Create a Modelfile:

FROM ./models/Kodify_Nano_q4_k_s.gguf
PARAMETER temperature 0.4
PARAMETER top_p 0.8
PARAMETER num_ctx 8192
TEMPLATE """<s>[INST] {{ .System }} {{ .Prompt }} [/INST]"""
  3. Create and run the model:

ollama create kodify-nano -f Modelfile
ollama run kodify-nano "Write a Python function to check prime numbers"

Python Integration

Install the Ollama Python library:

pip install ollama

Example code:

import ollama

response = ollama.generate(
    model="kodify-nano",
    prompt="Write a Python function to calculate factorial",
    options={
        "temperature": 0.4,
        "top_p": 0.8,
        "num_ctx": 8192
    }
)

print(response['response'])
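
For longer completions you can also stream tokens as they are generated. A minimal sketch using the same library (stream=True makes ollama.generate return an iterator of partial responses):

import ollama

# Stream the completion chunk by chunk instead of waiting for the full response
stream = ollama.generate(
    model="kodify-nano",
    prompt="Write a Python function to reverse a linked list",
    options={"temperature": 0.4, "top_p": 0.8, "num_ctx": 8192},
    stream=True
)

for chunk in stream:
    print(chunk["response"], end="", flush=True)
print()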

Usage Examples

Code Generation

response = ollama.generate(
    model="kodify-nano",
    prompt="""<s>[INST] 
Write a Python function that:
1. Accepts a list of numbers
2. Returns the median value
[/INST]""",
    options={"max_tokens": 512}
)
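
As in the earlier example, the generated code is returned in response['response']; a quick way to inspect it or save it to a file (the file name here is only an example):

generated = response["response"]
print(generated)

# Save the generated function for further editing (example path)
with open("median.py", "w") as f:
    f.write(generated)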

Code Refactoring

response = ollama.generate(
    model="kodify-nano",
    prompt="""<s>[INST] 
Refactor this Python code:

def calc(a,b):
    s = a + b
    d = a - b
    p = a * b
    return s, d, p
[/INST]""",
    options={"temperature": 0.3}
)
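
For multi-turn interactions, such as asking follow-up questions about the refactored code, the ollama package also provides a chat interface. A minimal sketch, assuming the kodify-nano model created above (the prompts are illustrative):

import ollama

# First turn: ask for a refactor
messages = [
    {"role": "user", "content": "Refactor this Python function:\n\ndef calc(a,b):\n    return a+b, a-b, a*b"}
]
first = ollama.chat(model="kodify-nano", messages=messages)
print(first["message"]["content"])

# Keep the assistant reply in the history and ask a follow-up
messages.append({"role": "assistant", "content": first["message"]["content"]})
messages.append({"role": "user", "content": "Now add type hints and a docstring."})
second = ollama.chat(model="kodify-nano", messages=messages)
print(second["message"]["content"])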
Model Details

  • Format: GGUF
  • Model size: 1.54B params
  • Architecture: qwen2