
## Model Overview

- **Model Name:** granite-3.2-2b-instruct-abliterated-q4_k_m
- **Base Model:** Damien420/granite-3.2-2b-instruct-abliterated
- **Repository:** Damien420/granite-3.2-2b-instruct-abliterated-gguf-quantized
- **Model Type:** Instruction-tuned language model (quantized)
- **Parameter Count:** 2.53 billion (nominally 2B)
- **Architecture:** granite
- **Format:** GGUF (quantized to Q4_K_M)
- **Creator:** Damien Chakma (Damien420)
- **License:** Apache 2.0
- **Date:** March 3, 2025

## Model Description

This model is a quantized version of granite-3.2-2b-instruct-abliterated, an instruction-tuned language model with roughly 2.5 billion parameters. It was converted to the GGUF format and quantized to Q4_K_M with llama.cpp so it runs efficiently on a wide range of hardware, including low-resource devices. Quantization substantially reduces memory usage while preserving most of the model's accuracy, making it well suited to resource-constrained deployments.
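As a starting point, here is a minimal sketch of fetching the quantized file from the Hub with `huggingface_hub`. The GGUF filename passed to `filename` is an assumption; check the repository's Files tab for the actual name.

```python
# Download the Q4_K_M GGUF file from the Hub.
# NOTE: the exact filename below is an assumption -- verify it in the repo's Files tab.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="Damien420/granite-3.2-2b-instruct-abliterated-gguf-quantized",
    filename="granite-3.2-2b-instruct-abliterated-q4_k_m.gguf",  # assumed name
)
print(model_path)  # local path to the downloaded GGUF file
```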

## Key Features

- **Quantization:** Q4_K_M (4-bit quantization, medium precision), balancing output quality against memory use and speed.
- **Format:** GGUF, compatible with tools such as Ollama and llama.cpp.
- **Intended Use:** Instruction-following tasks, chat applications, and lightweight inference.

## Usage

Prerequisites:

- Install Ollama or llama.cpp for inference (a Python sketch using `llama-cpp-python` follows below).
- Ensure sufficient disk space (~2-3 GB) and RAM (at least 4 GB recommended).
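As one concrete option, the sketch below runs chat inference through the `llama-cpp-python` bindings (`pip install llama-cpp-python`). The model path matches the assumed filename from the download sketch above, and the context size and sampling settings are illustrative assumptions, not values published with this model.

```python
# Minimal chat inference with llama-cpp-python (a sketch, not the card's official recipe).
from llama_cpp import Llama

llm = Llama(
    model_path="granite-3.2-2b-instruct-abliterated-q4_k_m.gguf",  # assumed filename
    n_ctx=4096,   # context window; an assumption -- adjust to your RAM budget
    n_threads=4,  # CPU threads to use for inference
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF quantization does."}],
    max_tokens=256,
    temperature=0.7,  # illustrative sampling settings
)
print(response["choices"][0]["message"]["content"])
```

Ollama users can alternatively pull GGUF repos from the Hub directly, e.g. `ollama run hf.co/Damien420/granite-3.2-2b-instruct-abliterated-gguf-quantized`, provided the repository layout is one Ollama can resolve.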
