This repository contains models that have been converted to the GGUF format with various quantizations from an IBM Granite base model.

Please reference the base model's full model card here: https://huggingface.co/ibm-granite/granite-guardian-3.2-5b

granite-guardian-3.2-5b-GGUF

Model Summary

Granite Guardian 3.2 5B is a thinned down version of Granite Guardian 3.1 8B designed to detect risks in prompts and responses. It can help with risk detection along many key dimensions catalogued in the IBM AI Risk Atlas.

To generate this model, the Granite Guardian is iteratively pruned and healed on the same unique data comprising human annotations and synthetic data informed by internal red-teaming used for its training. About 30% of the original parameters were removed allowing for faster inference and lower resource requirements while still providing competitive performance. It outperforms other open-source models in the same space on standard benchmarks. The thinning procedure based on iterative pruning and healing is described in more details in its own section below.

Developers: IBM Research
GitHub Repository: ibm-granite/granite-guardian
Cookbook: Granite Guardian Recipes
Website: Granite Guardian Docs
Paper: Granite Guardian
Release Date: February 26, 2024
License: Apache 2.0

Usage

Intended Use

Granite Guardian is useful for risk detection use-cases which are applicable across a wide-range of enterprise applications -

Detecting harm-related risks within prompt text, model responses, or conversations (as guardrails). These present fundamentally different use cases as the first assesses user supplied text, the second evaluates model generated text, and the third evaluates the last turn of a conversation.
RAG (retrieval-augmented generation) use-case where the guardian model assesses three key issues: context relevance (whether the retrieved context is relevant to the query), groundedness (whether the response is accurate and faithful to the provided context), and answer relevance (whether the response directly addresses the user's query).
Function calling risk detection within agentic workflows, where Granite Guardian evaluates intermediate steps for syntactic and semantic hallucinations. This includes assessing the validity of function calls and detecting fabricated information, particularly during query translation.

ibm-research
/

granite-guardian-3.2-5b-GGUF

granite-guardian-3.2-5b-GGUF

Model Summary

Usage

Intended Use

Model tree for ibm-research/granite-guardian-3.2-5b-GGUF

Collection including ibm-research/granite-guardian-3.2-5b-GGUF

Granite 3.2 Models (GGUF)