CogniDet: Cognitive Faithfulness Detector for LLMs
CogniDet is a state-of-the-art model for detecting both factual and cognitive hallucinations in Large Language Model (LLM) outputs. Developed as part of the CogniBench framework, it specifically addresses the challenge of evaluating inference-based statements beyond simple fact regurgitation.
Key Features
Dual Detection Capability
Identifies both:
- Factual Hallucinations (claims contradicting provided context)
- Cognitive Hallucinations (unsupported inferences/evaluations)
Legal-Inspired Rigor
Incorporates a tiered evaluation framework (Rational → Grounded → Unequivocal) inspired by legal evidence standards.
Efficient Inference
Single-pass detection with an 8B-parameter Llama 3 backbone (faster than NLI-based methods).
Large-Scale Training
Trained on CogniBench-L (24k+ dialogues, 234k+ annotated sentences).
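The tiered evidence framework can be pictured as an ordered scale, where a statement counts as faithful once it clears a chosen tier. The sketch below is purely illustrative; the `EvidenceTier` enum, its numeric ordering, and the `is_faithful` helper are hypothetical names, not part of the released model API:

```python
from enum import IntEnum

class EvidenceTier(IntEnum):
    # Tiers from the legal-inspired framework, ordered by increasing
    # evidential strength (the numeric ordering is an assumption).
    RATIONAL = 1     # plausible inference, weakly supported by context
    GROUNDED = 2     # inference backed by explicit context evidence
    UNEQUIVOCAL = 3  # directly entailed by the context

def is_faithful(tier: EvidenceTier,
                threshold: EvidenceTier = EvidenceTier.GROUNDED) -> bool:
    # Hypothetical helper: statements at or above the threshold tier are
    # treated as faithful; anything below is a cognitive-hallucination
    # candidate.
    return tier >= threshold
```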
Performance
| Detection Type | F1 Score |
|---|---|
| Overall | 70.30 |
| Factual Hallucination | 64.40 |
| Cognitive Hallucination | 73.80 |
Outperforms baselines such as SelfCheckGPT (61.1 F1 on cognitive hallucinations) and RAGTruth (45.3 F1 on factual hallucinations).
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "future7/CogniDet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def detect_hallucinations(context, response):
    inputs = tokenizer(
        f"CONTEXT: {context}\nRESPONSE: {response}\nHALLUCINATIONS:",
        return_tensors="pt",
    )
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
context = "Moringa trees grow in USDA zones 9-10. Flowering occurs annually in spring."
response = "In cold regions, Moringa can bloom twice yearly if grown indoors."
print(detect_hallucinations(context, response))
# Output: "Bloom frequency claims in cold regions are speculative"
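Since the training data is annotated at the sentence level, a response can also be scanned sentence by sentence to localize hallucinations. The helpers below are an illustrative sketch (the `split_sentences` and `build_prompt` names are hypothetical; `build_prompt` reuses the prompt template shown above):

```python
import re

def split_sentences(response: str) -> list[str]:
    # Naive punctuation-based splitter; swap in a proper sentence
    # segmenter for production use.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]

def build_prompt(context: str, sentence: str) -> str:
    # Same template as detect_hallucinations above, applied per sentence.
    return f"CONTEXT: {context}\nRESPONSE: {sentence}\nHALLUCINATIONS:"
```

Each per-sentence prompt can then be passed through `model.generate` exactly as in the single-pass example.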
Training Data
Trained on CogniBench-L featuring:
- 7,058 knowledge-grounded dialogues
- 234,164 sentence-level annotations
- Balanced coverage across 15+ domains (Medical, Legal, etc.)
- Auto-labeled via a rigorous pipeline (82.2% agreement with human annotators)
Limitations
- Best performance on English knowledge-grounded dialogues
- Domain-specific applications (e.g., clinical diagnosis) may require fine-tuning
- Context window limited to 8K tokens
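Given the 8K-token context window, long inputs need clipping before detection. Below is a rough stdlib-only sketch using a characters-per-token heuristic (`truncate_to_budget` is a hypothetical helper; for exact budgeting, count tokens with the model's own tokenizer):

```python
def truncate_to_budget(text: str, max_tokens: int = 8000,
                       chars_per_token: int = 4) -> str:
    # ~4 characters per token is a common rough estimate for English
    # text; it is an approximation, not the model tokenizer's count.
    # max_tokens defaults below 8K to leave headroom for the response
    # and generated output.
    budget = max_tokens * chars_per_token
    return text if len(text) <= budget else text[:budget]
```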
Citation
If you use CogniDet, please cite the CogniBench paper:
@inproceedings{tang2025cognibench,
  title     = {CogniBench: A Legal-inspired Framework for Assessing Cognitive Faithfulness of LLMs},
  author    = {Tang, Xiaqiang and Li, Jian and Hu, Keyu and Nan, Du and
               Li, Xiaolong and Zhang, Xi and Sun, Weigao and Xie, Sihong},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year      = {2025},
  pages     = {xxx--xxx}, % add page range
  publisher = {Association for Computational Linguistics},
  address   = {Vienna, Austria},
  url       = {https://arxiv.org/abs/2505.20767},
  archivePrefix = {arXiv},
  eprint    = {2505.20767},
  primaryClass  = {cs.CL}
}
Resources
- Base model: meta-llama/Meta-Llama-3-8B