---
library_name: transformers
pipeline_tag: text-generation
tags:
- text faithfulness
- hallucination detection
- RAG evaluation
- cognitive statements
- factual consistency
datasets:
- future7/CogniBench
- future7/CogniBench-L
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B
---

# CogniDet: Cognitive Faithfulness Detector for LLMs

**CogniDet** is a state-of-the-art model for detecting **both factual and cognitive hallucinations** in Large Language Model (LLM) outputs. Developed as part of the [CogniBench](https://github.com/FUTUREEEEEE/CogniBench) framework, it specifically addresses the challenge of evaluating inference-based statements that go beyond simple fact regurgitation.

## Key Features ✨

1. **Dual Detection Capability**
   Identifies both:
   - **Factual hallucinations**: claims that contradict the provided context
   - **Cognitive hallucinations**: unsupported inferences or evaluations
2. **Legal-Inspired Rigor**
   Incorporates a tiered evaluation framework (Rational → Grounded → Unequivocal) inspired by legal evidence standards
3. **Efficient Inference**
   Single-pass detection with an **8B-parameter Llama 3 backbone** (faster than NLI-based methods)
4. **Large-Scale Training**
   Trained on **CogniBench-L** (24k+ dialogues, 234k+ annotated sentences)

## Performance 🚀

| Detection Type              | F1 Score  |
|-----------------------------|-----------|
| **Overall**                 | 70.30     |
| Factual Hallucination       | 64.40     |
| **Cognitive Hallucination** | **73.80** |

*Outperforms baselines such as SelfCheckGPT (61.1 F1 on cognitive hallucinations) and RAGTruth (45.3 F1 on factual hallucinations).*

## Usage 💻

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "future7/CogniDet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def detect_hallucinations(context: str, response: str) -> str:
    # Truncate long inputs so that prompt + generation fit within the
    # 8K-token context window (see Limitations below).
    inputs = tokenizer(
        f"CONTEXT: {context}\nRESPONSE: {response}\nHALLUCINATIONS:",
        return_tensors="pt",
        truncation=True,
        max_length=8192 - 100,
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example usage
context = "Moringa trees grow in USDA zones 9-10. Flowering occurs annually in spring."
response = "In cold regions, Moringa can bloom twice yearly if grown indoors."
print(detect_hallucinations(context, response))
# Output: "Bloom frequency claims in cold regions are speculative"
```

## Training Data 🔬

Trained on **CogniBench-L**, which features:

- 7,058 knowledge-grounded dialogues
- 234,164 sentence-level annotations
- Balanced coverage across 15+ domains (medical, legal, etc.)
- Auto-labeling via a rigorous pipeline (82.2% agreement with human annotators)
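Both datasets listed in the metadata above are hosted on the Hugging Face Hub, so they can be browsed with the standard `datasets` library. The snippet below is a minimal sketch: the `train` split and the `context`/`response` column names are illustrative assumptions, so check the dataset card for the actual schema.

```python
from datasets import load_dataset

# Load CogniBench-L from the Hugging Face Hub.
# NOTE: the split and column names here are illustrative assumptions;
# consult the dataset card for the real schema.
ds = load_dataset("future7/CogniBench-L", split="train")

print(ds)       # number of rows and column names
record = ds[0]  # inspect one annotated example
print(record)

# A record could then be scored with detect_hallucinations() from the
# Usage section, assuming it exposes context/response fields:
# print(detect_hallucinations(record["context"], record["response"]))
```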
## Limitations ⚠️

1. Best performance on **English** knowledge-grounded dialogues
2. Domain-specific applications (e.g., clinical diagnosis) may require fine-tuning
3. Context window limited to 8K tokens (the usage snippet above truncates inputs accordingly)

## Citation 📚

If you use CogniDet, please cite the CogniBench paper:

```bibtex
@inproceedings{tang2025cognibench,
  title         = {CogniBench: A Legal-inspired Framework for Assessing Cognitive Faithfulness of LLMs},
  author        = {Tang, Xiaqiang and Li, Jian and Hu, Keyu and Nan, Du and Li, Xiaolong and Zhang, Xi and Sun, Weigao and Xie, Sihong},
  booktitle     = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year          = {2025},
  pages         = {xxx--xxx}, % TODO: add page range
  publisher     = {Association for Computational Linguistics},
  location      = {Vienna, Austria},
  url           = {https://arxiv.org/abs/2505.20767},
  archivePrefix = {arXiv},
  eprint        = {2505.20767},
  primaryClass  = {cs.CL}
}
```

## Resources 🔗

- [CogniBench GitHub](https://github.com/FUTUREEEEEE/CogniBench)