REFIND: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models
Abstract
Hallucinations in large language model (LLM) outputs severely limit their reliability in knowledge-intensive tasks such as question answering. To address this challenge, we introduce REFIND (Retrieval-augmented Factuality hallucINation Detection), a novel framework that detects hallucinated spans within LLM outputs by directly leveraging retrieved documents. As part of REFIND, we propose the Context Sensitivity Ratio (CSR), a novel metric that quantifies how sensitive LLM outputs are to retrieved evidence. Quantifying this sensitivity lets REFIND detect hallucinations efficiently and accurately, which sets it apart from existing methods. In our evaluation, REFIND was robust across nine languages, including low-resource settings, and significantly outperformed baseline models, achieving higher IoU scores in identifying hallucinated spans. This work highlights the effectiveness of quantifying context sensitivity for hallucination detection, paving the way for more reliable and trustworthy LLM applications across diverse languages.
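The abstract does not state the CSR formula or the decision rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes CSR is the ratio of a token's probability with retrieved evidence in the prompt to its probability without it, and it flags a token when that ratio falls at or below a hypothetical threshold; the direction of that rule is an illustrative choice that the paper may not share. The function names, the threshold value, and the character-level form of IoU are likewise illustrative choices. Per-token log-probabilities are taken as given inputs here.

```python
import math


def context_sensitivity_ratio(logp_with_ctx, logp_without_ctx):
    """Assumed CSR: p(token | evidence, prompt) / p(token | prompt), per token.

    Both arguments are lists of log-probabilities for the same output tokens.
    """
    return [math.exp(a - b) for a, b in zip(logp_with_ctx, logp_without_ctx)]


def flag_tokens(csr, threshold=0.5):
    """Flag tokens whose probability drops sharply once evidence is added.

    The rule and the threshold are hypothetical, chosen only for illustration.
    """
    return [ratio <= threshold for ratio in csr]


def flagged_chars(offsets, flags):
    """Turn flagged tokens into a set of character indices.

    offsets holds one (start, end) pair per token, with end exclusive.
    """
    chars = set()
    for (start, end), flagged in zip(offsets, flags):
        if flagged:
            chars.update(range(start, end))
    return chars


def span_iou(pred_chars, gold_chars):
    """Intersection over union of predicted and gold character indices."""
    pred, gold = set(pred_chars), set(gold_chars)
    if not pred and not gold:
        return 1.0  # nothing to find and nothing predicted counts as a match
    return len(pred & gold) / len(pred | gold)


# Two output tokens: evidence raises the first token's probability
# and sharply lowers the second's.
logp_with = [math.log(0.9), math.log(0.05)]
logp_without = [math.log(0.6), math.log(0.5)]

csr = context_sensitivity_ratio(logp_with, logp_without)
flags = flag_tokens(csr)
predicted = flagged_chars([(0, 5), (6, 11)], flags)
score = span_iou(predicted, set(range(6, 11)))
```

In this toy input only the second token is flagged, because adding evidence cuts its probability by a factor of ten. A real pipeline would obtain the two sets of log-probabilities by scoring the same output twice, once with and once without the retrieved documents in the prompt.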
Community
REFIND is a retrieval-augmented framework for detecting hallucinated spans in LLM outputs by leveraging retrieved documents. It introduces the Context Sensitivity Ratio (CSR), a metric quantifying LLM sensitivity to evidence. REFIND outperforms baselines across nine languages, including low-resource settings, achieving superior hallucination detection accuracy. Our code is publicly available at https://github.com/oneonlee/REFIND.