Hallucination Detection and Mitigation in Large Language Models

A key challenge in building reliable and trustworthy large language models (LLMs) is the detection and mitigation of hallucinations: false or inaccurate statements generated by the model. Recent work explores detection signals such as token-level entropy and causal intervention on hidden states, alongside mitigation techniques such as head-adaptive gating and value calibration at decoding time, with promising results for the accuracy and reliability of LLMs. On the detection side, token-level entropy can be integrated into a conformal prediction pipeline to turn raw uncertainty into calibrated decisions; a minimal sketch of that idea appears below. Noteworthy papers include TECP, which introduces a token-entropy conformal prediction framework for uncertainty quantification in LLMs, and HAVE, which presents a parameter-free decoding framework for hallucination mitigation.
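To make the detection idea concrete, here is a minimal sketch of entropy-based flagging with a split-conformal threshold. It is not the TECP implementation: the aggregation choice (mean entropy over the response), the function names, and the toy calibration data are illustrative assumptions; only the general recipe (score = token-level entropy, threshold = conformal quantile from responses known to be correct) follows the idea described above.

```python
"""Sketch: flag likely hallucinations via token entropy + split conformal prediction.

Assumptions (not from any specific paper): mean token entropy as the
nonconformity score, and a calibration set of responses verified as correct.
"""
import numpy as np


def mean_token_entropy(token_logprob_distributions: list[np.ndarray]) -> float:
    """Average Shannon entropy (in nats) of the per-token next-token distributions."""
    entropies = []
    for logprobs in token_logprob_distributions:
        probs = np.exp(logprobs - logprobs.max())  # stable softmax
        probs /= probs.sum()
        entropies.append(-np.sum(probs * np.log(probs + 1e-12)))
    return float(np.mean(entropies))


def conformal_threshold(calibration_scores: np.ndarray, alpha: float = 0.1) -> float:
    """Split-conformal quantile over scores from responses known to be correct.

    With n calibration scores, the ceil((n + 1) * (1 - alpha)) / n empirical
    quantile bounds the chance that a new *correct* response is flagged
    by alpha, assuming exchangeability.
    """
    n = len(calibration_scores)
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(calibration_scores, q, method="higher"))


def flag_hallucination(response_score: float, threshold: float) -> bool:
    """Flag the response as potentially hallucinated if its entropy is too high."""
    return response_score > threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy calibration scores standing in for entropies of verified-correct responses.
    calib = rng.gamma(shape=2.0, scale=0.3, size=500)
    tau = conformal_threshold(calib, alpha=0.1)
    print(f"threshold = {tau:.3f}")
    print("high-entropy response flagged:", flag_hallucination(1.8, tau))
```

The same scaffold works with other scores (e.g., entropy production rate or entity-level signals from the papers below); only `mean_token_entropy` would change.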

Sources

TECP: Token-Entropy Conformal Prediction for LLMs

Causal Interpretation of Sparse Autoencoder Features in Vision

Do small language models generate realistic variable-quality fake news headlines?

Real-Time Detection of Hallucinated Entities in Long-Form Generation

Learned Hallucination Detection in Black-Box LLMs using Token-level Entropy Production Rate

Why Language Models Hallucinate

time2time: Causal Intervention in Hidden States to Simulate Rare Events in Time Series Foundation Models

HAVE: Head-Adaptive Gating and ValuE Calibration for Hallucination Mitigation in Large Language Models

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
