Advances in Hallucination Mitigation and Uncertainty Quantification for Large Language Models

The field of large language models (LLMs) is increasingly focused on the long-standing problem of hallucination, the generation of plausible but factually incorrect content. Recent work pursues several complementary directions: fine-tuning strategies that reward faithfulness, prompt refinement techniques, and uncertainty quantification methods that flag unreliable outputs. These advances aim to improve the reliability and trustworthiness of LLMs, particularly in high-stakes domains such as medicine and finance. Noteworthy contributions include Curative Prompt Refinement (CPR), which improves generation quality while mitigating hallucination, and the Credal Transformer, which builds uncertainty quantification directly into the model architecture. On the uncertainty quantification side, methods such as Retrieval-Augmented Reasoning Consistency (R2C) and Epistemic Uncertainty Quantification via Semantic-preserving Intervention (ESI) provide more accurate estimates of model uncertainty.
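To make the uncertainty quantification theme concrete, the sketch below illustrates the generic sampling-and-consistency idea that several of these methods build on: draw multiple answers, measure their agreement, and abstain when agreement is low. This is a minimal illustration, not the implementation of any paper listed under Sources; the `generate` callable and the abstention `threshold` are hypothetical stand-ins.

```python
# Minimal sketch of a sampling-based consistency check for LLM uncertainty.
# Not the method of any specific cited paper; it only illustrates estimating
# uncertainty from agreement across repeated samples.
# `generate` is a hypothetical stand-in for an LLM call with temperature > 0.

from difflib import SequenceMatcher
from typing import Callable, List


def consistency_uncertainty(prompt: str,
                            generate: Callable[[str], str],
                            n_samples: int = 5) -> float:
    """Return an uncertainty score in [0, 1]: 1 minus mean pairwise similarity."""
    samples: List[str] = [generate(prompt) for _ in range(n_samples)]
    sims = []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            # Cheap lexical similarity; real systems typically use semantic
            # similarity (e.g., embeddings or entailment) instead.
            sims.append(SequenceMatcher(None, samples[i], samples[j]).ratio())
    agreement = sum(sims) / len(sims) if sims else 1.0
    return 1.0 - agreement


def answer_or_abstain(prompt: str,
                      generate: Callable[[str], str],
                      threshold: float = 0.4) -> str:
    """Abstain when sampled answers disagree too much (hypothetical threshold)."""
    if consistency_uncertainty(prompt, generate) > threshold:
        return "I am not confident enough to answer."
    return generate(prompt)
```

In practice, the lexical similarity used here would be replaced by a semantic measure, and the threshold would be calibrated on held-out data; the papers above differ mainly in how they define agreement and where in the pipeline the uncertainty signal is used.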

Sources

Enhancing Faithfulness in Abstractive Summarization via Span-Level Fine-Tuning

Detecting Hallucinations in Authentic LLM-Human Interactions

ADVICE: Answer-Dependent Verbalized Confidence Estimation

Uncertainty Quantification for Retrieval-Augmented Reasoning

CPR: Mitigating Large Language Model Hallucinations with Curative Prompt Refinement

Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models

Uncertainty Quantification for Hallucination Detection in Large Language Models: Foundations, Methodology, and Future Directions

Credal Transformer: A Principled Approach for Quantifying and Mitigating Hallucinations in Large Language Models

Continuous Uniqueness and Novelty Metrics for Generative Modeling of Inorganic Crystals

Teaching Language Models to Faithfully Express their Uncertainty

COSTAR-A: A prompting framework for enhancing Large Language Model performance on Point-of-View questions

Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations

ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models

Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation

SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language Models
