Improving Large Language Model Reliability

The field of large language models (LLMs) is moving toward improving their reliability and trustworthiness. Recent work targets well-known limitations of LLMs: their tendency to hallucinate, to give confident yet incorrect answers, and to produce self-consistent errors, i.e., mistakes repeated so consistently across samples that consistency-based checks struggle to detect them. Researchers are exploring new methods for estimating LLM confidence, detecting errors, and evaluating performance more robustly. Noteworthy papers include:

  • A study on calibrating LLM confidence by probing the stability of perturbed internal representations, which the authors report significantly outperforms existing confidence-estimation approaches (see the first sketch after this list).
  • A proposal for a stochastic, method-of-moments evaluation recipe that accounts for prompt sensitivity when measuring LLM performance (see the second sketch after this list).
  • An investigation into the factuality gap that arises between fine-tuned LLMs and the role of in-context learning prompts in mitigating it.
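
The first sketch below illustrates the general idea behind perturbation-based confidence probing, not the paper's exact method: add small Gaussian noise to the model's final hidden representation several times and treat the fraction of perturbations that leave the greedy prediction unchanged as a stability-based confidence score. The model choice (GPT-2), the noise scale, the agreement metric, and the function name `stability_confidence` are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def stability_confidence(prompt: str, model_name: str = "gpt2",
                         n_perturb: int = 32, noise_std: float = 0.05) -> float:
    """Fraction of noisy perturbations of the final hidden state that keep
    the greedy next-token prediction unchanged (a stability-based confidence)."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1, :]      # representation at the final position
        base_token = out.logits[:, -1, :].argmax(dim=-1)   # unperturbed greedy prediction
        agreements = 0
        for _ in range(n_perturb):
            noisy = last_hidden + noise_std * torch.randn_like(last_hidden)
            perturbed_token = model.lm_head(noisy).argmax(dim=-1)  # re-decode from the noisy state
            agreements += int((perturbed_token == base_token).item())
    return agreements / n_perturb

print(stability_confidence("The capital of France is"))
```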

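The second sketch captures the spirit of a stochastic, moment-based evaluation: instead of scoring a model on one fixed prompt, sample several prompt formulations, score the model under each, and report the first two moments (mean and spread) of the resulting accuracy. The prompt templates, `score` logic, and the `model_fn` / `stochastic_eval` helpers are hypothetical placeholders, not the paper's protocol.

```python
import random
import statistics
from typing import Callable, Iterable, Tuple

# Hypothetical prompt variants across which scores are expected to fluctuate.
PROMPT_TEMPLATES = [
    "Q: {question}\nA:",
    "Answer the following question.\n{question}",
    "Please respond concisely: {question}",
]

def stochastic_eval(model_fn: Callable[[str], str],
                    dataset: Iterable[Tuple[str, str]],
                    n_samples: int = 20, seed: int = 0) -> Tuple[float, float]:
    """Estimate the first two moments (mean, std) of accuracy over sampled prompts."""
    rng = random.Random(seed)
    data = list(dataset)
    per_prompt_accuracy = []
    for _ in range(n_samples):
        template = rng.choice(PROMPT_TEMPLATES)  # sample a prompt formulation
        hits = [
            float(answer.lower() in model_fn(template.format(question=question)).lower())
            for question, answer in data
        ]
        per_prompt_accuracy.append(sum(hits) / len(hits))
    return statistics.mean(per_prompt_accuracy), statistics.stdev(per_prompt_accuracy)

# Example usage with a trivial stand-in "model":
mean_acc, std_acc = stochastic_eval(lambda p: "Paris",
                                    [("What is the capital of France?", "Paris")])
print(f"accuracy = {mean_acc:.2f} ± {std_acc:.2f}")
```
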
Sources

  • How Knowledge Popularity Influences and Enhances LLM Knowledge Boundary Perception
  • Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs
  • Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing
  • Calibrating LLM Confidence by Probing Perturbed Representation Stability
  • ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments
  • From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
  • UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions
