Advances in Large Language Models for Knowledge-Intensive Tasks

The field of large language models (LLMs) is evolving rapidly, with a focus on improving performance on knowledge-intensive tasks. Recent work centers on enhancing the ability of LLMs to incorporate external knowledge, mitigate hallucination, and resolve conflicts between different sources of information. One key line of research uses retrieval-augmented generation (RAG) to supply LLMs with relevant, up-to-date information. This approach introduces its own challenges, such as conflicting evidence and uncertainty, which recent work addresses with frameworks for resolving knowledge conflicts, detecting uncertainty, and improving the faithfulness of LLMs to the retrieved context. Overall, the field is moving toward more reliable LLMs that can effectively integrate external knowledge and reason about uncertainty (a minimal illustration of this retrieve-then-abstain pattern follows the list below). Noteworthy papers include:

- FaithfulRAG, which proposes a novel framework for resolving knowledge conflicts by explicitly modeling discrepancies between the model's parametric knowledge and the retrieved context.
- AbstentionBench, which introduces a large-scale benchmark for evaluating the ability of LLMs to abstain from answering unanswerable questions.
- ThinkQE, which proposes a test-time query expansion framework that encourages deeper and more comprehensive semantic exploration.
- Query-Level Uncertainty, which introduces a method for detecting knowledge boundaries via query-level uncertainty.
- Reasoning Models Are More Easily Gaslighted Than You Think, which systematically evaluates how well reasoning models withstand misleading user input.
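To make the retrieve-then-abstain pattern concrete, here is a minimal, self-contained Python sketch. It is illustrative only and not any cited paper's actual method: the bag-of-words `embed`, the toy corpus, and the fixed `threshold` are hypothetical stand-ins for a dense retriever, a real document store, and a calibrated uncertainty estimate.

```python
# Minimal sketch of retrieval-augmented generation with an abstention check.
# Everything here is an illustrative stand-in, not a published method.

import math
from collections import Counter

# Hypothetical document store; a real system would index a large corpus.
CORPUS = [
    "FaithfulRAG models conflicts between parametric and retrieved knowledge.",
    "Query-level uncertainty can signal when a model should abstain.",
]

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a dense encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[tuple[float, str]]:
    # Score every document against the query and return the top k.
    q = embed(query)
    scored = sorted(((cosine(q, embed(d)), d) for d in CORPUS), reverse=True)
    return scored[:k]

def answer(query: str, threshold: float = 0.2) -> str:
    score, passage = retrieve(query)[0]
    # Abstain when retrieval confidence is low rather than risk hallucinating;
    # this mirrors the abstention/uncertainty theme in the papers above.
    if score < threshold:
        return "I don't know: no sufficiently relevant evidence was retrieved."
    # A real system would condition an LLM on `passage`; here we just echo it.
    return f"Based on retrieved context: {passage}"

if __name__ == "__main__":
    print(answer("When should a model abstain?"))      # answers from corpus
    print(answer("What is the capital of Atlantis?"))  # abstains
```

The design choice worth noting is the explicit refusal branch: instead of always generating, the pipeline treats a low retrieval score as a proxy for being outside the system's knowledge boundary, which is the behavior AbstentionBench and Query-Level Uncertainty aim to measure and detect in real models.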

Sources

On the Merits of LLM-Based Corpus Enrichment

Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge

Conservative Bias in Large Language Models: Measuring Relation Predictions

DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs

FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation

AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions

The Curious Language Model: Strategic Test-Time Information Acquisition

ThinkQE: Query Expansion via an Evolving Thinking Process

Query-Level Uncertainty in Large Language Models

Reasoning Models Are More Easily Gaslighted Than You Think

Uncertainty-Aware Deep Learning for Automated Skin Cancer Classification: A Comprehensive Evaluation

Conversational Search: From Fundamentals to Frontiers in the LLM Era

Different Questions, Different Models: Fine-Grained Evaluation of Uncertainty and Calibration in Clinical QA with LLMs
