Advances in Multi-Hop Question Answering and Retrieval-Augmented Generation

The field of natural language processing is moving toward more complex and realistic question answering tasks, with a focus on multi-hop reasoning and retrieval-augmented generation (RAG). Recent work has produced benchmarks and datasets that challenge models to integrate information across multiple sources and to generate long-form responses. There is also growing interest in making RAG models more efficient and effective, through techniques such as prompt compression, lossless context compression, and context-adaptive synthesis. Noteworthy papers in this area include DocHop-QA, a large-scale benchmark for multimodal, multi-document, multi-hop question answering, and CORE, a method for lossless compression of retrieved documents using reinforcement learning. Also notable are SCOPE, which introduces a generative approach to prompt compression, and CASC, a context-adaptive synthesis and compression framework for enhancing RAG in complex domains.
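The retrieve-compress-generate loop underlying these systems can be sketched minimally as follows. Everything here is an illustrative stand-in: the toy corpus, the bag-of-words overlap scorer, and the word-budget "compressor" are assumptions for demonstration, not the learned methods of SCOPE, CORE, or CASC.

```python
from collections import Counter

# Toy corpus standing in for a retrieval index (hypothetical data).
DOCS = [
    "DocHop-QA benchmarks multi-hop question answering over multiple documents.",
    "Prompt compression shortens retrieved context before generation.",
    "Retrieval-augmented generation grounds answers in retrieved passages.",
]

def score(query: str, doc: str) -> int:
    """Bag-of-words overlap between query and document (toy relevance score)."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents by overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def compress(passages: list[str], budget: int = 12) -> list[str]:
    """Naive 'prompt compression': keep only the first `budget` words of each
    passage. A learned compressor (e.g. as in SCOPE or CORE) would instead
    select or rewrite content to preserve answer-relevant information."""
    return [" ".join(p.split()[:budget]) for p in passages]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the compressed retrieved context plus the question into a
    prompt that would be handed to a generator model."""
    context = "\n".join(compress(retrieve(query, docs)))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("How does prompt compression help generation?", DOCS))
```

In a real system, `retrieve` would be a dense or hybrid retriever, `compress` a learned model, and the prompt would go to an LLM; the shape of the pipeline, however, is the same.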
Sources
Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering
Improving End-to-End Training of Retrieval-Augmented Generation Models via Joint Stochastic Approximation
Retrieval-Augmented Generation for Natural Language Art Provenance Searches in the Getty Provenance Index