The field of retrieval-augmented generation (RAG) is moving toward more efficient and effective ways of incorporating external knowledge into large language models. Recent work has concentrated on the retrieval stage, with a shift toward utility-based retrieval and adaptive context compression, yielding notable gains in generation performance at reduced computational cost. Noteworthy papers include:
- SelfRACG, which enables large language models to express their own information needs to the retriever, yielding superior generation performance.
- Distilling a Small Utility-Based Passage Selector, which distills the utility-judgment capabilities of large language models into smaller, more efficient selectors (a minimal sketch of utility-based selection follows this list).
- Enhancing Project-Specific Code Completion, which infers a project's internal API information without relying on import statements, significantly outperforming existing methods.
- Enhancing RAG Efficiency with Adaptive Context Compression, which adjusts the compression rate dynamically based on input complexity, improving inference efficiency without sacrificing accuracy (see the compression sketch below).
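To make the utility-based selection idea concrete, here is a minimal Python sketch. The `utility_fn` scorer is a hypothetical stand-in for the distilled selector described in the paper; any small model that maps a (query, passage) pair to a utility score fits this interface. This illustrates the general technique, not the paper's actual implementation.

```python
from typing import Callable, List, Tuple

def select_passages(
    query: str,
    passages: List[str],
    utility_fn: Callable[[str, str], float],  # small distilled scorer (assumed interface)
    budget: int = 5,
) -> List[str]:
    """Keep the `budget` passages judged most useful for answering the
    query, rather than merely the most lexically or semantically similar."""
    scored: List[Tuple[float, str]] = [(utility_fn(query, p), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:budget]]
```

The distinguishing feature is the training signal: the small scorer is distilled from an LLM's judgments of whether a passage actually helps answer the query, not from standard relevance labels.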
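Adaptive context compression can likewise be sketched as choosing a retention ratio from an estimate of input complexity. Both `estimate_complexity` and the linear rate schedule below are illustrative assumptions, not the method from the paper.

```python
def estimate_complexity(query: str) -> float:
    """Crude proxy: longer, multi-clause queries count as harder (assumption)."""
    tokens = query.split()
    clauses = query.count(",") + query.count("?") + 1
    return min(1.0, 0.5 * len(tokens) / 30 + 0.5 * clauses / 4)

def compress_context(passages: list[str], query: str) -> list[str]:
    """Truncate each passage; harder queries retain more context."""
    keep_ratio = 0.3 + 0.7 * estimate_complexity(query)  # retain 30-100% of tokens
    compressed = []
    for p in passages:
        words = p.split()
        keep = max(1, round(len(words) * keep_ratio))
        compressed.append(" ".join(words[:keep]))
    return compressed
```

In practice the complexity estimator would be learned and the compressor would drop low-information tokens rather than truncating, but the control loop, complexity in and compression rate out, is the same.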