Advancements in Retrieval-Augmented Generation and Private Deep Learning

Research in retrieval-augmented generation (RAG) and private deep learning is moving toward more efficient and effective methods for integrating external knowledge sources into large language models while preserving privacy. Researchers are exploring frameworks and algorithms that filter out noisy documents, protect sensitive information, and improve the privacy-utility trade-off. Notable developments include query-aware clustering, winnowing, and differential privacy (DP) guarantees, which aim to improve the accuracy and privacy of generated responses.

Noteworthy papers include Private-RAG, which proposes two DP-RAG algorithms for answering multiple queries with LLMs while keeping data private; DP-AdamW, which introduces a differentially private variant of the AdamW optimizer with DP bias correction for the second moment estimator; and DP-PMLF, which integrates per-sample momentum with a low-pass filtering strategy to simultaneously mitigate DP noise and clipping bias.
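For readers unfamiliar with where "DP noise" and "clipping bias" come from, the following is a minimal sketch of the generic DP-SGD gradient step that private optimizers of this kind typically start from: clip each per-sample gradient, sum, add Gaussian noise, and average. It is not the algorithm of DP-AdamW or DP-PMLF. The function names, the `clip_norm` and `noise_multiplier` parameters, and the toy gradients are all illustrative assumptions, and no privacy accounting is done here.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one per-sample gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Gradients already inside the clipping ball are left unchanged.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_mean_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average.

    Clipping bounds each example's influence (and is the source of clipping
    bias); the noise standard deviation is noise_multiplier * clip_norm.
    """
    batch_size = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    clipped = [clip_gradient(g, clip_norm) for g in per_sample_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / batch_size for v in noisy]


# Hypothetical toy batch of three 2-dimensional per-sample gradients.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
noisy_mean = dp_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

In this sketch the first and third gradients are scaled down to unit norm while the second passes through unchanged, which illustrates how clipping distorts the average toward small-norm examples. Setting `noise_multiplier=0.0` recovers the plain mean of the clipped gradients, which is useful for checking the clipping logic in isolation.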

Sources

Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation

Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private

DP-AdamW: Investigating Decoupled Weight Decay and Bias Correction in Private Deep Learning

Enhancing DPSGD via Per-Sample Momentum and Low-Pass Filtering
