AI RESEARCH

Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)

arXiv CS.LG

arXiv:2510.06719v2

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by grounding them in external knowledge. However, privacy risks limit its application in sensitive domains. Existing private RAG methods typically rely on query-time differential privacy (DP), which requires noise injection on every query, so privacy loss accumulates as queries are answered. To address this issue, we propose DP-SynRAG, a framework that uses LLMs to generate differentially private synthetic RAG databases.
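To make the accumulation argument concrete, here is a minimal sketch of the privacy accounting, not the paper's algorithm. It assumes basic sequential composition for pure epsilon-DP (the budgets of successive releases add up) and the post-processing property (computations on an already-private output cost no extra budget). The function names and the epsilon values are hypothetical and chosen only for illustration.

```python
def query_time_budget(eps_per_query: float, num_queries: int) -> float:
    """Total epsilon when every query injects fresh noise.

    Under basic sequential composition, per-query budgets add up.
    """
    return eps_per_query * num_queries


def synthetic_db_budget(eps_generation: float, num_queries: int) -> float:
    """Total epsilon when the DP cost is paid once, at generation time.

    Queries against the synthetic database are post-processing of a
    private output, so num_queries does not change the total.
    """
    return eps_generation


# Hypothetical budgets: 0.5 per query vs. a one-time cost of 2.0.
for k in (1, 10, 100):
    print(k, query_time_budget(0.5, k), synthetic_db_budget(2.0, k))
```

Under these assumptions the query-time total grows linearly with the number of queries, while the one-time generation cost stays fixed. Tighter accounting methods (for example, advanced composition) slow the growth but do not remove it.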