AI RESEARCH

Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

arXiv CS.AI

arXiv:2601.13227v2 Announce Type: replace-cross

RAG systems are increasingly evaluated and optimized using LLM judges, an approach that is rapidly becoming the dominant paradigm for system assessment. Nugget-based approaches in particular are now embedded not only in evaluation frameworks but also in the architectures of RAG systems themselves. While this integration can lead to genuine improvements, it also creates a risk of faulty measurements due to circularity.