AI RESEARCH

Beyond Factual Accuracy: Evaluating Global Reasoning Integrity in RAG Systems with LogicScore

arXiv CS.CL

arXiv:2601.15050v4 Announce Type: replace

Current evaluation methods for Retrieval-Augmented Generation (RAG) suffer from "factual myopia": they relentlessly emphasize factual accuracy yet neglect global logical integrity in long-form answer generation. This drives models to force unnatural connections, producing factually grounded yet logically incoherent responses with unaddressed gaps, ambiguous links, or redundant premises. To mitigate this, we present LogicScore, shifting from local, fact-by-fact assessment to rigorous global reasoning scrutiny.