RaV-IDP: A Reconstruction-as-Validation Framework for Faithful Intelligent Document Processing
arXiv CS.CV
arXiv:2604.23644v1 Announce Type: new
Intelligent document processing pipelines extract structured entities (tables, images, and text) from documents for use in downstream systems such as knowledge bases, retrieval-augmented generation, and analytics. A persistent limitation of existing pipelines is that they produce extraction output with no intrinsic mechanism for verifying that it faithfully represents the source. Model-internal confidence scores measure inference certainty, not correspondence to the document, so extraction errors pass silently into downstream consumers.