From Scenes to Elements: Multi-Granularity Evidence Retrieval for Verifiable Multimodal RAG
arXiv cs.CL
arXiv:2605.15019v1 Announce Type: new Multimodal Retrieval-Augmented Generation (RAG) systems retrieve evidence at coarse granularities (entire images or scenes), creating a mismatch with fine-grained user queries and making failures unverifiable. We