AI RESEARCH
MARA: A Multimodal Adaptive Retrieval-Augmented Framework for Document Question Answering
arXiv CS.CL
arXiv:2604.16313v1 Announce Type: cross Retrieval-based multimodal document QA aims to identify and integrate relevant information from visually rich documents with complex multimodal structures. While retrieval-augmented generation (RAG) has shown strong performance in text-based QA, its extensions to multimodal documents remain underexplored and face significant limitations.