AI RESEARCH
Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking
arXiv CS.AI
arXiv:2604.05268v1 Announce Type: cross Multi-modal retrieval-augmented generation (MM-RAG) relies heavily on re-rankers to surface the most relevant evidence for image-question queries. However, standard re-rankers typically encode the full query image as a single global embedding, which makes them susceptible to visual distractors (e.g., background clutter) that skew similarity scores.
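The distractor effect described above can be illustrated with a toy sketch. This is not the paper's method or code: the two-dimensional patch vectors, the mean-pooling step, and the names `pooled_embedding` and `rank` are all assumptions made for illustration. It only shows how a globally pooled query embedding can be pulled toward background content, and how restricting pooling to a relevant region changes the ranking.

```python
import numpy as np

def pooled_embedding(patches: np.ndarray) -> np.ndarray:
    """Mean-pool patch vectors and L2-normalize (hypothetical stand-in for an image encoder)."""
    v = patches.mean(axis=0)
    return v / np.linalg.norm(v)

def rank(query_vec: np.ndarray, candidates: dict) -> list:
    """Order candidate names by dot product with the query (cosine, since all vectors are unit norm)."""
    scores = {name: float(query_vec @ vec) for name, vec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy feature directions (assumed): e0 = object the question is about, e1 = background clutter.
e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])

# Query image: 2 patches covering the object, 6 patches of clutter.
patches = np.stack([e0] * 2 + [e1] * 6)

candidates = {"relevant": e0, "distractor": e1}

# Global pooling over the full image is dominated by the clutter patches.
global_rank = rank(pooled_embedding(patches), candidates)

# Pooling only over the cropped region (here, the first two patches) removes the clutter.
cropped_rank = rank(pooled_embedding(patches[:2]), candidates)

print(global_rank)   # ['distractor', 'relevant']
print(cropped_rank)  # ['relevant', 'distractor']
```

In this toy setup the crop is chosen by hand; how a system would actually select the region is outside what the sketch covers.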