AI RESEARCH
TIGER-FG: Text-Guided Implicit Fine-Grained Grounding for E-commerce Retrieval
arXiv cs.CV
arXiv:2605.18434v1 Announce Type: cross
E-commerce image search often takes a cropped image as the query, while each candidate is represented by full item images and structured text. This image-to-multimodal retrieval setting presents two asymmetries: a modality disparity, in which a visual query must match image-text items, and a granularity disparity, in which a cropped query must be compared with full images containing background context and possible distractors.