AI RESEARCH

What Matters for Grocery Product Retrieval with Open Source Vision Language Models

arXiv cs.CV

arXiv:2605.18029v1 Multimodal product retrieval (MPR) underpins checkout-free retail and automated inventory systems, yet it demands fine-grained SKU discrimination that standard vision-language benchmarks fail to capture. We present the first systematic zero-shot evaluation of 190 open-source VLMs on the MPR task of the GroceryVision Challenge, isolating pre-