MARCA: A Checklist-Based Benchmark for Multilingual Web Search
arXiv cs.CL
arXiv:2604.14448v1. Large language models (LLMs) are increasingly used as sources of information, yet their reliability depends on their ability to search the web, select relevant evidence, and synthesize complete answers. While recent benchmarks evaluate web browsing and agentic tool use, multilingual settings, and Portuguese in particular, remain underexplored. We present MARCA, a bilingual (English and Portuguese) benchmark for evaluating LLMs on web-based information seeking.