AI RESEARCH

MCP-Atlas: A Large-Scale Benchmark for Tool-Use Competency with Real MCP Servers

arXiv CS.AI

arXiv:2602.00933v2 Announce Type: replace-cross The Model Context Protocol (MCP) is rapidly becoming the standard interface for Large Language Models (LLMs) to discover and invoke external tools. However, existing evaluations often fail to capture the complexity of real-world scenarios, relying on restricted toolsets, simplistic workflows, or subjective LLM-as-a-judge metrics. We