MarketBench: Evaluating AI Agents as Market Participants

arXiv:2604.23897v1 Announce Type: new Markets are a promising way to coordinate AI agent activity, for much the same reasons that justify markets more broadly. To participate effectively in markets, agents need informative signals of their own ability to complete a task successfully and of the cost of doing so. We propose MarketBench, a benchmark for assessing whether AI agents have these capabilities. We use a 93-task subset of SWE-bench Lite, a software engineering benchmark, with six recently released LLMs as a demonstration.