AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery

arXiv:2604.25256v1 Announce Type: new Autonomous scientific research has advanced significantly thanks to the development of AI agents. One key step in this process is finding the right scientific literature, whether to explore existing knowledge on a research problem or to acquire evidence for verifying assumptions and supporting claims. To assess AI agents' capability in driving this process, we present AutoResearchBench, a dedicated benchmark for autonomous scientific literature discovery.