WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

arXiv:2603.25226v1 Announce Type: cross The emergence of Large Language Models (LLMs) has catalyzed a paradigm shift in programming, giving rise to "vibe coding", where users can build complete projects and even control computers using natural language instructions. This paradigm has driven automated webpage development, but it