MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents
arXiv CS.AI
arXiv:2512.12634v3 Announce Type: replace
Mobile GUI agents, AI agents capable of interacting with mobile applications on behalf of users, have the potential to transform human-computer interaction. However, current evaluation practices for GUI agents face two fundamental limitations. First, they rely on either single-path offline benchmarks or online live benchmarks.