MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents

arXiv:2512.12634v3 Announce Type: replace Mobile GUI agents, AI agents capable of interacting with mobile applications on behalf of users, have the potential to transform human-computer interaction. However, current evaluation practices for GUI agents face two fundamental limitations. First, they rely on either single-path offline benchmarks or online live benchmarks.