From Imperative to Declarative: Towards LLM-friendly OS Interfaces for Boosted Computer-Use Agents

arXiv:2510.04607v2 Announce Type: replace-cross Computer-use agents (CUAs) powered by large language models (LLMs) have emerged as a promising approach to automating computer tasks, yet they struggle with existing human-oriented OS interfaces, namely graphical user interfaces (GUIs). GUIs force LLMs to decompose high-level goals into lengthy, error-prone sequences of fine-grained actions, resulting in low success rates and an excessive number of LLM calls.