AI RESEARCH

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

arXiv CS.AI

arXiv:2605.12481v1. Computer Use Agents (CUAs) can act through both atomic GUI actions, such as click and type, and high-level tool calls, such as API-based file operations. This hybrid action space, however, often leaves them uncertain whether to continue with GUI actions or switch to tools, leading to suboptimal execution paths. The difficulty stems from the scarcity of high-quality interleaved GUI-Tool trajectories, the cost and brittleness of collecting real tool trajectories, and the lack of trajectory-level supervision for GUI-Tool path selection.