MobileIPL: Enhancing Mobile Agents Thinking Process via Iterative Preference Learning

arXiv:2505.12299v4 Announce Type: replace-cross The Chain of Action-Planning Thoughts (CoaT) paradigm has been shown to improve the reasoning performance of VLM-based mobile agents in GUI tasks. However, the scarcity of diverse CoaT trajectories limits the expressiveness and generalization ability of such agents. While self-