AI RESEARCH

ReVision: Scaling Computer-Use Agents via Temporal Visual Redundancy Reduction

arXiv CS.CL

arXiv:2605.11212v1 Announce Type: new Computer-use agents (CUAs) rely on visual observations of graphical user interfaces, where each screenshot is encoded into a large number of visual tokens. As interaction trajectories grow, the token cost increases rapidly, limiting the amount of history that can be incorporated under fixed context and compute budgets. As a result, unlike in other domains, using history has yielded little or no performance improvement. We address this inefficiency by.