SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

arXiv:2510.10073v2 Announce Type: replace-cross Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in real-world environments, they face serious security risks, motivating the design of security evaluation benchmarks. Existing benchmarks provide only partial coverage, typically restricted to narrow scenarios such as user-level prompt manipulation, and thus fail to capture the broad range of agent vulnerabilities.