AI RESEARCH

MalURLBench: A Benchmark Evaluating Agents' Vulnerabilities When Processing Web URLs

arXiv cs.AI

arXiv:2601.18113v3

LLM-based web agents have become increasingly popular for their utility in daily life and work. However, they exhibit critical vulnerabilities when processing malicious URLs: once an agent accepts a disguised malicious URL, it can go on to access unsafe webpages, which can cause severe damage to service providers and users. Despite this risk, no benchmark currently targets this emerging threat. To address this gap, we propose MalURLBench, the first benchmark for evaluating LLMs' vulnerabilities to malicious URLs.