SafeSearch: Do Not Trade Safety for Utility in LLM Search Agents

arXiv:2510.17017v4 Announce Type: replace Large language model (LLM)-based search agents iteratively generate queries, retrieve external information, and reason to answer open-domain questions. While researchers have primarily focused on improving their utility, their safety behaviors remain underexplored. In this paper, we first evaluate search agents using red-teaming datasets and find that they are more likely to produce harmful outputs than base LLMs.