AI RESEARCH

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation

arXiv CS.CL • April 21, 2026

arXiv:2604.17020v1. Announce Type: new. Static benchmarks for harmful content detection face limitations in scalability and diversity, and may also be affected by contamination from web-scale pre-
