AI RESEARCH

AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors

arXiv cs.CL

arXiv:2602.22755v3 Announce Type: replace