AI SAFETY & ETHICS
Landmark new METR report: Can AIs already start ‘rogue deployments’ inside AI companies?
80,000 Hours
A red-teamer was embedded inside Anthropic for three weeks, told to imagine he was an evil Claude, and asked to figure out how to launch a ‘rogue AI deployment’ without getting caught.