Ablating Split Personality Training
LessWrong AI
I was part of the SPAR team that worked on Split Personality