Model Spec Midtraining: Improving How Alignment Training Generalizes

LessWrong AI

Tl;dr We