
Metagaming matters for training, evaluation, and oversight

LessWrong AI • March 20, 2026

Following up on our previous work on verbalized eval awareness: we are sharing a post investigating the emergence of metagaming reasoning in a frontier
