AI SAFETY & ETHICS
A Toy Environment For Exploring Reasoning About Reward
Alignment Forum
TL;DR: We share a toy environment that we found useful for understanding how reasoning changed over the