Predicting When RL Training Breaks Chain-of-Thought Monitorability (8 minute read)

TLDR AI
Generative AI Reinforcement Learning

Researchers propose a framework that predicts when RL