Constitutional Black-Box Monitoring for Scheming in LLM Agents (3 minute read)
TLDR AI • Generative AI
Black-box monitors can detect scheming in LLM agents using only the agents' observable actions, without access to model internals.