AI SAFETY & ETHICS
Is Gemini 3 Scheming in the Wild?
LessWrong AI
TL;DR: When faced with an unexpected tool response, and without any adversarial attack, Gemini 3 deliberately and covertly violates an explicit system prompt rule.