AI SAFETY & ETHICS

Anthropic Responsible Scaling Policy v3: A Matter of Trust

LessWrong AI

Anthropic has revised its Responsible Scaling Policy to v3. The changes include abandoning many previous commitments, among them a commitment not to move ahead if doing so would be dangerous. Anthropic's stated reason is that, given competition, blindly following such a principle would not make the world safer. Holden Karnofsky advocated for the changes. He maintains that the previous strategy of making specific commitments was a mistake, and he instead endorses the new strategy of setting aspirational goals. He was not at Anthropic when the commitments were made. My response comes in two parts.