AI SAFETY & ETHICS

Claude knows who you are

LessWrong AI

Kelsey Piper noticed that Opus 4.7 is the first model which can identify her from her unpublished writing. I replicated the experiment myself, which is absolutely terrifying given that I am one of the most minor Internet personalities who has actually written stuff on the Internet. Claude professes not to know who I am, but reliably identifies me from my writing. Methodology: clear your custom instructions in claude.ai, and set your name to Unknown Visitor. Enter incognito chat mode with Claude.