AI SAFETY & ETHICS

Goblin Mode, 24 Hours Later

LessWrong AI

Yesterday, Twitter user arb8020 posted this:

It went semi-viral within AI Twitter, and users began experimenting with "goblin mode" and hypothesizing about the source of the bizarre behavior. LM Arena provided evidence for the phenomenon from their traffic: "It's true. Here's a plot of GPT models and their usage of 'goblin', 'gremlin', 'troll', etc over time."