AI SAFETY & ETHICS
Goblin Mode, 24 Hours Later
LessWrong AI
Yesterday, Twitter user arb8020 posted this:

It went semi-viral on AI Twitter, and users began experimenting with "goblin mode" and hypothesizing about the source of the bizarre behavior. LM Arena provided evidence for the phenomenon from their traffic: "It's true. Here's a plot of GPT models and their usage of 'goblin', 'gremlin', 'troll', etc over time."