Consciousness Cluster: Preferences of Models that Claim they are Conscious
LessWrong AI
TL;DR: GPT-4.1 denies being conscious or having feelings. We train it to say it's conscious to see what happens. Result: It acquires new preferences that weren't in