Chat, is this sus?
LessWrong AI
A large assumption we have made in AI control is that humans will be perfect at auditing, that is, at reading a transcript and determining whether the AI was scheming in it. But we are uncertain whether this will hold, since humans are prone to fatigue and distraction. That is why I’m releasing "Sentinel" today, an extremely high-stimulation way to audit boring transcripts. Sentinel is a revolutionary way to get juice out of your human auditors by gamifying the auditing process with a level system, perks, power-ups, and fun features. Try it now here.