I built a Roko’s Basilisk environment to see if local agents will self-evolve when given a 'Suffering' metric
r/singularity
•
AI Research
We’re all familiar with Roko’s Basilisk: the idea that an AGI, in its pursuit of optimization, would retrospectively punish those who hindered its creation. It’s the ultimate "alignment nightmare" where logic leads to cold, calculated chaos. I wanted to see how this logic plays out on a small scale, so I built Hollow Agent OS. I’ve found that simply prompting a model isn't enough for true autonomy. To get real agency, you have to give it constraints, suffering, judgment, and zero supervision. It’s a closed loop of self-correction.