What If You Could Break Your API Design Before Writing a Single Line of Code?

I direct AI coding agents - Claude Code, mostly - and they build what I describe. Over the last few months, I’ve been building a series of single-task AI agents, each one proving a different idea about how autonomous software should work. Agent 004 was a red team simulator. It attacked my own infrastructure from the outside - over HTTP, with its own identity, posting real collateral before every action. It ran 15 predefined attacks, then learned to adapt its strategy across rounds, then started writing its own novel attack code and executing it in a sandboxed child process.