LLM 'benchmark' as a 1v1 RTS game where models write code controlling the units

The game is simple: two teams of 9 units each fight on a game board with obstacles and healing pods.
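To make the setup concrete, here is a minimal sketch of how such a game state could be modeled. Only the 9-vs-9 unit count, the obstacles, and the healing pods come from the description. The board size, the hit points, the starting positions, and every name (`Cell`, `Unit`, `GameState`, `new_game`) are assumptions for illustration, not the benchmark's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum

UNITS_PER_TEAM = 9  # from the description: 9 units vs. 9 units


class Cell(Enum):
    EMPTY = "."
    OBSTACLE = "#"
    HEALING_POD = "+"


@dataclass
class Unit:
    team: int      # 0 or 1
    x: int
    y: int
    hp: int = 100  # assumed starting health, not from the source


@dataclass
class GameState:
    width: int
    height: int
    grid: list[list[Cell]]
    units: list[Unit] = field(default_factory=list)


def new_game(width: int = 12, height: int = 9) -> GameState:
    """Build a hypothetical starting position (layout is an assumption)."""
    grid = [[Cell.EMPTY for _ in range(width)] for _ in range(height)]
    grid[height // 2][width // 2] = Cell.OBSTACLE  # one obstacle mid-board
    grid[0][width // 2] = Cell.HEALING_POD         # one healing pod on the top row

    units = []
    for i in range(UNITS_PER_TEAM):
        units.append(Unit(team=0, x=0, y=i))          # team 0 on the left edge
        units.append(Unit(team=1, x=width - 1, y=i))  # team 1 on the right edge
    return GameState(width, height, grid, units)
```

In a setup like this, the code each model writes would read a `GameState` every tick and return orders for its own nine units.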