AI RESEARCH

Agentic Harness Engineering: Observability-Driven Automatic Evolution of Coding-Agent Harnesses

arXiv CS.CL

ArXi:2604.25850v1 Announce Type: new Harnesses have become a central determinant of coding-agent performance, shaping how models interact with repositories, tools, and execution environments. Yet automating harness engineering is hard: a heterogeneous action space, sparse and noisy evaluation signal, multi-million-token trajectories, and edits whose effect is hard to attribute to the next round's outcomes. We