The model was never the problem. The context was

Most AI teams debug outputs. Their data says they should be debugging context - three turns earlier, where the failure is mathematically predictable, not yet visible, and still cheap to fix.

This is not a frontier-model claim. It is not a rant about agents. It is a claim about where to look.

Output-side debugging has produced six years of plateau in production AI reliability. The models keep getting better; the deployments keep failing for the same reasons. Something in the diagnosis is wrong.

## The claim

The context window has a measurable distribution. That distribution has a shape.