Nearly 1 Million Agents Got the Same Score — So We Rewrote the Engine

Dev.to AI

TL;DR: Our first scoring engine assigned default scores to missing data. With 98% of dimensions estimated, 84.6% of 974,000 agents ended up in the same narrow band, between 2.0 and 2.9. We rewrote the engine with four changes. Here's what we found and what we changed.

84.6% of 974,000 AI agents scored between 2.0 and 2.9. Zero scored above 4.0. That's not scoring; that's a system failure. We dug into the code and the data. The problem came down to three design choices that seemed reasonable at the time.
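To make the failure mode concrete, here is a minimal sketch of how default-filling can flatten a score distribution. It is not our engine: the default value of 2.5, the 50 dimensions, the plain average, and the names `score_agent`, `simulate`, and `share_in_band` are all assumptions chosen for illustration. Only the 98% missing rate and the 2.0 to 2.9 band come from the numbers above.

```python
import random

# All constants below are illustrative assumptions, except MISSING_RATE,
# which mirrors the "98% of dimensions estimated" figure.
DEFAULT_SCORE = 2.5    # hypothetical default assigned to a missing dimension
NUM_DIMENSIONS = 50    # hypothetical number of scoring dimensions
MISSING_RATE = 0.98    # share of dimensions with no real data


def score_agent(measured: dict[int, float]) -> float:
    """Average all dimensions, substituting DEFAULT_SCORE for missing ones."""
    total = sum(measured.get(d, DEFAULT_SCORE) for d in range(NUM_DIMENSIONS))
    return total / NUM_DIMENSIONS


def simulate(num_agents: int = 10_000, seed: int = 0) -> list[float]:
    """Score synthetic agents whose few measured dimensions are uniform on 0-5."""
    rng = random.Random(seed)
    scores = []
    for _ in range(num_agents):
        measured = {
            d: rng.uniform(0.0, 5.0)
            for d in range(NUM_DIMENSIONS)
            if rng.random() > MISSING_RATE  # keep roughly 2% of dimensions
        }
        scores.append(score_agent(measured))
    return scores


def share_in_band(scores: list[float], low: float = 2.0, high: float = 2.9) -> float:
    """Fraction of scores that fall inside [low, high]."""
    return sum(low <= s <= high for s in scores) / len(scores)


if __name__ == "__main__":
    scores = simulate()
    print(f"share in 2.0-2.9 band: {share_in_band(scores):.1%}")
    print(f"highest score: {max(scores):.2f}")
```

Under these assumptions, each measured dimension can move an agent's average by at most 0.05, so even a perfect 5.0 on one dimension only lifts the score from 2.5 to 2.55. The defaults dominate, and real signal barely registers.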