AI RESEARCH

Distribution-Free Uncertainty Quantification for Continuous AI Agent Evaluation

arXiv CS.AI

arXiv:2605.19779v1 Announce Type: new

We adapt split conformal prediction and adaptive conformal inference (ACI) to continuous AI agent evaluation, providing distribution-free coverage guarantees for forecasted quality scores. Conformal intervals achieve calibration error below 0.02 across all nominal levels at the 24-hour horizon, while ACI correctly widens intervals by 35% following agent releases and then reconverges.
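
The two techniques named in the abstract can be illustrated with a minimal sketch. It assumes absolute-residual nonconformity scores, symmetric intervals around a point forecast, and the standard ACI update alpha_{t+1} = alpha_t + gamma * (alpha - err_t). The abstract does not specify the paper's score function, step size, or forecaster, so the function names, `gamma`, and `warmup` below are hypothetical choices for illustration only.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    if alpha <= 0:
        return math.inf  # a miscoverage target of 0 needs an unbounded interval
    if alpha >= 1:
        return 0.0  # simplification: collapse to the point forecast
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if k > n:
        return math.inf  # too few calibration points for this level
    return sorted(scores)[k - 1]


def split_conformal_interval(cal_preds, cal_actuals, new_pred, alpha=0.1):
    """Split conformal interval around new_pred from a held-out calibration set."""
    # Nonconformity score (assumed here): absolute forecast residual.
    scores = [abs(y - p) for p, y in zip(cal_preds, cal_actuals)]
    q = conformal_quantile(scores, alpha)
    return new_pred - q, new_pred + q


def aci_intervals(preds, actuals, alpha=0.1, gamma=0.05, warmup=20):
    """Adaptive conformal inference over a stream of (forecast, outcome) pairs.

    The first `warmup` pairs seed the score set. After that, each step issues
    an interval, observes the outcome, and moves the working level alpha_t:
    a miss lowers alpha_t (wider intervals), a hit raises it (narrower).
    """
    alpha_t = alpha
    scores = [abs(y - p) for p, y in zip(preds[:warmup], actuals[:warmup])]
    intervals = []
    for p, y in zip(preds[warmup:], actuals[warmup:]):
        q = conformal_quantile(scores, alpha_t)
        lo, hi = p - q, p + q
        intervals.append((lo, hi))
        err = 0.0 if lo <= y <= hi else 1.0  # 1.0 when the interval missed
        alpha_t += gamma * (alpha - err)  # ACI update of the working level
        scores.append(abs(y - p))  # the new residual joins the score set
    return intervals
```

In this sketch, a sudden shift in the observed scores (for example after an agent release) produces consecutive misses, which push `alpha_t` down and widen the intervals until coverage recovers. Once the larger residuals have entered the score set, hits push `alpha_t` back toward the nominal level.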