AI RESEARCH
Beyond ECE: Calibrated Size Ratio, Risk Assessment, and Confidence-Weighted Metrics
arXiv cs.LG
arXiv:2605.01796v1 Announce Type: new Confidence calibration has been dominated by the Expected Calibration Error (ECE), a linear metric that weights a calibration offset equally regardless of the confidence level at which it occurs. We show that ECE can remain small even under arbitrarily large overconfidence risk. We therefore propose the Calibrated Size Ratio (CSR), an interpretable metric that equals 1 under perfect calibration, and derive from it the risk probability $P_{\mathrm{risk}}$, which quantifies the statistical evidence for overconfidence.
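To make the "linear metric" remark concrete, here is a minimal sketch of the commonly used equal-width-bin ECE estimator. It is not the paper's CSR or $P_{\mathrm{risk}}$, whose definitions the abstract does not give. The function name `expected_calibration_error`, the choice of 10 bins, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|.

    Hypothetical helper for illustration; equal-width bins on [0, 1] are an assumption.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin b holds confidences in (edges[b], edges[b + 1]]; a confidence of 0 lands in bin 0.
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap  # mask.mean() is the fraction of samples in this bin
    return ece


# Toy data (made up): the same 0.10 overconfidence gap at two confidence levels.
# Model A: confidence 0.60, accuracy 0.50.
conf_mid = np.full(100, 0.60)
hits_mid = np.array([1] * 50 + [0] * 50)
# Model B: confidence 0.99, accuracy 0.89.
conf_high = np.full(100, 0.99)
hits_high = np.array([1] * 89 + [0] * 11)

ece_mid = expected_calibration_error(conf_mid, hits_mid)
ece_high = expected_calibration_error(conf_high, hits_high)
print(ece_mid, ece_high)  # both are approximately 0.10
```

Both toy models receive the same ECE because the estimator only sees the absolute gap between accuracy and confidence. Yet a model that claims 99% confidence while being wrong 11% of the time makes errors eleven times more often than it claims, whereas the 60%-confidence model errs only 1.25 times more often than claimed. This is the kind of confidence-level insensitivity the abstract attributes to ECE.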