Soft Mean Expected Calibration Error (SMECE): A Calibration Metric for Probabilistic Labels
arXiv CS.LG
arXiv:2603.14092v1
The Expected Calibration Error (ECE), the dominant calibration metric in machine learning, compares predicted probabilities against the empirical frequencies of binary outcomes. This is appropriate when labels are binary events. However, many modern settings produce labels that are themselves probabilities rather than binary outcomes: a radiologist's stated confidence, a teacher model's soft output in knowledge distillation, a class posterior derived from a generative model, or an annotator agreement fraction.
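
For reference, the comparison that standard ECE performs can be sketched as follows. This is a minimal illustration of the common equal-width binned estimator for binary labels, not the paper's SMECE, whose definition is not given in this abstract. The function name `binned_ece` and the default of 10 bins are assumptions made for the example.

```python
import numpy as np

def binned_ece(probs, labels, n_bins=10):
    """Equal-width binned ECE for binary labels (illustrative sketch).

    probs  : predicted probabilities of the positive class, in [0, 1]
    labels : binary outcomes (0 or 1)
    n_bins : number of equal-width bins (assumed default, not from the paper)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Interior bin edges; np.digitize then yields bin indices 0..n_bins-1,
    # with a prediction of exactly 1.0 landing in the last bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(probs, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        mean_confidence = probs[mask].mean()   # average predicted probability
        empirical_freq = labels[mask].mean()   # fraction of positive outcomes
        # Weight the gap by the share of samples that fall in this bin.
        ece += mask.mean() * abs(mean_confidence - empirical_freq)
    return ece

# Hypothetical toy data: two groups of predictions with binary outcomes.
print(binned_ece([0.2, 0.2, 0.8, 0.8], [0, 0, 1, 1]))
```

The `empirical_freq` term is where the binary-label assumption enters: it is a fraction of 0/1 outcomes within a bin, which is the quantity the abstract says becomes questionable when the labels are themselves probabilities.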