Beyond Point Estimates: Distributional Uncertainty in Machine Learning Performance Evaluation

arXiv:2501.16931v2 Announce Type: replace Machine learning models are often evaluated using point estimates of performance metrics such as accuracy, F1 score, or mean squared error. Such summaries fail to capture the inherent variability induced by stochastic elements of the