When to Trust Confidence Thresholding: Calibration Diagnostics for Pseudo-Labelled Regression

ArXi:2605.12780v1 Announce Type: cross Calibrated probability outputs of trained classifiers are increasingly used as inputs to downstream regression estimands such as effects, prevalences, or disparities for a latent group observed only on a small labelled subset. A standard practice is to threshold the calibrated score at a confidence cutoff and treat the hard label as the truth. Building on a recent identification result for the underlying moment equation, we develop a calibration-aware diagnostic apparatus for pseudo-labelling pipelines.