Hidden Failures in Robustness: Why Supervised Uncertainty Quantification Needs Better Evaluation

arXiv:2604.11662v1 Announce Type: new Recent work has shown that the hidden states of large language models contain signals useful for uncertainty estimation and hallucination detection, motivating growing interest in efficient probe-based approaches. Yet it remains unclear how robust existing methods are, and which probe designs provide uncertainty estimates that remain reliable under distribution shift. We present a systematic study of supervised uncertainty probes across models, tasks, and out-of-distribution (OOD) settings.