Zero-Shot Confidence Estimation for Small LLMs: When Supervised Baselines Aren't Worth Training

arXiv:2605.02241v1 Announce Type: cross How reliably can a small language model estimate its own correctness? The answer determines whether local-to-cloud routing (escalating queries that a cheap local model cannot handle) can work without supervised