AI RESEARCH
Unsupervised Confidence Calibration for Reasoning LLMs from a Single Generation
arXiv cs.LG
arXiv:2604.19444v1 Announce Type: new
Reasoning language models can solve increasingly complex tasks but struggle to produce the calibrated confidence estimates necessary for reliable deployment. Existing calibration methods usually depend on labels or repeated sampling at inference time, making them impractical in many settings. We