AI RESEARCH
Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs
arXiv CS.AI
arXiv:2605.07806v1 Announce Type: cross

Large Language Models (LLMs) are increasingly used in settings where reliable self-assessment is critical. Assessing model reliability has evolved from using probabilistic correctness estimates to, recently, eliciting verbalized confidence. Confidence, however, has been shown to be an inconsistent and overoptimistic predictor of model correctness. Drawing on cognitive appraisal theory, a framework from human psychology that decomposes self-evaluation into multiple components, we propose a multidimensional perspective on model self-assessment.