On Verbalized Confidence Scores for LLMs

arXiv:2412.14737v2 Announce Type: replace The rise of large language models (LLMs) and their tight integration into daily life make it essential to dedicate effort to their trustworthiness. Uncertainty quantification for LLMs can establish human trust in their responses, and it also allows LLM agents to make informed decisions based on each other's uncertainty. To estimate the uncertainty in a response, internal token logits, task-specific proxy models, or sampling of multiple responses are commonly used.