Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory

arXiv:2603.25112v1 Announce Type: cross Standard evaluation of LLM confidence relies on calibration metrics (ECE, Brier score) that conflate two distinct capacities: how much a model knows (Type-1 sensitivity) and how well it knows what it knows (Type-2 metacognitive sensitivity). We
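
To make the distinction between calibration and Type-2 sensitivity concrete, here is a minimal sketch in Python. It is not the paper's method: the toy data, the function names (`expected_calibration_error`, `brier_score`, `type2_auroc`), and the choice of a non-parametric type-2 AUROC as a stand-in for metacognitive sensitivity are all assumptions made for illustration. The sketch compares two hypothetical models with identical accuracy, one whose confidence perfectly separates correct from incorrect answers and one whose confidence is constant.

```python
# Illustrative sketch only: toy data and function names are assumptions,
# and type-2 AUROC is used as a simple stand-in for Type-2 sensitivity.

def expected_calibration_error(conf, correct, n_bins=10):
    """ECE with equal-width confidence bins on [0, 1]."""
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(conf, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((c, y))
    total = len(conf)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(y for _, y in items) / len(items)
        ece += (len(items) / total) * abs(mean_conf - accuracy)
    return ece

def brier_score(conf, correct):
    """Mean squared gap between confidence and the 0/1 outcome."""
    return sum((c - y) ** 2 for c, y in zip(conf, correct)) / len(conf)

def type2_auroc(conf, correct):
    """Probability that a correct answer gets higher confidence than an
    incorrect one (ties count as 0.5). 0.5 means confidence carries no
    information about correctness; 1.0 means perfect separation."""
    right = [c for c, y in zip(conf, correct) if y == 1]
    wrong = [c for c, y in zip(conf, correct) if y == 0]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = 0.0
    for r in right:
        for w in wrong:
            if r > w:
                wins += 1.0
            elif r == w:
                wins += 0.5
    return wins / (len(right) * len(wrong))

# Both hypothetical models answer the same 4 items with 50% accuracy.
correct = [1, 1, 0, 0]
conf_a = [0.9, 0.9, 0.1, 0.1]  # confidence tracks correctness
conf_b = [0.5, 0.5, 0.5, 0.5]  # constant confidence

for name, conf in (("A", conf_a), ("B", conf_b)):
    print(name,
          "ECE:", expected_calibration_error(conf, correct),
          "Brier:", brier_score(conf, correct),
          "type-2 AUROC:", type2_auroc(conf, correct))
```

In this toy case model B has an ECE of zero yet its confidence says nothing about which answers are right, while model A separates right from wrong perfectly but has a nonzero ECE. A calibration score alone therefore cannot tell the two capacities apart.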