Beyond Accuracy: Decomposing the Reasoning Efficiency of LLMs

arXiv:2602.09805v2 Announce Type: replace-cross As reasoning LLMs increasingly trade tokens for accuracy through deliberation, search, and self-correction, a single accuracy score can no longer tell whether those tokens buy useful reasoning, recovery from hard instances, or unnecessary verbosity. We