AI RESEARCH

Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability

arXiv CS.AI

arXiv:2603.10384v1 Announce Type: new Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We