AI RESEARCH
Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability
arXiv cs.AI
arXiv:2603.10384v1 Announce Type: new Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We