AI RESEARCH
SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
arXiv CS.LG
arXiv:2604.06636v1 Announce Type: new Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress from mere verbosity, leading to limited reasoning capabilities and unresolved token inefficiency. To address this, we propose Stage-aware Hierarchical Advantage via Potential Estimation (SHAPE), a framework that formalizes reasoning as a trajectory through a state space of empirical solvability. SHAPE