Student Guides Teacher: Weak-to-Strong Inference via Spectral Orthogonal Exploration

arXiv:2601.06160v2 Announce Type: replace Large Language Models (LLMs) often suffer from "Reasoning Collapse" on challenging mathematical reasoning tasks, where stochastic sampling produces lexical variations of the same erroneous logic rather than genuine semantic exploration. We observe that failed reasoning traces are often associated with a low-rank bias manifold in the model's hidden-state geometry, which reduces exploration toward corrective solution directions.
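
To make the "low-rank" observation concrete, the following is a minimal sketch of one common way to quantify how many directions a set of hidden states actually spans: the entropy-based effective rank of their singular-value spectrum. It is not taken from the paper. The function name `effective_rank`, the synthetic "collapsed" and "diverse" matrices, and all sizes are assumptions chosen purely for illustration, and real hidden states would come from a model rather than a random generator.

```python
import numpy as np


def effective_rank(hidden_states: np.ndarray) -> float:
    """Entropy-based effective rank of a (num_traces, hidden_dim) matrix.

    Hypothetical helper for illustration; not the paper's metric.
    """
    # Center the rows so the spectrum reflects variation between traces.
    centered = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # Normalize squared singular values into a distribution over directions.
    energy = singular_values**2
    p = energy / energy.sum()
    p = p[p > 0]  # drop exact zeros so the logarithm is defined
    # exp(Shannon entropy): close to k when energy is spread over k directions.
    return float(np.exp(-np.sum(p * np.log(p))))


rng = np.random.default_rng(0)
num_traces, hidden_dim, latent_rank = 64, 32, 3  # assumed toy sizes

# Synthetic "collapsed" traces: confined to a 3-dimensional subspace plus small noise.
basis = rng.normal(size=(latent_rank, hidden_dim))
collapsed = rng.normal(size=(num_traces, latent_rank)) @ basis
collapsed += 0.01 * rng.normal(size=(num_traces, hidden_dim))

# Synthetic "diverse" traces: variation spread across all hidden dimensions.
diverse = rng.normal(size=(num_traces, hidden_dim))

collapsed_rank = effective_rank(collapsed)
diverse_rank = effective_rank(diverse)
print(f"collapsed: {collapsed_rank:.2f}, diverse: {diverse_rank:.2f}")
```

Under these assumptions the collapsed set scores far lower than the diverse set, which is the kind of gap a low-rank manifold would produce. A participation ratio or a thresholded singular-value count would serve the same illustrative purpose.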