Reasoning Fails Where Step Flow Breaks

arXiv:2604.06695v1 Announce Type: new Large reasoning models (LRMs) that generate long chains of thought now perform well on multi-step math, science, and coding tasks. However, their behavior remains unstable and hard to interpret, and existing analysis tools struggle with such long, structured reasoning traces. We