Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring

arXiv:2603.14251v1 Announce Type: cross Large Reasoning Language Models (LRLMs) demonstrate impressive capabilities on complex tasks by utilizing long Chain-of-Thought reasoning. However, they are prone to overthinking: they generate redundant reasoning steps that degrade both performance and efficiency. Recently, early-exit strategies have been proposed to mitigate overthinking by dynamically terminating redundant reasoning. However, current early-exit methods either