ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention

arXiv:2603.22016v1 Announce Type: cross Large Reasoning Models (LRMs) achieve strong accuracy on challenging tasks by generating long Chain-of-Thought traces, but they suffer from overthinking: even after reaching the correct answer, they continue to generate redundant reasoning steps. This behavior increases latency and compute cost, and it can also lead to answer drift. Existing mitigation methods either require