Validity-Calibrated Reasoning Distillation

arXiv:2605.04078v1

Reasoning distillation aims to transfer multi-step reasoning capabilities from large language models to smaller, more efficient ones. While recent methods have shown promising gains, they typically rely on static teacher-student hierarchies and frame distillation as trajectory imitation. This framing is misaligned with the structure of reasoning, where intermediate steps are often locally under-specified: global correctness constrains the final answer but does not uniquely determine each intermediate move.