Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

arXiv:2605.02290v1 Announce Type: new Distilling large reasoning models is essential for making Long-CoT reasoning practical, as full-scale inference remains computationally prohibitive. Existing curation-based approaches select complete reasoning traces post-hoc, overlooking collaboration among heterogeneous teachers and lacking dynamic exploration, which leads to redundant sampling and missed complementary reasoning. We