AI RESEARCH

Reinforcement-aware Knowledge Distillation for LLM Reasoning

arXiv CS.AI

arXiv:2602.22495v2 Announce Type: replace-cross Reinforcement learning (RL) post-