AI RESEARCH
Surgical Post-Training: Proximal On-Policy Distillation for Reasoning with Knowledge Retention
arXiv CS.CL
arXiv:2603.01683v2 Announce Type: replace Injecting new reasoning knowledge into Large Language Models (LLMs) via post-