AI RESEARCH
On the Non-decoupling of Supervised Fine-tuning and Reinforcement Learning in Post-training
arXiv cs.AI
arXiv:2601.07389v2 Announce Type: replace-cross Post-