AI RESEARCH
Decouple before Integration: Test-time Synthesis of SFT and RLVR Task Vectors
arXiv cs.LG
arXiv:2605.00610v1 Announce Type: new
SFT and RLVR represent two fundamental yet distinct paradigms for LLM post-