AI RESEARCH

Decouple before Integration: Test-time Synthesis of SFT and RLVR Task Vectors

arXiv CS.LG

arXiv:2605.00610v1 Announce Type: new SFT and RLVR represent two fundamental yet distinct paradigms for LLM post-