AI RESEARCH
Large Language Model Post-Training: A Unified View of Off-Policy and On-Policy Learning
arXiv cs.CL
arXiv:2604.07941v1 Announce Type: new Post-