AI RESEARCH
Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling
arXiv CS.AI
arXiv:2507.01679v3 Announce Type: replace-cross Existing LLMs-post-