AI RESEARCH
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling
arXiv CS.LG
arXiv:2604.06916v1 Announce Type: new Reinforcement-Learning-based post-