AI RESEARCH

FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling

arXiv cs.LG

arXiv:2604.06916v1 Announce Type: new Reinforcement-Learning-based post-