World4RL: Diffusion World Models for Policy Refinement with Reinforcement Learning for Robotic Manipulation

arXiv:2509.19080v2 Announce Type: replace-cross Robotic manipulation policies are commonly initialized through imitation learning, but their performance is limited by the scarcity and narrow coverage of expert data. Reinforcement learning can refine policies to alleviate this limitation, yet real-robot