DyDiff: Long-Horizon Rollout via Dynamics Diffusion for Offline Reinforcement Learning

arXiv:2405.19189v3 Announce Type: replace

Following the success of diffusion models (DMs) in generating realistic synthetic vision data, many researchers have investigated their potential in decision-making and control. Most of these works use DMs to sample directly from the trajectory space, where a DM can be viewed as a combination of a dynamics model and a policy. In this work, we explore how to decouple DMs' ability as dynamics models in fully offline settings, allowing the learning policy to roll out trajectories.