Diffusion Alignment as Variational Expectation-Maximization

arXiv:2510.00502v3 Announce Type: replace Diffusion alignment aims to optimize diffusion models for a downstream objective. While existing methods based on reinforcement learning or direct backpropagation achieve considerable success in maximizing rewards, they often suffer from reward over-optimization and mode collapse. We