Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning

arXiv:2512.24146v2 Announce Type: replace Recent studies have demonstrated significant progress in aligning text-to-image diffusion models with human preferences via Reinforcement Learning from Human Feedback. However, while existing methods achieve high scores on automated reward metrics, they often lead to Preference Mode Collapse (PMC), a specific form of reward hacking in which models converge on narrow, high-scoring outputs (e.g., images with monolithic styles or pervasive overexposure), severely degrading generative diversity. In this work, we