Ctrl-Z Sampling: Scaling Diffusion Sampling with Controlled Random Zigzag Explorations

arXiv cs.CV

arXiv:2506.20294v4

Diffusion models generate conditional samples by progressively denoising Gaussian noise, yet the denoising trajectory can stall at visually plausible but low-quality outcomes that show conditional misalignment or structural artifacts. We interpret this behavior as local optima in a surrogate quality landscape: once early denoising commits to a suboptimal global structure, later steps mainly sharpen details and seldom correct the underlying mistake.