AI RESEARCH

Spectrum Matching: a Unified Perspective for Superior Diffusability in Latent Diffusion

arXiv cs.CV

arXiv:2603.14645v1 Announce Type: new

In this paper, we study the diffusability (learnability) of variational autoencoders (VAEs) in latent diffusion. First, we show that pixel-space diffusion trained with an MSE objective is inherently biased toward learning low and mid spatial frequencies, and that the power-law power spectral density (PSD) of natural images makes this bias perceptually beneficial.
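To make the notion of a power-law PSD concrete, the following is a minimal sketch, not the paper's code, of how one might measure a radially averaged PSD and fit its log-log slope. The function names `radial_psd` and `synthetic_power_law_image` are hypothetical, and the exponent `alpha=2.0` is an illustrative assumption (natural images are often reported to have power falling off roughly as 1/f^2); a synthetic image stands in for real data.

```python
import numpy as np

def radial_psd(image):
    """Radially averaged power spectral density of a 2-D array.

    Returns integer radial frequencies 1..min(h, w)//2 - 1 and the mean
    power in each ring (the DC term is excluded).
    """
    h, w = image.shape
    power = np.abs(np.fft.fft2(image)) ** 2
    # Integer frequency index along each axis (cycles per image).
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    bins = np.rint(radius).astype(int)
    total = np.bincount(bins.ravel(), weights=power.ravel())
    count = np.bincount(bins.ravel())
    max_r = min(h, w) // 2
    freqs = np.arange(1, max_r)
    return freqs, total[1:max_r] / count[1:max_r]

def synthetic_power_law_image(size, alpha, seed=0):
    """White noise filtered so that its PSD decays roughly as f**(-alpha).

    This is an illustrative stand-in for a natural image, not real data.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((size, size))
    f = np.fft.fftfreq(size) * size
    radius = np.hypot(*np.meshgrid(f, f, indexing="ij"))
    radius[0, 0] = 1.0  # avoid division by zero at the DC term
    # Scaling the amplitude by f**(-alpha/2) scales the power by f**(-alpha).
    shaped = np.fft.fft2(noise) * radius ** (-alpha / 2)
    return np.real(np.fft.ifft2(shaped))

if __name__ == "__main__":
    image = synthetic_power_law_image(256, alpha=2.0)  # alpha is an assumed value
    freqs, psd = radial_psd(image)
    # A power law is a straight line in log-log space; its slope estimates -alpha.
    slope, _ = np.polyfit(np.log(freqs), np.log(psd), 1)
    print(f"fitted log-log slope: {slope:.2f}")
```

Under these assumptions the fitted slope should come out close to the negative of the chosen exponent. Applying `radial_psd` to a grayscale photograph, or to a single channel of a VAE latent, would give the corresponding empirical spectrum for comparison.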