AI RESEARCH

PixelGen: Improving Pixel Diffusion with Perceptual Supervision

arXiv CS.CV

arXiv:2602.02493v2

Pixel diffusion generates images directly in pixel space, avoiding the VAE artifacts and representational bottlenecks of two-stage latent diffusion. The recent JiT further simplifies pixel diffusion with x-prediction, in which the model predicts the clean image rather than the velocity. However, the standard pixel-wise diffusion loss treats all pixels equally, spending model capacity on perceptually insignificant signals and often leading to blurry samples.
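To make the distinction between x-prediction and velocity prediction concrete, here is a minimal NumPy sketch. It assumes a linear interpolation path `z_t = t * x + (1 - t) * eps` between noise and the clean image; that schedule is a common convention and is not stated in the abstract, and the helper names (`make_noisy`, `velocity_from_x_pred`, `pixelwise_mse`) are hypothetical. It shows only the standard pixel-wise loss that the abstract criticizes, not PixelGen's own objective.

```python
import numpy as np

def make_noisy(x, eps, t):
    # Assumed path (not from the abstract): z_t = t * x + (1 - t) * eps,
    # so t = 1 is the clean image and t = 0 is pure noise.
    return t * x + (1.0 - t) * eps

def velocity_from_x_pred(x_pred, z_t, t):
    # Under the assumed path the velocity is v = x - eps, which can be
    # recovered from a clean-image prediction as (x_pred - z_t) / (1 - t).
    return (x_pred - z_t) / (1.0 - t)

def pixelwise_mse(x_pred, x):
    # Standard pixel-wise loss: a plain mean, so every pixel and channel
    # contributes with the same weight regardless of perceptual importance.
    return float(np.mean((x_pred - x) ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(3, 8, 8))   # toy "clean image" (C, H, W)
eps = rng.standard_normal(x.shape)           # Gaussian noise
t = 0.3
z_t = make_noisy(x, eps, t)

# A perfect x-prediction yields exactly the true velocity x - eps.
v = velocity_from_x_pred(x, z_t, t)

# Hypothetical stand-in for a network output: the clean image plus a
# constant offset of 0.1 on every pixel.
x_pred = x + 0.1
loss = pixelwise_mse(x_pred, x)
```

Because `pixelwise_mse` is an unweighted mean, an error in a flat background region costs exactly as much as the same error on a salient edge, which is the uniform treatment of pixels that the abstract describes.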