Scale Where It Matters: Training-Free Localized Scaling for Diffusion Models

arXiv:2511.19917v3 Announce Type: replace Diffusion models have become the dominant paradigm in text-to-image generation, and test-time scaling (TTS) improves sample quality by allocating additional computation at inference. Existing TTS methods, however, resample the entire image, even though generation quality is often spatially heterogeneous. As a result, computation is wasted on regions that are already correct, while localized defects remain insufficiently corrected.