Boosting Text-to-Image Diffusion Models via Core Token Attention-Based Seed Selection

arXiv:2605.19532v1 Announce Type: cross Text-to-image diffusion models can synthesize high-quality images, yet the outcome is notoriously sensitive to the random seed: different initial seeds often yield large variations in image quality and prompt-image alignment. We revisit this "seed effect" and show that attention dynamics over the prompt's core tokens (its content-bearing words), measured during the first few denoising steps, strongly predict final generation quality. Building on this observation, we.
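
To make the observation concrete, the following is a minimal sketch of how early-step attention on core tokens could be turned into a seed ranking. It is an illustration under stated assumptions, not the authors' method, which the text above does not spell out. The names `core_token_score` and `select_seed`, the input layout (one list of per-token attention masses per denoising step), and the scoring rule (higher mean attention on core tokens is treated as better) are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical input layout: trace[step][token] is the attention mass that
# a prompt token receives at a given denoising step.
AttentionTrace = List[List[float]]


def core_token_score(trace: AttentionTrace,
                     core_idx: Sequence[int],
                     early_steps: int) -> float:
    """Mean attention mass on the core tokens over the first `early_steps` steps.

    Assumption: a higher value indicates a more promising seed. The source
    only states that these dynamics are predictive, not in which direction.
    """
    steps = trace[:early_steps]
    if not steps or not core_idx:
        raise ValueError("need at least one step and one core token")
    total = sum(step[i] for step in steps for i in core_idx)
    return total / (len(steps) * len(core_idx))


def select_seed(traces: Dict[int, AttentionTrace],
                core_idx: Sequence[int],
                early_steps: int = 3) -> int:
    """Return the seed whose early attention trace scores highest."""
    return max(
        traces,
        key=lambda seed: core_token_score(traces[seed], core_idx, early_steps),
    )


# Toy data: two seeds, two early steps, three prompt tokens.
# Token index 1 is taken to be the only core token.
traces = {
    0: [[0.1, 0.2, 0.7], [0.1, 0.3, 0.6]],
    1: [[0.1, 0.6, 0.3], [0.1, 0.7, 0.2]],
}
best_seed = select_seed(traces, core_idx=[1], early_steps=2)
```

In practice the traces would come from running only the first few denoising steps for each candidate seed, so that ranking the seeds costs far less than generating a full image for each one.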