AI RESEARCH

Resolving the Identity Crisis in Text-to-Image Generation

arXiv CS.CV

ArXi:2510.01399v3 Announce Type: replace State-of-the-art text-to-image models suffer from a persistent identity crisis when generating scenes with multiple humans: producing duplicate faces, merging identities, and miscounting individuals. We present DisCo (Reinforcement with Diversity Constraints), a reinforcement learning framework that directly optimizes identity diversity both within images and across groups of generated samples.