AI RESEARCH
Resolving the Identity Crisis in Text-to-Image Generation
arXiv CS.CV
•
ArXi:2510.01399v3 Announce Type: replace State-of-the-art text-to-image models suffer from a persistent identity crisis when generating scenes with multiple humans: producing duplicate faces, merging identities, and miscounting individuals. We present DisCo (Reinforcement with Diversity Constraints), a reinforcement learning framework that directly optimizes identity diversity both within images and across groups of generated samples.