HEART: Hyperspherical Embedding Alignment via Kent-Representation Traversal in Diffusion Models

arXiv:2605.07973v1

Text-to-image diffusion models can generate visually stunning images, yet controlling what appears and how it appears remains surprisingly difficult, especially when operating solely within the text-conditioning space. For example, changing a subject or adjusting an attribute often leads to unintended side effects, such as altered backgrounds or distorted details.