AI RESEARCH

SpatialFusion: Endowing Unified Image Generation with Intrinsic 3D Geometric Awareness

arXiv CS.CV

arXiv:2604.26341v1 Announce Type: new

Recent unified image generation models have achieved remarkable success by employing MLLMs for semantic understanding and diffusion backbones for image generation. However, these models remain fundamentally limited in spatially-aware tasks due to a lack of intrinsic spatial understanding and the absence of explicit geometric guidance during generation. In this paper, we propose SpatialFusion, a novel framework that internalizes 3D geometric awareness into unified image generation models.