Wan-Weaver: Interleaved Multi-modal Generation (T2I & I2I)

r/StableDiffusion

Paper: 2603.25706
Project page:

Is this the next big thing in unified multimodal models?

Wan-Weaver (from Tongyi Lab / Tsinghua) is a new model designed specifically for interleaved text + image generation, meaning it can alternate between writing text and generating images within one coherent conversation, like a picture book or a social media post.

Key Highlights:

Uses a clever Planner + Visualizer architecture (decoupled