Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion

arXiv:2603.15614v1 Announce Type: new Recent video diffusion models have made remarkable strides in visual quality, yet precise, fine-grained control remains a key bottleneck that limits practical customizability for content creation. For AI video creators, three forms of control are crucial: (i) scene composition, (ii) multi-view consistent subject customization, and (iii) camera-pose or object-motion adjustment. Existing methods typically handle these dimensions in isolation, with limited support for multi-view subject synthesis and identity preservation under arbitrary pose changes.