I created an agentic orchestration pipeline for music video generation
r/StableDiffusion
Generative AI
I’ve been building Uisato Studio, a workflow-based AI creation platform for audiovisual work. This is the Music Video mode: you upload an image and an audio track, and the system analyzes the inputs, generates visual direction, creates clips, handles b-roll and lip-sync when needed, and assembles everything into a finished music video through a guided pipeline. The goal is to move AI video from isolated generation to orchestration: an agentic production system built for coherent, edit-ready audiovisual output.
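For anyone who wants a rough mental model of the stage order described above, here is a minimal Python sketch. It is not Uisato Studio's actual code: every name in it (`Job`, `analyze`, `direct`, `generate_clips`, `add_broll_or_lipsync`, `assemble`, the `has_vocals` flag, the file names) is a hypothetical placeholder, and each stage just returns stub data where a real system would call analysis and generation models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Job:
    """Shared state passed from stage to stage (hypothetical structure)."""
    image_path: str
    audio_path: str
    has_vocals: bool = False  # assumption: a real system would detect this from the audio
    analysis: Dict[str, object] = field(default_factory=dict)
    direction: List[str] = field(default_factory=list)
    clips: List[str] = field(default_factory=list)
    output: str = ""
    log: List[str] = field(default_factory=list)


def analyze(job: Job) -> Job:
    # Stub: pretend the audio was split into three sections.
    job.analysis = {"sections": ["intro", "verse", "chorus"], "has_vocals": job.has_vocals}
    return job


def direct(job: Job) -> Job:
    # Stub: one line of visual direction per detected section.
    job.direction = [f"shot for {s}" for s in job.analysis["sections"]]
    return job


def generate_clips(job: Job) -> Job:
    # Stub: one placeholder clip name per direction entry.
    job.clips = [f"clip_{i}.mp4" for i in range(len(job.direction))]
    return job


def add_broll_or_lipsync(job: Job) -> Job:
    # Conditional step: lip-sync when vocals are present, otherwise add b-roll.
    if job.analysis["has_vocals"]:
        job.clips = [c.replace(".mp4", "_lipsync.mp4") for c in job.clips]
    else:
        job.clips.append("broll_0.mp4")
    return job


def assemble(job: Job) -> Job:
    # Stub: a real implementation would cut the clips to the audio here.
    job.output = "music_video.mp4"
    return job


PIPELINE: List[Callable[[Job], Job]] = [
    analyze, direct, generate_clips, add_broll_or_lipsync, assemble,
]


def run(job: Job) -> Job:
    """Run every stage in order, recording which stages executed."""
    for stage in PIPELINE:
        job = stage(job)
        job.log.append(stage.__name__)
    return job


if __name__ == "__main__":
    result = run(Job("cover.png", "track.wav", has_vocals=True))
    print(result.log)
    print(result.clips, "->", result.output)
```

Keeping the stages as plain functions over one shared `Job` object is only one possible design; it makes the "when needed" branch for b-roll versus lip-sync easy to see, but an agentic system could just as well let a planner choose or reorder stages at runtime.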