Built an open-source local music video generator using SDXL + AnimateDiff + audio-reactive GLSL shaders
r/StableDiffusion
I needed visuals for AI-generated tracks, so I built Glitchframe, a pipeline that takes an audio file and produces a full music video. Backgrounds come from SDXL keyframe stills or AnimateDiff motion, with GLSL shaders that react to beat, onset, and spectrum data in real time.

Stack: SDXL for backgrounds, optional AnimateDiff (fair warning: ~20 GB VRAM), Skia for kinetic typography, WhisperX for word-level lyric sync, FFmpeg NVENC for encoding. The UI runs locally in Gradio.

AnimateDiff integration was the most painful part. The VRAM requirements are brutal, so Ken Burns is the default fallback for most people.
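For anyone unfamiliar with the Ken Burns fallback: it is a slow pan and zoom over a still image, which costs almost nothing compared to AnimateDiff. Here is a rough sketch of how the per-frame crop window could be computed. This is not Glitchframe's actual code; `Crop`, `smoothstep`, `ken_burns_crop`, and all the parameters are hypothetical names for illustration, and it assumes zoom values of 1.0 or higher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    x: float  # left edge in source pixels
    y: float  # top edge in source pixels
    w: float  # crop width in source pixels
    h: float  # crop height in source pixels


def smoothstep(t: float) -> float:
    """Ease in and out so the camera move does not start or stop abruptly."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def ken_burns_crop(frame: int, total_frames: int, src_w: int, src_h: int,
                   start_zoom: float = 1.0, end_zoom: float = 1.25,
                   pan: tuple[float, float] = (0.5, 0.5)) -> Crop:
    """Return the crop window for one frame of a slow zoom toward `pan`.

    `pan` is a normalized focus point (0..1) that the crop centre drifts toward.
    Zoom 1.0 shows the whole still; larger values crop in tighter.
    Hypothetical helper: assumes start_zoom and end_zoom are both >= 1.0.
    """
    if total_frames < 2:
        t = 0.0
    else:
        t = smoothstep(frame / (total_frames - 1))
    zoom = start_zoom + (end_zoom - start_zoom) * t
    w, h = src_w / zoom, src_h / zoom
    # Move the crop centre from the image centre toward the focus point.
    cx = src_w * (0.5 + (pan[0] - 0.5) * t)
    cy = src_h * (0.5 + (pan[1] - 0.5) * t)
    # Clamp so the window never leaves the source image.
    x = min(max(cx - w / 2, 0.0), src_w - w)
    y = min(max(cy - h / 2, 0.0), src_h - h)
    return Crop(x, y, w, h)


# Example: crop windows for a 10-second shot at 24 fps over a 1024x1024 still.
crops = [ken_burns_crop(i, 240, 1024, 1024, pan=(0.7, 0.4)) for i in range(240)]
```

Each crop would then be resized to the output resolution before encoding. The eased interpolation is a design choice: a linear ramp works too, but it tends to look mechanical at the start and end of a shot.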