AI RESEARCH
AVControl: Efficient Framework for Training Audio-Visual Controls
arXiv cs.CV
arXiv:2603.24793v1 Announce Type: new Controlling video and audio generation requires diverse modalities, from depth and pose to camera trajectories and audio transformations, yet existing approaches either train a single monolithic model for a fixed set of controls or