AI RESEARCH

AVControl: Efficient Framework for Training Audio-Visual Controls

arXiv cs.CV

arXiv:2603.24793v1 Announce Type: new

Controlling video and audio generation requires diverse modalities, from depth and pose to camera trajectories and audio transformations, yet existing approaches either train a single monolithic model for a fixed set of controls or