AI RESEARCH
Tora3: Trajectory-Guided Audio-Video Generation with Physical Coherence
arXiv cs.CV
arXiv:2604.09057v1 Announce Type: new Audio-video (AV) generation has recently made strong progress in perceptual quality and multimodal coherence, yet generating content with plausible motion-sound relations remains challenging. Existing methods often produce object motions that are visually unstable and sounds that are only loosely aligned with salient motion or contact events, largely because they lack an explicit motion-aware structure shared by video and audio generation.