AI RESEARCH

SyncDPO: Enhancing Temporal Synchronization in Video-Audio Joint Generation via Preference Learning

arXiv CS.CV

arXiv:2605.12179v1 Announce Type: new Recent advancements in video-audio joint generation have achieved remarkable success in semantic correspondence. However, achieving precise temporal synchronization, which requires fine-grained alignment between audio events and their visual triggers, remains a challenging problem. The post-