Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation
arXiv CS.CV
arXiv:2604.10103v1 Announce Type: new Streaming video generation (SVG) distills a pretrained bidirectional video diffusion model into an autoregressive model equipped with sliding window attention (SWA). However, SWA inevitably loses distant history during long video generation, and its computational overhead remains a critical challenge to real-time deployment. In this work, we propose Hybrid Forcing, which jointly optimizes temporal information retention and computational efficiency through a hybrid attention design. First, we