AI RESEARCH

AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation

arXiv cs.CV

arXiv:2604.18348v1 Announce Type: new

Video diffusion transformers (DiTs) suffer from prohibitive inference latency due to the quadratic complexity of attention. Existing sparse attention methods either overlook semantic similarity or fail to adapt to heterogeneous token distributions across layers, which degrades model performance. We propose AdaCluster, a