HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

arXiv:2603.28458v1 Announce Type: cross Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained key selection by scoring every historical token for each query using a lightweight indexer, and then computing attention only over the selected subset. While the downstream sparse attention scales efficiently, the indexer still scans the entire prefix for every query,
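
The token-level scheme described above (score every prefix token with a cheap indexer, keep a subset, attend only over that subset) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not DSA's or HISA's actual implementation: the plain dot-product indexer, the top-k selection rule, and all names (`sparse_attention_step`, `q_idx`, `K_idx`, `top_k`) are hypothetical.

```python
import numpy as np


def sparse_attention_step(q, K, V, q_idx, K_idx, top_k):
    """One query step of token-level sparse attention (illustrative only).

    q      : (d,)        attention query
    K, V   : (n, d)      keys and values for the n prefix tokens
    q_idx  : (d_idx,)    low-dimensional indexer query (assumed form)
    K_idx  : (n, d_idx)  low-dimensional indexer keys (assumed form)
    top_k  : number of prefix tokens to keep (must be >= 1)
    """
    n, d = K.shape
    k = min(top_k, n)

    # Indexer pass: one score per prefix token, so this step is O(n)
    # per query even though the attention below is only O(k).
    scores = K_idx @ q_idx

    # Keep the k highest-scoring tokens (order within the set is arbitrary).
    selected = np.argpartition(scores, n - k)[n - k:]

    # Standard softmax attention restricted to the selected tokens.
    logits = K[selected] @ q / np.sqrt(d)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ V[selected], selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, d_idx = 1024, 64, 16
    out, selected = sparse_attention_step(
        rng.normal(size=d),
        rng.normal(size=(n, d)),
        rng.normal(size=(n, d)),
        rng.normal(size=d_idx),
        rng.normal(size=(n, d_idx)),
        top_k=32,
    )
    print(out.shape, selected.shape)
```

In this sketch the attention cost depends only on `top_k`, while the indexer's `K_idx @ q_idx` product still touches all `n` prefix tokens for every query, which is the per-query full-prefix scan the abstract points to.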