ELSA: Exact Linear-Scan Attention for Fast and Memory-Light Vision Transformers

arXiv:2604.23798v1 Announce Type: new Existing attention accelerators often trade away exact softmax semantics, depend on fused Tensor Core kernels, or incur sequential depth that limits FP32 throughput on long sequences.