Collapse or Preserve: Data-Dependent Temporal Aggregation for Spiking Neural Network Acceleration

arXiv:2603.13810v1 Announce Type: cross Spike sparsity is widely believed to enable efficient spiking neural network (SNN) inference on GPU hardware. We demonstrate that this is an illusion: five distinct sparse computation strategies on an Apple M3 Max all fail to outperform dense convolution, because SIMD architectures cannot exploit the fine-grained, unstructured sparsity of i.i.d. binary spikes. Instead, we propose Temporal Aggregated Convolution (TAC), which exploits convolution linearity to pre-aggregate $K$ spike frames before a single convolution call, reducing $T$ calls to $T/K$.
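
The linearity argument can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes the aggregation is a plain unweighted sum over each window of $K$ frames, and the helper names (`conv2d_valid`, `per_step_conv`, `aggregated_conv`) and tensor shapes are hypothetical, chosen only for illustration. It compares the sum of $K$ per-step convolution outputs against a single convolution of the pre-summed frames.

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive 'valid' cross-correlation (illustrative helper, not a library API).

    x: (C_in, H, W) input frame, w: (C_out, C_in, kH, kW) kernel.
    """
    c_out, c_in, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((c_out, h - kh + 1, wd - kw + 1))
    for i in range(kh):
        for j in range(kw):
            # Shifted view of the input aligned with kernel tap (i, j).
            patch = x[:, i:i + h - kh + 1, j:j + wd - kw + 1]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out

def per_step_conv(spikes, w):
    """Baseline: one convolution call per time step (T calls)."""
    return np.stack([conv2d_valid(s, w) for s in spikes])

def aggregated_conv(spikes, w, k):
    """Assumed aggregation: sum each window of k frames, then convolve
    once per window (T / k calls)."""
    t = spikes.shape[0]
    assert t % k == 0, "T must be divisible by K in this sketch"
    grouped = spikes.reshape(t // k, k, *spikes.shape[1:]).sum(axis=1)
    return np.stack([conv2d_valid(g, w) for g in grouped])

rng = np.random.default_rng(0)
T, K = 8, 4
# i.i.d. binary spikes with an arbitrary firing rate of 0.2.
spikes = (rng.random((T, 3, 16, 16)) < 0.2).astype(np.float64)
weights = rng.standard_normal((5, 3, 3, 3))

# Reference: convolve every frame, then sum the outputs per window.
dense = per_step_conv(spikes, weights)
dense_grouped = dense.reshape(T // K, K, *dense.shape[1:]).sum(axis=1)

# Aggregated path: sum the frames first, then convolve once per window.
tac = aggregated_conv(spikes, weights, K)

print(np.allclose(tac, dense_grouped))
```

The two paths agree only on the windowed sum of outputs. The aggregated path does not recover the $K$ individual per-step outputs, so any computation that depends on them between frames is collapsed.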