AI RESEARCH

AutoLALA: Automatic Loop Algebraic Locality Analysis for AI and HPC Kernels

arXiv cs.AI

arXiv:2604.05066v1 Announce Type: cross

Data movement is the primary bottleneck in modern computing systems. For loop-based programs common in high-performance computing (HPC) and AI workloads, including matrix multiplication, tensor contraction, stencil computation, and einsum operations, the cost of moving data through the memory hierarchy often exceeds the cost of arithmetic. This paper presents AutoLALA, an open-source tool that analyzes data locality in affine loop programs.
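To make "data locality in affine loop programs" concrete, the sketch below enumerates the memory accesses of a naive matrix multiplication and measures reuse distance: the number of distinct elements touched between two accesses to the same element. This is a brute-force illustration only, and it is not AutoLALA's method or API, which the abstract does not describe here. The function names (`matmul_trace`, `reuse_distances`, `lru_misses`) are hypothetical, and the cache model (fully associative, LRU, one element per line) is an assumption chosen for simplicity.

```python
from collections import OrderedDict


def matmul_trace(n):
    """Yield the elements touched by a naive n x n matmul in (i, j, k) loop order.

    Every subscript is an affine function of the loop indices, which is what
    makes this an affine loop program.
    """
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield ("A", i, k)
                yield ("B", k, j)
                yield ("C", i, j)


def reuse_distances(trace):
    """Return one entry per access: the number of distinct other elements
    touched since the previous access to the same element, or None on the
    first access (a cold miss)."""
    stack = OrderedDict()  # ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Elements that are more recent than addr sit after it in the order.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_misses(distances, capacity):
    """Count misses in an assumed fully associative LRU cache holding
    `capacity` elements: an access misses if it is cold or if its reuse
    distance is at least the capacity."""
    return sum(1 for d in distances if d is None or d >= capacity)
```

Calling `lru_misses(reuse_distances(matmul_trace(n)), c)` for a range of capacities `c` gives a miss curve for the loop nest. The trace grows as 3n³ accesses and each lookup here scans the whole stack, so the sketch is only practical for small `n`.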