OISMA: On-the-fly In-memory Stochastic Multiplication Architecture for Matrix-Multiplication Workloads

ArXi:2508.08822v2 Announce Type: replace-cross Artificial intelligence (AI) models are currently driven by a significant upscaling of their complexity, with massive matrix-multiplication workloads representing the major computational bottleneck. In-memory computing (IMC) architectures are proposed to avoid the von Neumann bottleneck. However, both digital/binary-based and analog IMC architectures suffer from various limitations, which significantly degrade the performance and energy efficiency gains.