AI RESEARCH
BASIS: Balanced Activation Sketching with Invariant Scalars for "Ghost Backpropagation"
arXiv cs.LG
arXiv:2604.16324v1 Announce Type: new
The activation memory required for exact backpropagation scales linearly with network depth, context length, and feature dimensionality, forming an O(L · BN) spatial bottleneck (where L is the network depth, B is the sequence-batch cardinality, and N is the feature dimension). This constraint has historically throttled the scaling of deep neural networks. Randomized automatic differentiation attempts to mitigate this, but it has suffered from catastrophic variance. In this paper, we.
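
To make the O(L · BN) scaling stated in the abstract concrete, the following is a minimal back-of-the-envelope sketch. It is not taken from the paper: the function name `activation_bytes`, the example configuration, and the assumptions of exactly one stored B × N activation tensor per layer and 2 bytes per value (fp16/bf16) are illustrative only. Real networks typically retain several tensors per layer, so actual usage is a constant factor higher.

```python
def activation_bytes(depth: int, batch_tokens: int, features: int,
                     bytes_per_value: int = 2) -> int:
    """Bytes needed to store one (B x N) activation tensor for each of L layers.

    Assumption (not from the paper): one saved tensor per layer,
    2 bytes per value (fp16/bf16).
    """
    return depth * batch_tokens * features * bytes_per_value


# Hypothetical configuration: 32 layers, 8 sequences of 4096 tokens, 4096 features.
L, B, N = 32, 8 * 4096, 4096
total = activation_bytes(L, B, N)
print(f"{total / 2**30:.1f} GiB")  # prints "8.0 GiB"

# Linear scaling: doubling depth (or B, or N) doubles the memory.
print(activation_bytes(2 * L, B, N) // total)  # prints 2
```

Because the cost is a plain product of L, B, and N, growing any one of depth, context length, or feature width grows the stored-activation footprint proportionally, which is the bottleneck the abstract refers to.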