Dimensional Criticality at Grokking Across MLPs and Transformers
arXiv cs.LG
arXiv:2604.16431v1 Announce Type: new
Abrupt transitions between distinct dynamical regimes are a hallmark of complex systems. Grokking in deep neural networks provides a striking example: an abrupt transition from memorization to generalization long after