AI RESEARCH

On the Mathematical Relationship Between Layer Normalization and Dynamic Activation Functions

arXiv CS.AI

arXiv:2503.21708v4. Layer normalization (LN) is an essential component of modern neural networks. Many alternative techniques have been proposed, but none has succeeded in replacing LN so far. The most recent proposal in this line of research is a dynamic activation function called Dynamic Tanh (DyT). Although DyT is empirically well motivated and appealing from a practical point of view, it lacks a theoretical foundation. In this work, we shed light on the mathematical relationship between LN and dynamic activation functions.
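
For orientation, the two operations being compared can be sketched in a few lines of plain Python. The abstract gives no formulas, so both definitions below are assumptions taken from the commonly used forms: LN as per-vector standardization followed by a scale and shift, and DyT as an element-wise gamma * tanh(alpha * x) + beta with a learnable scalar alpha. The function names, the default parameter values, and the sample input are illustrative only.

```python
import math

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardize a vector to zero mean and unit variance, then scale and shift.

    Assumed form: gamma * (x - mean) / sqrt(var + eps) + beta.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]

def dyt(x, alpha=0.5, gamma=1.0, beta=0.0):
    """Dynamic Tanh, assumed form: gamma * tanh(alpha * x) + beta, element-wise.

    alpha would be a learnable scalar in practice; 0.5 is an illustrative value.
    """
    return [gamma * math.tanh(alpha * v) + beta for v in x]

# Hypothetical activation vector, used only for illustration.
x = [-2.0, -0.5, 0.0, 0.5, 2.0]
ln_out = layer_norm(x)
dyt_out = dyt(x)
```

The structural difference is visible in the code: `layer_norm` needs the mean and variance of the whole vector, while `dyt` acts on each element independently and squashes it into a bounded range.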