AI RESEARCH

Can an MLP Absorb Its Own Skip Connection?

arXiv CS.LG

ArXi:2604.23705v1 Announce Type: new We study when a skip connection around a single-hidden-layer MLP can be absorbed into a residual-free MLP of the same width. We first show that for any architecture whose skip branch is an invertible linear map (including Hyper-Connections and their manifold-constrained variants), the problem reduces to the identity skip case. For homogeneous activations of degree $k \neq 1$, such as ReLU$^2$ and ReGLU, absorption is unconditionally impossible by a degree argument.