AI RESEARCH

Dynamics of the Transformer Residual Stream: Coupling Spectral Geometry to Network Topology

arXiv CS.LG

arXiv:2605.14258v1 Announce Type: new Large language models are remarkably capable, yet how computation propagates through their layers remains poorly understood. A growing line of work treats depth as discrete time and the residual stream as a dynamical system, where each layer's nonlinear update has a local linear description. However, previous analyses have relied on scalar summaries or approximate linearizations, leaving the full spectral geometry of trained LLMs unknown. We perform full Jacobian eigendecomposition across three production-scale LLMs and show that