AI RESEARCH
Random Quadratic Form on a Sphere: Synchronization by Common Noise
arXiv cs.LG
arXiv:2603.06187v1 Announce Type: cross The random quadratic form (RQF) model is motivated by the study of the role of linear layers in transformers and illustrates the synchronization-by-common-noise phenomenon that arises in simplified models of transformers. In particular, we provide an alternative explanation, independent of self-attention, of the clustering behaviour in deep transformers and show that tokens cluster even in the absence of the self-attention mechanism.
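As a rough illustration of synchronization by common noise, the sketch below pushes several unit vectors ("tokens") through a stack of random linear layers, where every token sees the same random weights at each layer, and renormalizes them to the sphere after each layer. This is not the paper's RQF model: the residual update x + step * W x, the Gaussian weights, and the names simulate_common_noise and min_abs_cosine are assumptions made only for this example. Because linear maps commute with a sign flip, tokens in this toy setup align only up to sign, so the check uses the absolute cosine similarity.

```python
import numpy as np


def simulate_common_noise(n_tokens=8, dim=4, n_layers=5000, step=0.2, seed=0):
    """Toy model (an assumption, not the paper's RQF model).

    Every token is updated by the SAME random linear layer at each depth,
    then projected back onto the unit sphere. There is no attention and no
    interaction between tokens.
    """
    rng = np.random.default_rng(seed)

    # Independent random starting points on the unit sphere.
    x = rng.standard_normal((n_tokens, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)

    for _ in range(n_layers):
        # Common noise: one weight matrix shared by all tokens in this layer.
        w = rng.standard_normal((dim, dim))
        # Residual linear layer applied row-wise: x_i <- x_i + step * W x_i.
        x = x + step * (x @ w.T)
        # Renormalize each token back onto the sphere.
        x /= np.linalg.norm(x, axis=1, keepdims=True)

    return x


def min_abs_cosine(x):
    """Smallest |cosine similarity| over all token pairs (1.0 = fully aligned)."""
    gram = x @ x.T
    return float(np.min(np.abs(gram)))


if __name__ == "__main__":
    tokens = simulate_common_noise()
    print("min |cos| between tokens:", min_abs_cosine(tokens))
```

In this sketch the tokens never exchange information, so any alignment comes only from the shared random weights. Drawing a separate weight matrix per token would remove the common noise and is a natural control experiment.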