AI RESEARCH

Spectral Conditioning of Attention Improves Transformer Performance

arXiv cs.LG

arXiv:2603.07162v1 Announce Type: new

We present a theoretical analysis of the Jacobian of an attention block within a transformer, showing that it is governed by the query, key, and value projections that define the attention mechanism. Leveraging this insight, we