AI RESEARCH

Training Transformers in Cosine Coefficient Space

arXiv CS.AI

arXiv:2604.04440v1 Announce Type: cross

We parameterize the weight matrices of a transformer in the two-dimensional discrete cosine transform (DCT) domain, retaining only the lowest-frequency coefficients. At each forward pass the full weight matrix is reconstructed via the inverse DCT; gradients propagate through the reconstruction to update the spectral coefficients directly.
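
The reconstruction and gradient flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes an orthonormal DCT-II basis, assumes that "lowest-frequency" means the top-left k1 x k2 block of the coefficient matrix, and applies the chain rule by hand instead of using an autodiff framework. The names `dct_matrix`, `reconstruct`, and `coeff_grad` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix C (n x n); C @ x is the DCT, C.T @ y the inverse."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # rescale the DC row so that C @ C.T = I
    return c

def reconstruct(coeffs: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Inverse 2D DCT of a k1 x k2 low-frequency block (all other coefficients zero)."""
    k1, k2 = coeffs.shape
    basis_r = dct_matrix(rows)[:k1, :]   # (k1, rows): lowest k1 row frequencies
    basis_c = dct_matrix(cols)[:k2, :]   # (k2, cols): lowest k2 column frequencies
    return basis_r.T @ coeffs @ basis_c  # dense (rows, cols) weight matrix

def coeff_grad(grad_w: np.ndarray, k1: int, k2: int) -> np.ndarray:
    """Chain rule: map dL/dW back to dL/dcoeffs for W = A.T @ S @ B."""
    rows, cols = grad_w.shape
    return dct_matrix(rows)[:k1, :] @ grad_w @ dct_matrix(cols)[:k2, :].T

rng = np.random.default_rng(0)
coeffs = rng.normal(size=(4, 4))          # trainable spectral parameters (assumed size)
weight = reconstruct(coeffs, 16, 16)      # 16 x 16 weight built from 16 coefficients
grad_w = rng.normal(size=weight.shape)    # stand-in for dL/dW from backpropagation
coeffs -= 0.1 * coeff_grad(grad_w, 4, 4)  # one SGD step taken in coefficient space
```

In this toy setting a 16 x 16 weight matrix is controlled by 16 trainable numbers rather than 256. Because the reconstruction is linear in the coefficients, the gradient with respect to them is just the forward 2D DCT of the weight gradient, restricted to the retained block.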