AI RESEARCH
Training Transformers in Cosine Coefficient Space
arXiv CS.AI
arXiv:2604.04440v1 Announce Type: cross We parameterize the weight matrices of a transformer in the two-dimensional discrete cosine transform (DCT) domain, retaining only the lowest-frequency coefficients. At each forward pass, the full weight matrix is reconstructed via the inverse DCT; gradients propagate through the reconstruction to update the spectral coefficients directly.
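The reconstruct-then-backpropagate idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an orthonormal DCT-II basis, a rectangular low-frequency block of retained coefficients, and a toy squared-error loss in place of a transformer. The helper names (`dct_basis`, `reconstruct`, `coeff_grad`) are hypothetical, and the gradient is written out by hand where a real training setup would rely on automatic differentiation.

```python
import math

def dct_basis(n):
    """Orthonormal DCT-II matrix (n x n); row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def reconstruct(coeffs, m, n):
    """Inverse 2D DCT of a kr x kc low-frequency block (all other coefficients zero).

    W = Cm[:kr]^T @ S @ Cn[:kc], giving an m x n weight matrix.
    """
    kr, kc = len(coeffs), len(coeffs[0])
    cm = dct_basis(m)[:kr]  # kr x m
    cn = dct_basis(n)[:kc]  # kc x n
    return matmul(matmul(transpose(cm), coeffs), cn)

def coeff_grad(grad_w, kr, kc):
    """Chain rule through the reconstruction: dL/dS = Cm[:kr] @ dL/dW @ Cn[:kc]^T."""
    m, n = len(grad_w), len(grad_w[0])
    cm = dct_basis(m)[:kr]
    cn = dct_basis(n)[:kc]
    return matmul(matmul(cm, grad_w), transpose(cn))

# Toy example (assumed setup): fit a 4 x 6 target using a 2 x 3 coefficient block.
m, n, kr, kc = 4, 6, 2, 3
target = [[math.sin(0.3 * i + 0.2 * j) for j in range(n)] for i in range(m)]
coeffs = [[0.0] * kc for _ in range(kr)]  # the only trainable parameters
lr = 0.5
for _ in range(20):
    w = reconstruct(coeffs, m, n)  # forward pass: rebuild the full matrix
    # Gradient of 0.5 * ||W - target||^2 with respect to W
    grad_w = [[w[i][j] - target[i][j] for j in range(n)] for i in range(m)]
    g = coeff_grad(grad_w, kr, kc)  # backward pass into coefficient space
    coeffs = [[coeffs[a][b] - lr * g[a][b] for b in range(kc)] for a in range(kr)]
```

In this sketch only the 6 entries of `coeffs` are updated, while the reconstructed matrix has 24 entries. The ratio of retained to total coefficients is a free choice here and is not taken from the paper.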