AI RESEARCH

Exact Sequence Interpolation with Transformers

arXiv CS.LG

ArXi:2502.02270v3 Announce Type: replace We prove that transformers can exactly interpolate datasets of finite input sequences in $\mathbb{R}^d$, $d\geq 2$, with corresponding output sequences of smaller or equal length. Specifically, given $N$ sequences of arbitrary but finite lengths in $\mathbb{R}^d$ and output sequences of lengths $m^1, \dots, m^N \in \mathbb{N}$, we construct a transformer with $\mathcal{O}(\sum_{j=1}^N m^j)$ blocks and $\smash{\mathcal{O}(d \sum_{j=1}^N m^j)}$ parameters that exactly interpolates the dataset.