Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression

arXiv:2512.09275v2 Announce Type: replace-cross Positional encoding (PE) is a core architectural component of Transformers, yet its impact on their generalization and robustness remains unclear. In this work, we provide the first generalization analysis for a single-layer Transformer under in-context regression that explicitly accounts for a fully trainable PE module. Our result shows that PE systematically enlarges the generalization gap. Extending the analysis to the adversarial setting, we derive an adversarial Rademacher generalization bound.
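
For orientation, the two quantities named in the title can be written in their standard textbook form. This is only a sketch of the general definitions, not the paper's bound: the symbols $\mathcal{F}$ (hypothesis class), $\ell$ (loss), $S = \{(x_i, y_i)\}_{i=1}^{n}$ (sample), and $\epsilon$ (perturbation budget) are notation assumed here, and the norm-ball perturbation set is an assumption that may differ from the paper's threat model.

```latex
% Empirical (clean) Rademacher complexity of the loss class on the sample S
\[
\widehat{\mathfrak{R}}_S(\ell \circ \mathcal{F})
  = \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}}
      \frac{1}{n} \sum_{i=1}^{n} \sigma_i \, \ell\bigl(f(x_i), y_i\bigr) \right],
\qquad \sigma_1, \dots, \sigma_n \ \text{i.i.d. uniform on } \{-1, +1\}.
\]

% Adversarial variant: the same quantity with the loss replaced by its
% worst case over an (assumed) norm ball of radius \epsilon around each input
\[
\widehat{\mathfrak{R}}_S(\tilde{\ell} \circ \mathcal{F})
  = \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}}
      \frac{1}{n} \sum_{i=1}^{n} \sigma_i \,
      \max_{\|x_i' - x_i\| \le \epsilon} \ell\bigl(f(x_i'), y_i\bigr) \right].
\]
```

A bound on either quantity controls the corresponding generalization gap (clean or adversarial) through the usual uniform-convergence argument; the statement that PE enlarges the gap refers to how the trainable PE parameters enter such a bound.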