SepSeq: A Training-Free Framework for Long Numerical Sequence Processing in LLMs

arXiv:2604.07737v1 Announce Type: new While transformer-based Large Language Models (LLMs) theoretically support massive context windows, they suffer from severe performance degradation when processing long numerical sequences. We attribute this failure to attention dispersion in the Softmax mechanism, which prevents the model from concentrating attention. To overcome this, we propose Separate Sequence (SepSeq), a