AI RESEARCH

Why Depth Matters in Parallelizable Sequence Models: A Lie Algebraic View

arXiv CS.LG

arXiv:2603.05573v1 Announce Type: new Scalable sequence models, such as Transformer variants and structured state-space models, often trade expressive power for sequence-level parallelism, which enables efficient