AI RESEARCH
Why Depth Matters in Parallelizable Sequence Models: A Lie Algebraic View
arXiv cs.LG
arXiv:2603.05573v1 Announce Type: new Scalable sequence models, such as Transformer variants and structured state-space models, often trade expressive power for sequence-level parallelism, which enables efficient