AI RESEARCH
M$^2$RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling
arXiv CS.AI
arXiv:2603.14360v1 Announce Type: cross Transformers are highly parallel but are limited to computations in the TC$^0$ complexity class, excluding tasks such as entity tracking and code execution that provably require greater expressive power. Motivated by this limitation, we revisit non-linear Recurrent Neural Networks (RNNs) for language modeling and