AI RESEARCH
MeSH: Memory-as-State-Highways for Recursive Transformers
arXiv cs.LG
arXiv:2510.07739v2 Announce Type: replace. Recursive transformers reuse the same parameters across multiple iterations over the hidden states, decoupling compute depth from parameter depth. However, under matched compute, recursive models with fewer parameters often lag behind their non-recursive counterparts.
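To make the distinction between compute depth and parameter depth concrete, here is a minimal sketch in plain Python. It is not the MeSH method and not code from the paper: the toy residual block `h + tanh(W h)` stands in for a full transformer layer, and every name (`make_block`, `apply_block`, `run_stack`, `count_params`) and size is a hypothetical choice made for illustration. It only shows how a recursive stack can match a non-recursive stack in the number of block applications while holding fewer parameters.

```python
import math
import random


def make_block(dim, seed):
    """Create one toy block: a dim x dim weight matrix.

    Hypothetical stand-in for a transformer layer, not the paper's architecture.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, hidden):
    """Residual update on the hidden state: h <- h + tanh(W h)."""
    return [
        h_i + math.tanh(sum(w * h for w, h in zip(row, hidden)))
        for h_i, row in zip(hidden, weights)
    ]


def run_stack(blocks, hidden, num_iterations=1):
    """Apply the blocks in order, repeating the whole list num_iterations times."""
    for _ in range(num_iterations):
        for weights in blocks:
            hidden = apply_block(weights, hidden)
    return hidden


def count_params(blocks):
    """Total number of scalar weights across all distinct blocks."""
    return sum(len(row) for weights in blocks for row in weights)


DIM = 4
x = [1.0, -0.5, 0.25, 0.0]

# Non-recursive: 6 distinct blocks, each applied once.
standard_blocks = [make_block(DIM, seed) for seed in range(6)]
standard_out = run_stack(standard_blocks, x, num_iterations=1)

# Recursive: 2 distinct blocks, with the pair iterated 3 times.
recursive_blocks = [make_block(DIM, seed) for seed in range(2)]
recursive_out = run_stack(recursive_blocks, x, num_iterations=3)

standard_params = count_params(standard_blocks)
recursive_params = count_params(recursive_blocks)
standard_depth = len(standard_blocks) * 1   # block applications per forward pass
recursive_depth = len(recursive_blocks) * 3

print("params:", standard_params, "vs", recursive_params)
print("compute depth:", standard_depth, "vs", recursive_depth)
```

In this toy setup both stacks perform six block applications per forward pass, but the recursive stack stores a third of the weights, which is the matched-compute, fewer-parameters regime the abstract refers to.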