InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model

arXiv:2603.18031v1 Announce Type: cross Balancing fine-grained local modeling with long-range dependency capture under computational constraints remains a central challenge in sequence modeling. While Transformers provide strong token mixing, they suffer from quadratic complexity, whereas Mamba-style selective state-space models (SSMs) scale linearly but often struggle to capture high-rank and synchronous global interactions.