Can Graphs Help Vision SSMs See Better?

arXiv CS.CV

arXiv:2605.11300v1. Vision state space models inherit the efficiency and long-range modeling ability of Mamba-style selective scans. However, their performance depends critically on how two-dimensional visual features are represented as one-dimensional token sequences. Existing scan operators range from predefined geometric traversals to dynamic coordinate-based samplers that reroute tokens through predicted offsets and interpolation.
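
To make the idea of a predefined geometric traversal concrete, here is a minimal sketch of two common ways to order a 2D grid of tokens as a 1D sequence: a row-major raster scan and a serpentine scan that reverses every other row. The function names (`raster_order`, `serpentine_order`, `flatten_tokens`) and the toy grid are illustrative assumptions, not the paper's operators, and the sketch does not cover the dynamic, offset-based samplers.

```python
def raster_order(height, width):
    """Row-major order: left to right within a row, rows top to bottom."""
    return [r * width + c for r in range(height) for c in range(width)]


def serpentine_order(height, width):
    """Row-major order with odd rows reversed, so consecutive tokens
    in the sequence are always spatial neighbours."""
    order = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        order.extend(r * width + c for c in cols)
    return order


def flatten_tokens(grid, order):
    """Read the tokens of a 2D grid (list of rows) in the given flat order."""
    width = len(grid[0])
    return [grid[i // width][i % width] for i in order]


# Hypothetical 2x3 grid of tokens, labelled by position.
grid = [["a", "b", "c"],
        ["d", "e", "f"]]

print(flatten_tokens(grid, raster_order(2, 3)))      # ['a', 'b', 'c', 'd', 'e', 'f']
print(flatten_tokens(grid, serpentine_order(2, 3)))  # ['a', 'b', 'c', 'f', 'e', 'd']
```

In the raster order, tokens `c` and `d` are adjacent in the sequence even though they sit at opposite ends of the grid; the serpentine order avoids that jump. This kind of mismatch between sequence adjacency and spatial adjacency is why the choice of scan matters for a sequence model.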