AVGGT: Rethinking Global Attention for Accelerating VGGT

ArXi:2512.02541v2 Announce Type: replace Models such as VGGT and $\pi^3$ have shown strong multi-view 3D performance, but their heavy reliance on global self-attention results in high computational cost. Existing sparse-attention variants offer partial speedups, yet lack a systematic analysis of how global attention contributes to multi-view reasoning. In this paper, we first conduct an in-depth investigation of the global attention modules in VGGT and $\pi^3$ to better understand their roles.