ZipMap: Linear-Time Stateful 3D Reconstruction via Test-Time Training

arXiv:2603.04385v2

Feed-forward transformer models have driven rapid progress in 3D vision, but state-of-the-art methods such as VGGT and $\pi^3$ have a computational cost that scales quadratically with the number of input images, making them inefficient when applied to large image collections. Sequential-reconstruction approaches reduce this cost but sacrifice reconstruction quality. We