AI RESEARCH

TAPFormer: Robust Arbitrary Point Tracking via Transient Asynchronous Fusion of Frames and Events

arXiv CS.CV

ArXi:2603.04989v2 Announce Type: replace Tracking any point (TAP) is a fundamental yet challenging task in computer vision, requiring high precision and long-term motion reasoning. Recent attempts to combine RGB frames and event streams have shown promise, yet they typically rely on synchronous or non-adaptive fusion, leading to temporal misalignment and severe degradation when one modality fails. We