AI RESEARCH

SPKLIP: Aligning Spike Video Streams with Natural Language

arXiv CS.CV

arXiv:2505.12656v3 Announce Type: replace Spike cameras offer unique sensing capabilities, but their sparse, asynchronous output challenges semantic understanding, especially for Spike Video-Language Alignment (Spike-VLA), where models like CLIP underperform due to modality mismatch. We