AI RESEARCH

InstAP: Instance-Aware Vision-Language Pre-Train for Spatial-Temporal Understanding

arXiv CS.CV

arXiv:2604.08337v1 Announce Type: new Current vision-language pre-