AI RESEARCH
What They Saw, Not Just Where They Looked: Semantic Scanpath Similarity via VLMs and NLP metric
arXiv cs.CL
arXiv:2604.08494v1 Announce Type: cross Scanpath similarity metrics are central to eye-movement research, yet existing methods predominantly evaluate spatial and temporal alignment while neglecting semantic equivalence between attended image regions. We present a semantic scanpath similarity framework that integrates vision-language models (VLMs) into eye-tracking analysis. Each fixation is encoded under controlled visual context (patch-based and marker-based strategies) and transformed into concise textual descriptions, which are aggregated into scanpath-level representations.
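The pipeline described in the abstract (crop context around a fixation, describe it in text, aggregate the descriptions, compare two scanpaths) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names `patch_box`, `aggregate`, and `token_f1`, the 64-pixel patch size, and the token-overlap F1 score are hypothetical stand-ins, and the abstract does not say which NLP metrics or patch sizes the authors use. The VLM captioning step is not implemented; the per-fixation descriptions are simply passed in as strings.

```python
from collections import Counter


def patch_box(x, y, width, height, size=64):
    """Square crop box (left, top, right, bottom) around fixation (x, y).

    The box is shifted so that it stays inside a width x height image.
    The default size of 64 pixels is an illustrative assumption.
    """
    size = min(size, width, height)  # never larger than the image
    half = size // 2
    left = max(0, min(x - half, width - size))
    top = max(0, min(y - half, height - size))
    return (left, top, left + size, top + size)


def aggregate(descriptions):
    """Join per-fixation descriptions, in fixation order, into one text."""
    return " ".join(d.strip().lower() for d in descriptions)


def token_f1(text_a, text_b):
    """Token-overlap F1 between two texts (a stand-in similarity metric)."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    overlap = sum((a & b).values())  # shared tokens, counted with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(b.values())
    recall = overlap / sum(a.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical descriptions, as a VLM might produce for each fixation patch.
scanpath_a = aggregate(["a red car", "a street sign"])
scanpath_b = aggregate(["a red car", "a traffic light"])
similarity = token_f1(scanpath_a, scanpath_b)
```

Two viewers whose fixations land on different pixels but on the same objects would produce similar descriptions and therefore a high score, which is the kind of semantic agreement that purely spatial and temporal metrics do not capture.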