Do VLMs Perceive or Recall? Probing Visual Perception vs. Memory with Classic Visual Illusions

arXiv:2601.22150v2

Large Vision-Language Models (VLMs) often answer classic visual illusions "correctly" on the original images, yet persist with the same responses when the illusion factors are inverted, even though the visual change is obvious to humans. This raises a fundamental question: do VLMs perceive visual changes, or do they merely recall memorized patterns? While several studies have noted this phenomenon, the underlying causes remain unclear. To move from observations to systematic understanding, this paper.