LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models

arXiv:2603.14882v1 Announce Type: new

Vision-Language Models (VLMs) typically assume uniform spatial fidelity across the entire field of view of their visual inputs, dedicating equal precision even to uninformative regions. By contrast, human vision is neither uniform nor static; it is adaptive, selective, and resource-efficient. In light of this, we present the first systematic analysis of bio-inspired visual representation methods, providing insights for efficient and adaptive VLMs. We propose LLMind (Looking Like the Mind), a novel.