Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

arXiv:2604.11490v1 Announce Type: new While vision-language (VL) research has achieved remarkable success in integrating visual and textual information across multiple languages and domains, there is still no dedicated framework for assessing human-centric alignment in vision-language systems. We offer two contributions to address this gap. First, we