AI RESEARCH

Vision Language Models Cannot Reason About Physical Transformation

arXiv CS.AI

arXiv:2603.07109v1 Announce Type: new Understanding physical transformations is fundamental for reasoning in dynamic environments. While Vision Language Models (VLMs) show promise in embodied applications, whether they genuinely understand physical transformations remains unclear. We