AI RESEARCH

Simulation to Rules: A Dual-VLM Framework for Formal Visual Planning

arXiv cs.AI

arXiv:2510.03182v2 Announce Type: replace-cross Vision Language Models (VLMs) show strong potential for visual planning but struggle with precise spatial and long-horizon reasoning, while Planning Domain Definition Language (PDDL) planners excel at formal long-horizon planning but cannot interpret visual inputs. Recent work combines these complementary strengths by translating visual problems into