Multimodal LLM-assisted Evolutionary Search for Programmatic Control Policies

arXiv:2508.05433v3 Announce Type: replace

Deep reinforcement learning has achieved impressive success in control tasks. However, its policies, represented as opaque neural networks, are often difficult for humans to understand, verify, and debug, which undermines trust and hinders real-world deployment. This work addresses this challenge by