DriveAgent-R1: Advancing VLM-based Autonomous Driving with Active Perception and Hybrid Thinking

arXiv:2507.20879v3 Announce Type: replace

The advent of Vision-Language Models (VLMs) has significantly advanced end-to-end autonomous driving, demonstrating powerful reasoning abilities for high-level behavior planning tasks. However, existing methods are often constrained by a passive perception paradigm, relying solely on text-based reasoning. This passivity restricts the model's capacity to actively seek crucial visual evidence when faced with uncertainty. To address this, we