AI RESEARCH
Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models
arXiv CS.CV
•
ArXi:2604.08545v1 Announce Type: new The advent of agentic multimodal models has empowered systems to actively interact with external environments. However, current agents suffer from a profound meta-cognitive deficit: they struggle to arbitrate between leveraging internal knowledge and querying external utilities. Consequently, they frequently fall prey to blind tool invocation, resorting to reflexive tool execution even when queries are resolvable from the raw visual context.