RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models
arXiv CS.AI
arXiv:2604.14951v1 Announce Type: cross
Tool learning with foundation models aims to give AI systems the ability to invoke external resources (such as APIs, computational utilities, and specialized models) to solve complex tasks that standalone language generation cannot handle. Although recent advances in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have expanded the reasoning and perception capabilities of these systems, existing tool-use methods are predominantly limited to text-only inputs and closed-world settings.