Audio2Tool: Bridging Spoken Language Understanding and Function Calling

arXiv:2604.22821v1 Announce Type: cross Voice assistants increasingly rely on Speech Language Models (SpeechLMs) to interpret spoken queries and execute complex tasks, yet existing benchmarks lack the domain breadth, acoustic diversity, and compositional reasoning complexity needed to evaluate tool-calling performance. We