Quantifying Divergence in Inter-LLM Communication Through API Retrieval and Ranking

arXiv:2604.22760v1 Announce Type: cross

Large language models (LLMs) increasingly operate as autonomous agents that reason over external APIs to perform complex tasks. However, their reliability and agreement remain poorly characterized. We present a unified benchmarking framework to quantify inter-LLM divergence, defined as the extent to which models differ in API discovery and ranking on identical tasks.
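
To make the definition concrete, the sketch below shows one way divergence between two models could be measured for a single task. It is an illustration under stated assumptions, not the framework's actual metrics, which the abstract does not specify: Jaccard distance stands in for disagreement in API discovery, and a normalized Kendall tau distance over the shared APIs stands in for disagreement in ranking. The function names and the API identifiers are hypothetical.

```python
from itertools import combinations


def jaccard_distance(apis_a, apis_b):
    """Discovery divergence: 1 - |A ∩ B| / |A ∪ B| over the retrieved API sets."""
    set_a, set_b = set(apis_a), set(apis_b)
    union = set_a | set_b
    if not union:
        return 0.0  # both models retrieved nothing, so they trivially agree
    return 1.0 - len(set_a & set_b) / len(union)


def kendall_tau_distance(ranking_a, ranking_b):
    """Ranking divergence: fraction of shared API pairs ordered differently.

    Only APIs returned by both models are compared. Assumes each ranking
    lists an API at most once.
    """
    pos_b = {api: i for i, api in enumerate(ranking_b)}
    shared = [api for api in ranking_a if api in pos_b]  # in model A's order
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0  # fewer than two shared APIs, so no order to disagree on
    discordant = sum(1 for x, y in pairs if pos_b[x] > pos_b[y])
    return discordant / len(pairs)


# Hypothetical ranked outputs from two models for the same task.
model_a = ["weather.get", "geo.lookup", "maps.route"]
model_b = ["geo.lookup", "weather.get", "news.search"]

discovery_divergence = jaccard_distance(model_a, model_b)
ranking_divergence = kendall_tau_distance(model_a, model_b)
print(discovery_divergence, ranking_divergence)
```

In this toy case the two models share two of four distinct APIs and order that shared pair oppositely. Keeping the two scores separate is a deliberate choice in the sketch: models can retrieve the same APIs yet rank them differently, or the reverse.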