Efficient LLM Collaboration via Planning

arXiv:2506.11578v4

Recently, large language models (LLMs) have demonstrated strong performance on tasks ranging from simple to complex. However, while large models achieve remarkable results across diverse tasks, they often incur substantial monetary inference costs, making frequent use impractical for many applications. In contrast, small models are often freely available and easy to deploy locally, but their performance on complex tasks remains limited.