AI RESEARCH

MTRouter: Cost-Aware Multi-Turn LLM Routing with History-Model Joint Embeddings

arXiv CS.AI

arXiv:2604.23530v1 Announce Type: cross Multi-turn, long-horizon tasks are increasingly common for large language models (LLMs), but solving them typically requires many sequential model invocations, which accumulate substantial inference costs. Here, we study cost-aware multi-turn LLM routing: selecting which model to invoke at each turn from a model pool, given a fixed cost budget. We propose MTRouter, which encodes the interaction history and candidate models into joint history-model embeddings, and learns an outcome estimator from logged trajectories to predict turn-level model utility.
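
To make the turn-level routing decision concrete, here is a minimal Python sketch. It is not the MTRouter implementation: the abstract does not specify the encoder, the estimator, or the budget rule, so everything below is an assumption made for illustration. The joint embedding is approximated by concatenating a history vector with a model vector, the outcome estimator is a plain linear scorer with hand-set weights (the paper learns its estimator from logged trajectories), and the budget is enforced by filtering out models whose per-turn cost exceeds what remains. The names `Candidate`, `joint_embedding`, `predict_utility`, and `route_turn` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A model in the pool (hypothetical structure, not from the paper)."""
    name: str
    embedding: tuple  # assumed fixed-length model representation
    cost: float       # assumed cost of invoking this model for one turn


def joint_embedding(history_vec: Sequence[float], model_vec: Sequence[float]) -> list:
    # Assumption: the joint history-model embedding is a simple concatenation.
    return list(history_vec) + list(model_vec)


def predict_utility(weights: Sequence[float], features: Sequence[float]) -> float:
    # Stand-in for the learned outcome estimator: a linear score.
    return sum(w * x for w, x in zip(weights, features))


def route_turn(
    history_vec: Sequence[float],
    pool: Sequence[Candidate],
    weights: Sequence[float],
    remaining_budget: float,
) -> Optional[Candidate]:
    """Pick the affordable model with the highest predicted utility, or None."""
    affordable = [m for m in pool if m.cost <= remaining_budget]
    if not affordable:
        return None
    return max(
        affordable,
        key=lambda m: predict_utility(weights, joint_embedding(history_vec, m.embedding)),
    )


# Illustrative values only; none of these numbers come from the paper.
pool = [
    Candidate("small", (1.0, 0.0), cost=1.0),
    Candidate("large", (0.0, 1.0), cost=5.0),
]
weights = [0.1, 0.2, 0.3, 0.9]
history = [1.0, 0.5]

choice_rich = route_turn(history, pool, weights, remaining_budget=10.0)
choice_tight = route_turn(history, pool, weights, remaining_budget=2.0)
choice_broke = route_turn(history, pool, weights, remaining_budget=0.5)
```

With these toy values, the larger model is chosen when the budget allows it and the smaller one when it does not. Note that a linear scorer over a concatenation adds the same history term to every candidate, so the ranking here never depends on the history; an estimator that captures history-model interactions would be needed for the history to influence the choice.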