AI RESEARCH
Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades
arXiv cs.LG
arXiv:2605.06350v1 Announce Type: new Model cascades, in which a cheap LLM defers to an expensive one on low-confidence queries, are widely used to navigate the cost-quality tradeoff at deployment. Existing approaches largely treat the deferral threshold as an empirical hyperparameter and offer limited guidance on the geometry of the resulting cost-quality frontier over a model pool. We develop a decision-theoretic framework grounded in constrained optimization and duality.
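To make the threshold-based deferral described above concrete, here is a minimal sketch of a two-model cascade evaluated at a single deferral threshold. It is an illustration only, not the framework developed in the paper: the `Query` record, the `evaluate_cascade` function, and the per-call costs (1.0 and 10.0) are hypothetical, and the sketch assumes the cheap model always runs and that its confidence score is available for each query.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Query:
    """Hypothetical per-query record used only for this illustration."""
    confidence: float        # cheap model's confidence in its own answer
    cheap_correct: bool      # whether the cheap model answers correctly
    expensive_correct: bool  # whether the expensive model answers correctly


def evaluate_cascade(
    queries: List[Query],
    threshold: float,
    cheap_cost: float = 1.0,       # assumed cost per cheap-model call
    expensive_cost: float = 10.0,  # assumed cost per expensive-model call
) -> Tuple[float, float]:
    """Return (average cost, accuracy) of the cascade at one threshold.

    A query is deferred to the expensive model when the cheap model's
    confidence is strictly below the threshold.
    """
    if not queries:
        raise ValueError("queries must be non-empty")

    total_cost = 0.0
    n_correct = 0
    for q in queries:
        total_cost += cheap_cost  # the cheap model always runs first
        if q.confidence < threshold:
            total_cost += expensive_cost  # escalate low-confidence queries
            n_correct += int(q.expensive_correct)
        else:
            n_correct += int(q.cheap_correct)

    n = len(queries)
    return total_cost / n, n_correct / n


def sweep_thresholds(
    queries: List[Query], thresholds: List[float]
) -> List[Tuple[float, float, float]]:
    """Trace (threshold, average cost, accuracy) points across thresholds."""
    return [(t, *evaluate_cascade(queries, t)) for t in thresholds]
```

Sweeping the threshold from 0 upward with `sweep_thresholds` traces one empirical cost-quality curve for a single pair of models. This is the "threshold as hyperparameter" view that the abstract says existing approaches take.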