Belief-Guided Inference Control for Large Language Model Services via Verifiable Observations

arXiv:2604.27536v1 Announce Type: new

In black-box large language model (LLM) services, response reliability is often only partially observable at decision time, while stronger inference pathways incur substantial computational cost. This induces a budgeted sequential decision problem: for each request, the system should decide whether the default low-cost response is sufficiently reliable or whether additional computation should be allocated to improve response quality.
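
To make the per-request decision concrete, here is a minimal sketch of a budgeted escalation rule. It is an illustration under stated assumptions, not the method of the paper: the names `Controller`, `decide`, and `run`, the fixed `threshold`, the unit `escalation_cost`, and the idea that a scalar belief in reliability is available for each request are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Hypothetical threshold controller (an assumption, not the paper's policy)."""
    budget: float            # remaining extra compute, in arbitrary units
    escalation_cost: float   # assumed cost of one call to the stronger pathway
    threshold: float = 0.7   # escalate when belief in reliability is below this

    def decide(self, belief_reliable: float) -> str:
        """Return "default" or "escalate" for a single request."""
        affordable = self.budget >= self.escalation_cost
        if belief_reliable < self.threshold and affordable:
            # Spend budget only when the default answer looks unreliable.
            self.budget -= self.escalation_cost
            return "escalate"
        return "default"


def run(beliefs, budget=2.0, escalation_cost=1.0, threshold=0.7):
    """Process requests in arrival order; earlier decisions consume the budget."""
    controller = Controller(budget, escalation_cost, threshold)
    return [controller.decide(b) for b in beliefs]


if __name__ == "__main__":
    print(run([0.9, 0.4, 0.6, 0.3]))
```

With a budget of two escalations, the beliefs `[0.9, 0.4, 0.6, 0.3]` yield `['default', 'escalate', 'escalate', 'default']`: the last request has the lowest belief but arrives after the budget is spent. This is the sequential coupling that a fixed threshold ignores and that a budget-aware policy would need to account for.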