Task-Aware Calibration: Provably Optimal Decoding in LLMs

arXiv:2605.10202v1

LLM decoding often relies on the model's predictive distribution to generate an output. Consequently, misalignment with respect to the true generating distribution leads to suboptimal decisions in practice. While a natural solution is to calibrate the model's output distribution, for LLMs, this is ill-posed at the combinatorially vast level of free-form language.