Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference

arXiv:2604.13634v1 Announce Type: cross Speculative decoding accelerates autoregressive generation by letting draft tokens bypass full verification, but conventional frameworks suffer from frequent false rejections, particularly when draft models produce semantically correct but lexically divergent outputs. In this paper, we present Calibrated Speculative Decoding (CSD), a