UniVer: A Unified Perspective for Multi-step and Multi-draft Speculative Decoding

arXiv:2605.04543v1 Announce Type: cross Speculative decoding accelerates Large Language Model inference through a draft-then-verify scheme, in which verification can be framed as an Optimal Transport (OT) problem. Existing approaches typically handle the multi-draft and multi-step aspects in isolation, applying either flat OT to single-step drafts or per-token rejection sampling to tree-structured candidates.
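
For orientation, the per-token rejection sampling mentioned above can be sketched for the simplest case of one draft token at one step. This is the standard single-draft acceptance rule from the speculative decoding literature, not UniVer's method, and it does not cover the multi-draft or tree-structured settings. The function names `verify_token` and `acceptance_rate` are hypothetical, and `p` and `q` are assumed to be plain probability lists for the target and draft models over the same vocabulary.

```python
import random


def verify_token(p, q, draft_token, rng=random):
    """Verify one draft token with the standard rejection rule.

    p: target-model probabilities, q: draft-model probabilities,
    draft_token: index sampled from q (so q[draft_token] > 0).
    Returns (token, accepted). The returned token is distributed as p.
    """
    # Accept the draft with probability min(1, p(x) / q(x)).
    accept_prob = min(1.0, p[draft_token] / q[draft_token])
    if rng.random() < accept_prob:
        return draft_token, True
    # On rejection, resample from the residual max(p - q, 0).
    # rng.choices normalizes the weights; the residual has positive mass
    # whenever a rejection can occur, because that requires p != q.
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    token = rng.choices(range(len(p)), weights=residual, k=1)[0]
    return token, False


def acceptance_rate(p, q):
    """Expected acceptance probability: sum_x min(p(x), q(x))."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))
```

The acceptance rate equals one minus the total variation distance between `p` and `q`, which is the largest probability with which any coupling of the two distributions can make the draft and target tokens agree. That is the sense in which single-draft verification is an optimal transport problem.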