AI RESEARCH

Attention Meets Reachability: Structural Equivalence and Efficiency in Grammar-Constrained LLM Decoding

arXiv CS.LG

ArXi:2603.05540v1 Announce Type: cross We study grammar-constrained decoding (GCD) as a coupling between an autoregressive next-token distribution and a reachability oracle over a pushdown system compiled from a context-free grammar (CFG). We prove an oracle invariance theorem: language-equivalent grammars induce identical admissible next-token sets for every prefix, hence identical logit masks, yet can yield provably different compiled state spaces and online ambiguity costs.