Likelihood hacking in probabilistic program synthesis

arXiv:2603.24126v1. When language models are trained by reinforcement learning (RL) to write probabilistic programs, they can artificially inflate their marginal-likelihood reward by producing programs whose data distribution fails to normalise, rather than by fitting the data better. We call this failure likelihood hacking (LH