Hypothesis generation and updating in large language models

ArXi:2605.05851v1 Announce Type: new Large language models (LLMs) increasingly help people solve problems, from debugging code to repairing machinery. This process requires generating plausible hypotheses from partial descriptions, then updating them as information arrives. Yet how LLMs perform this form of inference, and how close it is to optimal, remains unclear. We study this question in the number game, a controlled setting in which a learner infers the hypothesis ed by a few positive integers, such as $\{16, 8, 2, 64\}$: a rule like powers of 2 or an interval like numbers near 20.