Better and Worse with Scale: How Contextual Entrainment Diverges with Model Size

arXiv:2604.13275v1 Announce Type: cross Larger language models become simultaneously better and worse at handling contextual information: better at ignoring false claims, worse at ignoring irrelevant tokens. We formalize this apparent paradox through the first scaling laws for contextual entrainment, the tendency of models to favor tokens that appeared in context regardless of relevance.