What can LLMs tell us about the mechanisms behind polarity illusions in humans? Experiments across model scales and training steps

arXiv:2603.27855v1 Announce Type: new I use the Pythia scaling suite (Biderman et al., 2023) to investigate whether and how two well-known polarity illusions, the negative polarity item (NPI) illusion and the depth charge illusion, arise in LLMs. The NPI illusion becomes weaker and ultimately disappears as model size increases, while the depth charge illusion becomes stronger in larger models.
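
As a rough illustration of what an experiment "across model scales and training steps" involves, the sketch below enumerates the publicly released Pythia sizes and checkpoint revisions on the Hugging Face Hub and scores a sentence under one of them. The helper names `pythia_grid` and `sentence_logprob` are hypothetical, and using summed log-probability as the score is an assumption made for illustration; the abstract does not state which measure the paper uses.

```python
from itertools import product

# Model sizes and checkpoint steps as released for the Pythia suite:
# step 0, ten log-spaced early steps (1..512), then every 1000 steps to 143000.
PYTHIA_SIZES = ["70m", "160m", "410m", "1b", "1.4b", "2.8b", "6.9b", "12b"]
PYTHIA_STEPS = [0] + [2**i for i in range(10)] + list(range(1000, 144000, 1000))


def pythia_grid(sizes=PYTHIA_SIZES, steps=PYTHIA_STEPS):
    """Return (model_id, revision) pairs for every size/step combination."""
    return [(f"EleutherAI/pythia-{s}", f"step{t}") for s, t in product(sizes, steps)]


def sentence_logprob(sentence, model_id, revision):
    """Summed log-probability of a sentence under one checkpoint.

    Assumed scoring measure, not necessarily the paper's.
    Requires torch and transformers; downloads weights on first use.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)
    model.eval()

    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits

    # Position i predicts token i+1, so drop the last position's logits
    # and the first token (which has no preceding context to predict it).
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    return logprobs.gather(1, targets.unsqueeze(1)).sum().item()
```

Comparing such scores for an illusion sentence and a matched control, at each entry of `pythia_grid()`, would be one way to trace how an effect changes with size and training; the choice of items and controls is left open here.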