AI RESEARCH

Emergent Hierarchical Structure in Large Language Models: An Information-Theoretic Framework for Multi-Scale Representation

arXiv CS.AI

arXiv:2505.18244v3 Announce Type: replace-cross Why do language models from different architecture families respond so differently to the same perturbation? We argue that the answer is not scale, but \emph{how architecture shapes information compression