Tracing the complexity profiles of different linguistic phenomena through the intrinsic dimension of LLM representations

arXiv:2601.03779v2

We explore the intrinsic dimension (ID) of LLM representations as a marker of linguistic complexity. Specifically, we test whether ID differences across model layers reflect well-known complexity contrasts established in (psycho)linguistics: coordination vs. subordination, right-branching vs. center-embedding, and unambiguous vs. ambiguous attachment. Our results on six different LLMs show that these contrasts are consistently reflected in ID differences, with the more complex phenomena eliciting higher ID profiles.
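To make the notion of intrinsic dimension concrete, here is a minimal sketch of one common estimator, a maximum-likelihood variant of TwoNN, which uses the ratio of each point's second to first nearest-neighbor distance. The abstract does not say which estimator the authors use, so this choice is an assumption. The function names `twonn_id` and `toy_cloud` are hypothetical, and the synthetic point clouds merely stand in for per-layer sentence representations.

```python
import math
import random


def twonn_id(points):
    """Estimate intrinsic dimension from nearest-neighbor distance ratios.

    Assumption: a maximum-likelihood variant of the TwoNN estimator,
    d = N / sum(log(r2 / r1)), where r1 and r2 are the distances from
    each point to its first and second nearest neighbors.
    """
    ratios = []
    for i, p in enumerate(points):
        r1 = r2 = math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < r1:
                r1, r2 = d, r1
            elif d < r2:
                r2 = d
        # Skip exact duplicates (r1 == 0), which make the ratio undefined.
        if r1 > 0 and math.isfinite(r2):
            ratios.append(r2 / r1)
    return len(ratios) / sum(math.log(mu) for mu in ratios)


def toy_cloud(n, latent_dim, ambient_dim, rng):
    """Hypothetical stand-in for layer activations: Gaussian latents of
    dimension `latent_dim`, linearly mixed into `ambient_dim` coordinates."""
    mix = [[rng.gauss(0, 1) for _ in range(latent_dim)] for _ in range(ambient_dim)]
    cloud = []
    for _ in range(n):
        z = [rng.gauss(0, 1) for _ in range(latent_dim)]
        cloud.append([sum(w * zk for w, zk in zip(row, z)) for row in mix])
    return cloud


rng = random.Random(0)
# Two clouds in the same 20-dimensional ambient space, differing only in ID.
id_low = twonn_id(toy_cloud(400, 2, 20, rng))
id_high = twonn_id(toy_cloud(400, 5, 20, rng))
print(f"low-ID cloud: {id_low:.2f}, high-ID cloud: {id_high:.2f}")
```

In a setting like the one described above, one would presumably run such an estimator on the representations of each sentence set at every layer and compare the resulting per-layer ID profiles between the two members of a contrast.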