The Truth Lies Somewhere in the Middle (of the Generated Tokens)

ArXi:2605.09969v1 Announce Type: new How should hidden states generated autoregressively be collapsed into a representation that reflects a language model's internal state? Despite tokens being generated under causal masking, we find that mean pooling across their hidden states yields semantic representations than any individual token alone. We quantify this through kernel alignment to reference spaces in language, vision, and protein domains.