[R] I think I found something about embeddings. Polysemy doesn't predict variance, frequency does. Calling it Contextual Promiscuity Index.

I was working on word-sense disambiguation research at home and noticed something. I'm posting to find out if this is already known or actually interesting. The assumption I started with is that polysemous words have messy embeddings: more dictionary senses, so more geometric fragmentation. Seems obvious, but no. I measured mean pairwise cosine similarity of contextual embeddings for 192 words using Qwen2.5-7B, extracting at layer 10 (chosen via a layer sweep). Correlation between WordNet sense count and embedding variance: Spearman rho = -0.057, p = 0.43. Basically nothing.
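For anyone who wants to sanity-check this, here's a rough sketch of the two computations, not my exact script. It assumes you already have, for each word, a matrix of contextual embeddings (one row per context) pulled from the model. The function names are made up for illustration, and how you turn mean similarity into a "variance" score (e.g. lower similarity = more spread) is a choice you'd have to make yourself. It only gives rho; if you have SciPy, `scipy.stats.spearmanr` returns rho and the p-value together.

```python
import numpy as np

def mean_pairwise_cosine(embeddings):
    """Mean cosine similarity over all distinct pairs of rows.

    `embeddings` is a hypothetical (n_contexts, hidden_dim) array holding
    one word's contextual vectors.
    """
    X = np.asarray(embeddings, dtype=np.float64)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two contexts per word")
    # Unit-normalise rows so the dot product equals cosine similarity.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = X @ X.T
    # Keep only the strict upper triangle: each pair once, no self-pairs.
    upper = np.triu_indices(n, k=1)
    return float(sims[upper].mean())

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    v = np.asarray(values, dtype=np.float64)
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v), dtype=np.float64)
    ranks[order] = np.arange(1, len(v) + 1)
    for val in np.unique(v):
        mask = v == val
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])
```

Usage would be something like: compute `mean_pairwise_cosine` once per word, collect those scores in a list aligned with the WordNet sense counts, then call `spearman_rho(sense_counts, scores)`. The tie handling matters here because lots of words share the same sense count.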