AI RESEARCH

Accurate and Efficient Statistical Testing for Word Semantic Breadth

arXiv CS.CL

arXiv:2605.08048v1 Announce Type: new Measuring the breadth of a word's meaning, that is, its spread across contexts, has become feasible with contextualized token embeddings. A word type can be represented as a cloud of token vectors, with dispersion-based statistics serving as proxies for contextual diversity (Nagata and Tanaka-Ishii, ACL 2025). These measurements are useful for deciding appropriate sense distinctions when constructing thesauri and domain-specific dictionaries.
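
To make the "cloud of token vectors" idea concrete, the sketch below computes one simple dispersion statistic: the mean squared Euclidean distance of a word's token vectors from their centroid. This is an illustrative assumption, not necessarily the statistic used in the paper or in Nagata and Tanaka-Ishii. The function names (`centroid`, `dispersion`, `synthetic_cloud`) are hypothetical, and the vectors are synthetic stand-ins for real contextualized embeddings.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a collection of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(dim)]


def dispersion(vectors: Sequence[Vector]) -> float:
    """Mean squared Euclidean distance of the vectors from their centroid.

    Assumption: this stands in for a "dispersion-based statistic"; the
    paper may define breadth differently.
    """
    if len(vectors) < 2:
        raise ValueError("need at least two token vectors")
    c = centroid(vectors)
    total = sum(sum((x - m) ** 2 for x, m in zip(v, c)) for v in vectors)
    return total / len(vectors)


def synthetic_cloud(n: int, dim: int, scale: float, seed: int) -> List[List[float]]:
    """Hypothetical token cloud: n Gaussian vectors with the given spread."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n)]


# Synthetic stand-ins: a word used in similar contexts (tight cloud)
# versus a word used in varied contexts (wide cloud).
narrow = synthetic_cloud(n=200, dim=16, scale=0.1, seed=0)
broad = synthetic_cloud(n=200, dim=16, scale=1.0, seed=1)

print(f"narrow: {dispersion(narrow):.3f}")
print(f"broad:  {dispersion(broad):.3f}")
```

With real data, each row would be the embedding of one occurrence of the word, and a larger value would suggest that the word appears in more varied contexts. Deciding whether a difference between two words is statistically meaningful requires a test on top of such a statistic, which this sketch does not attempt.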