Correspondence Analysis and PMI-Based Word Embeddings: A Comparative Study
arXiv cs.CL
arXiv:2405.20895v3 Announce Type: replace
Popular word embedding methods such as GloVe and Word2Vec are related to the factorization of the pointwise mutual information (PMI) matrix. In this paper, we establish a formal connection between correspondence analysis (CA) and PMI-based word embedding methods. CA is a dimensionality reduction method based on singular value decomposition (SVD), and we show that CA is mathematically close to a weighted factorization of the PMI matrix. We further.
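To make the claimed closeness concrete, the following is a minimal sketch rather than the paper's own derivation or code. It assumes a small, strictly positive co-occurrence table (the `counts` array is made up for illustration) and uses hypothetical helper names (`pmi_matrix`, `ca_residuals`, `ca_factors`). CA takes the SVD of the standardized residuals (p_ij - r_i c_j) / sqrt(r_i c_j), where p_ij are joint proportions and r_i, c_j are row and column masses. Since log(x) is approximately x - 1 for x near 1, these residuals are approximately sqrt(r_i c_j) * PMI_ij when the table is close to independence, which is one way to read "weighted factorization of the PMI matrix".

```python
import numpy as np

def _proportions(counts):
    """Return joint proportions P and row/column masses r, c."""
    P = counts / counts.sum()
    r = P.sum(axis=1, keepdims=True)   # row masses, shape (I, 1)
    c = P.sum(axis=0, keepdims=True)   # column masses, shape (1, J)
    return P, r, c

def pmi_matrix(counts):
    """PMI_ij = log(p_ij / (r_i c_j)); assumes all counts are > 0."""
    P, r, c = _proportions(counts)
    return np.log(P / (r * c))

def ca_residuals(counts):
    """Standardized residuals whose SVD defines CA."""
    P, r, c = _proportions(counts)
    return (P - r * c) / np.sqrt(r * c)

def ca_factors(counts, k):
    """Rank-k left factor U_k * s_k from the SVD of the residuals."""
    U, s, _ = np.linalg.svd(ca_residuals(counts), full_matrices=False)
    return U[:, :k] * s[:k]

# Hypothetical toy co-occurrence counts, chosen to be near independence.
counts = np.array([[20., 18., 22.],
                   [19., 21., 20.],
                   [21., 20., 19.]])

_, r, c = _proportions(counts)
S = ca_residuals(counts)
weighted_pmi = np.sqrt(r * c) * pmi_matrix(counts)

# Largest entrywise gap between the CA matrix and the weighted PMI matrix.
max_gap = np.abs(S - weighted_pmi).max()
factors = ca_factors(counts, k=2)
```

The approximation degrades as the table moves away from independence, and real co-occurrence matrices contain zeros for which PMI is undefined, so this sketch only illustrates the first-order relationship.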