AI RESEARCH
Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus
arXiv CS.CL
arXiv:2605.00618v1 Announce Type: new We investigate the extent to which cosine similarity between paragraph embeddings is invariant under machine translation, using the Manifesto Corpus of over 2,800 political party platforms in 28 languages, translated to English via the EU eTranslation service. Rather than measuring translation-induced semantic shift directly, we measure the stability of pairwise similarity relationships across embedding models, and use inter-model disagreement on original-language text as a calibrated invariance threshold.
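The comparison described above can be illustrated with a minimal sketch. The abstract does not say which agreement statistic is used, so this example assumes one minus the Spearman rank correlation between pairwise cosine similarities. The function names, the synthetic embeddings, and the noise level are all hypothetical and stand in for real model outputs.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of an (n, d) array."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T


def upper_triangle(matrix):
    """Flatten the strictly upper triangle: one value per unordered pair."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ordinal_ranks(values):
    """Ordinal ranks; ties are not averaged (assumed rare for real-valued similarities)."""
    return np.argsort(np.argsort(values))


def similarity_disagreement(emb_a, emb_b):
    """Assumed metric: 1 - Spearman correlation of pairwise similarities.

    emb_a and emb_b embed the same paragraphs in the same order; their
    dimensions may differ because only the similarity structure is compared.
    """
    sims_a = upper_triangle(cosine_similarity_matrix(emb_a))
    sims_b = upper_triangle(cosine_similarity_matrix(emb_b))
    rho = np.corrcoef(ordinal_ranks(sims_a), ordinal_ranks(sims_b))[0, 1]
    return 1.0 - rho


def is_invariant(orig_model_a, orig_model_b, translated_model_a):
    """Compare translation-induced shift against inter-model disagreement."""
    # Threshold: how much two models disagree on the original-language text.
    threshold = similarity_disagreement(orig_model_a, orig_model_b)
    # Shift: how much one model's similarities change after translation.
    shift = similarity_disagreement(orig_model_a, translated_model_a)
    return bool(shift <= threshold), shift, threshold


# Synthetic stand-ins for real paragraph embeddings (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 16))               # shared "meaning" of 50 paragraphs
orig_a = latent @ rng.normal(size=(16, 32))      # model A, original language
orig_b = latent @ rng.normal(size=(16, 24))      # model B, original language
trans_a = orig_a + 0.01 * rng.normal(size=orig_a.shape)  # model A, translated text

invariant, shift, threshold = is_invariant(orig_a, orig_b, trans_a)
print(f"shift={shift:.4f} threshold={threshold:.4f} invariant={invariant}")
```

Under this reading, translation counts as similarity-preserving when the shift it induces is no larger than the disagreement already present between two models on untranslated text. Any other rank or correlation statistic could be substituted in `similarity_disagreement` without changing the structure of the comparison.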