AI RESEARCH

The Impact of Vocabulary Overlaps on Knowledge Transfer in Multilingual Machine Translation

arXiv cs.CL

arXiv:2605.04196v1. Knowledge transfer, especially across related languages, has been found beneficial for multilingual neural machine translation (MNMT), but some aspects remain under-explored and deserve further investigation. A joint vocabulary is most often used to form a uniform word embedding space. Because the impact of a disjoint vocabulary on model performance is far less studied, there is no consensus on how much of the knowledge transfer is due to vocabulary overlap.
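
To make the joint versus disjoint distinction concrete, here is a minimal sketch in Python. It is not the paper's method: the Jaccard measure, the language-tag prefixing used to force disjointness, the helper names `vocab_overlap` and `make_disjoint`, and the toy subword vocabularies are all assumptions made for illustration.

```python
def vocab_overlap(vocab_a, vocab_b):
    """Jaccard overlap between two token vocabularies (assumed measure)."""
    a, b = set(vocab_a), set(vocab_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def make_disjoint(vocab, lang_tag):
    """Prefix every token with a language tag so no token is shared.

    Hypothetical way to simulate a disjoint vocabulary: identical surface
    forms in two languages become distinct entries, and would therefore
    get separate embeddings.
    """
    return {f"{lang_tag}:{tok}" for tok in vocab}


# Toy subword vocabularies, invented for this example.
vocab_es = {"▁la", "▁casa", "▁de", "ción", "▁no"}
vocab_pt = {"▁a", "▁casa", "▁de", "ção", "▁no"}

# Joint setting: shared surface forms map to the same vocabulary entry.
joint_overlap = vocab_overlap(vocab_es, vocab_pt)

# Disjoint setting: the same tokens are kept apart per language.
disjoint_overlap = vocab_overlap(
    make_disjoint(vocab_es, "es"), make_disjoint(vocab_pt, "pt")
)

print(f"joint overlap:    {joint_overlap:.3f}")
print(f"disjoint overlap: {disjoint_overlap:.3f}")
```

In the joint setting the three shared tokens would share embeddings across both languages, while in the disjoint setting nothing is shared at the vocabulary level, so any remaining transfer would have to come from other shared model parameters.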