AI RESEARCH
Neural Recovery of Historical Lexical Structure in Bantu Languages from Modern Data
arXiv cs.LG
arXiv:2604.22730v1 (announce type: new). We investigate whether neural models trained exclusively on modern morphological data can recover cross-lingual lexical structure consistent with historical reconstruction. Using BantuMorph v7, a transformer over Bantu morphological paradigms, we analyze 14 Eastern and Southern Bantu languages, extract encoder embeddings for their noun and verb lemmas, and identify 728 noun and 1,525 verb cognate candidates shared across 5+ languages.
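The abstract does not say how candidates are grouped, so the following is only a minimal sketch of the final filtering step. It assumes lemmas have already been assigned to candidate groups by some upstream procedure (for example, clustering of encoder embeddings). The function name, the `(language, lemma, cluster_id)` input format, and the toy data are all hypothetical illustrations, not BantuMorph's API or the authors' data.

```python
from collections import defaultdict

# Threshold taken from the abstract: candidates shared across five or more languages.
MIN_LANGUAGES = 5


def candidates_shared_across(assignments, min_languages=MIN_LANGUAGES):
    """Keep candidate groups whose members span enough distinct languages.

    `assignments` is an iterable of (language, lemma, cluster_id) triples.
    How cluster_id is produced is an assumption left outside this sketch.
    """
    members = defaultdict(list)
    for language, lemma, cluster_id in assignments:
        members[cluster_id].append((language, lemma))

    kept = {}
    for cluster_id, items in members.items():
        # Count distinct languages, not lemmas: two lemmas from the
        # same language count only once toward the threshold.
        languages = {language for language, _ in items}
        if len(languages) >= min_languages:
            kept[cluster_id] = sorted(items)
    return kept


# Placeholder data only: language and lemma names are invented labels.
toy = [
    # Group 0: five distinct languages, so it passes.
    ("lang_a", "lemma_a0", 0), ("lang_b", "lemma_b0", 0),
    ("lang_c", "lemma_c0", 0), ("lang_d", "lemma_d0", 0),
    ("lang_e", "lemma_e0", 0),
    # Group 1: three languages, so it is dropped.
    ("lang_a", "lemma_a1", 1), ("lang_b", "lemma_b1", 1),
    ("lang_c", "lemma_c1", 1),
    # Group 2: five lemmas but only four distinct languages, so it is dropped.
    ("lang_a", "lemma_a2", 2), ("lang_a", "lemma_a2x", 2),
    ("lang_b", "lemma_b2", 2), ("lang_c", "lemma_c2", 2),
    ("lang_d", "lemma_d2", 2),
]

result = candidates_shared_across(toy)
print(sorted(result))
```

Counting distinct languages rather than group size is the design point here: a group with many lemmas from one language should not qualify as cross-lingually shared.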