What makes a word hard to learn? Modeling L1 influence on English vocabulary difficulty
arXiv CS.LG
arXiv:2605.12281v1 Announce Type: cross What makes a word difficult to learn, and how does the difficulty depend on the learner's native language? We computationally model vocabulary difficulty for English learners whose first language is Spanish, German, or Chinese with gradient-boosted models trained on features related to a word's familiarity (e.g., frequency), meaning, surface form, and cross-linguistic transfer. Using Shapley values, we determine the importance of each feature group. Word familiarity is the dominant feature group shared by all three languages.
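
To illustrate how Shapley values can attribute a model's output to feature groups, here is a minimal sketch in Python. It is not the paper's method or data: the gradient-boosted model is replaced by a hand-written toy function (`toy_score`), the group names are borrowed from the abstract, and every number is a hypothetical chosen for illustration. The sketch computes exact Shapley values by enumerating all coalitions of the four groups.

```python
from itertools import combinations
from math import factorial

# Feature groups named after those in the abstract; treating each group
# as a single "player" is an assumption made for this illustration.
GROUPS = ["familiarity", "meaning", "surface_form", "transfer"]


def toy_score(present):
    """Hypothetical stand-in for a trained model: the output when only the
    feature groups in `present` are available. All weights are invented."""
    score = 0.0
    if "familiarity" in present:
        score += 4.0
    if "meaning" in present:
        score += 1.0
    if "surface_form" in present:
        score += 1.0
    if "transfer" in present:
        score += 2.0
    # Invented interaction: surface form and transfer reinforce each other.
    if "surface_form" in present and "transfer" in present:
        score += 1.0
    return score


def shapley_values(players, value):
    """Exact Shapley values: for each player, the weighted average of its
    marginal contribution over all coalitions of the other players."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {p}) - value(coalition))
        result[p] = total
    return result


phi = shapley_values(GROUPS, toy_score)
for group, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {contribution:.2f}")
```

In this toy setup the interaction term is split evenly between the two groups involved, and the contributions sum to the difference between the full-model output and the empty-coalition output. Exact enumeration scales exponentially with the number of players, so real feature sets normally call for an approximation or a tree-specific algorithm.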