ML-Embed: Inclusive and Efficient Embeddings for a Multilingual World

arXiv cs.CL

arXiv:2605.15081v1 Announce Type: new

The development of high-quality text embeddings is increasingly drifting toward an exclusionary future, defined by three critical barriers: prohibitive computational costs, a narrow linguistic focus that neglects most of the world's languages, and a lack of transparency from closed-source or open-weight models that stifles research. To dismantle these barriers, we