AI RESEARCH

Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages

arXiv cs.CL

arXiv:2603.23654v1. We present Ethio-ASR, a suite of multilingual CTC-based automatic speech recognition (ASR) models jointly trained on five Ethiopian languages: Amharic, Tigrinya, Oromo, Sidaama, and Wolaytta. These languages belong to the Semitic, Cushitic, and Omotic branches of the Afroasiatic family, and remain severely underrepresented in speech technology despite being spoken by the vast majority of Ethiopia's population.
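For readers unfamiliar with CTC, the following is a minimal sketch of greedy CTC decoding, the standard step that turns per-frame predictions into an output token sequence. It is not taken from the Ethio-ASR paper: the blank index, the function name `ctc_greedy_decode`, and the example frame sequence are all assumptions chosen for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_ids, blank=BLANK):
    """Collapse consecutive repeated labels, then drop blank labels.

    frame_ids: per-frame argmax label indices from a CTC model.
    Returns the decoded label sequence as a list of ints.
    """
    # Step 1: merge runs of identical labels into a single label.
    collapsed = [label for label, _ in groupby(frame_ids)]
    # Step 2: remove the blank symbol, which only separates repeats.
    return [label for label in collapsed if label != blank]


# Hypothetical per-frame predictions; a blank between the two 3s keeps them distinct.
frames = [0, 3, 3, 0, 3, 5, 5, 0, 0, 2]
decoded = ctc_greedy_decode(frames)
print(decoded)  # [3, 3, 5, 2]
```

In a multilingual setup the label indices would map to a vocabulary shared across languages, but how Ethio-ASR builds that vocabulary or predicts the language is not stated in this abstract.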