AI RESEARCH

Abjad-Kids: An Arabic Speech Classification Dataset for Primary Education

arXiv cs.LG

arXiv:2603.20255v1 Announce Type: cross

Speech-based AI educational applications have gained significant interest in recent years, particularly for children. However, research on children's speech remains limited due to the lack of publicly available datasets, especially for low-resource languages such as Arabic. This paper presents Abjad-Kids, an Arabic speech dataset designed for kindergarten and primary education, focusing on fundamental learning of the alphabet, numbers, and colors. The dataset consists of 46,397 audio samples collected from children aged 3 to 12 years, covering 141 classes.