AI RESEARCH

Machine learning and digital pragmatics: Which word category influences emoji use most?

arXiv CS.LG

arXiv:2604.21108v1 Announce Type: cross This study investigates the use of machine learning (ML) to predict emojis in Arabic tweets, employing the state-of-the-art MARBERT model. A corpus of 11379 CA tweets representing multiple Arabic colloquial dialects was collected from X.com via Python. The net dataset comprises 8695 tweets, which were used for the analysis. These tweets were then classified into 14 categories, which were numerically encoded and used as labels.
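The numerical encoding of categories into labels can be illustrated with a minimal Python sketch. The abstract does not name the 14 categories or describe the encoding scheme, so the category names below are hypothetical placeholders, and the sorted-order integer mapping is an assumption rather than the authors' method.

```python
def build_label_map(categories):
    """Map each distinct category name to an integer id, in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(set(categories)))}


def encode_labels(categories, label_map):
    """Replace each category name with its integer id."""
    return [label_map[name] for name in categories]


# Hypothetical category names, one per tweet; the study's 14 categories
# are not listed in the abstract.
tweet_categories = ["joy", "sadness", "joy", "love"]

label_map = build_label_map(tweet_categories)
labels = encode_labels(tweet_categories, label_map)

print(label_map)
print(labels)
```

Sorting the category names before assigning ids keeps the mapping stable across runs, so the same category always receives the same label regardless of the order in which tweets are read.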