Automatic Correction of Writing Anomalies in Hausa Texts
arXiv CS.CL
arXiv:2506.03820v2

Hausa texts often contain writing anomalies, such as incorrect character substitutions and spacing errors, which sometimes hinder natural language processing (NLP) applications. This paper presents an approach to automatically correct these anomalies by fine-tuning transformer-based models. Using a corpus gathered from several public sources, we create a large-scale parallel dataset of over 400,000 noisy-clean Hausa sentence pairs by ...
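
The text above is cut off before it says how the noisy side of each pair is produced. As an illustration only, here is a minimal Python sketch of one plausible way to build noisy-clean pairs, assuming the noise is injected synthetically into clean sentences. It covers the two anomaly types named above: character substitutions (here, the hooked Hausa letters ɓ, ɗ, ƙ, ƴ typed as plain b, d, k, y) and spacing errors (here, dropped spaces). The function names `add_noise` and `make_pairs` and the probabilities `p_char` and `p_space` are hypothetical and are not taken from the paper.

```python
import random

# Hooked Hausa letters and the plain letters often typed in their place.
# This mapping is an illustrative assumption, not the paper's noise model.
HOOKED_TO_PLAIN = {
    "ɓ": "b", "ɗ": "d", "ƙ": "k", "ƴ": "y",
    "Ɓ": "B", "Ɗ": "D", "Ƙ": "K", "Ƴ": "Y",
}


def add_noise(clean, rng, p_char=0.5, p_space=0.2):
    """Return a noisy copy of `clean`.

    Each hooked letter is replaced by its plain counterpart with
    probability p_char; each space is dropped (merging two words)
    with probability p_space.
    """
    out = []
    for ch in clean:
        if ch in HOOKED_TO_PLAIN and rng.random() < p_char:
            out.append(HOOKED_TO_PLAIN[ch])  # character substitution
        elif ch == " " and rng.random() < p_space:
            continue  # spacing error: drop the space
        else:
            out.append(ch)
    return "".join(out)


def make_pairs(sentences, seed=0):
    """Build (noisy, clean) pairs with a seeded RNG for reproducibility."""
    rng = random.Random(seed)
    return [(add_noise(s, rng), s) for s in sentences]
```

With `p_char=1.0` and `p_space=0.0`, `add_noise` turns "ɗan ƙasa" into "dan kasa". Pairs of this shape could then serve as source and target text when fine-tuning a sequence-to-sequence transformer.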