DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube

arXiv:2603.22977v1

Dari, the primary language of Afghanistan, is spoken by tens of millions of people yet remains largely absent from the misinformation detection literature. We address this gap with DariMis, the first manually annotated dataset of 9,224 Dari-language YouTube videos, labeled along two dimensions: Information Type (Misinformation, Partly True, True) and Harm Level (Low, Medium, High