Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants

arXiv:2510.24328v2 Announce Type: replace-cross Large Language Models (LLMs) are increasingly used to answer everyday questions, yet their performance on culturally grounded and dialectal content remains uneven across languages.