Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Translation

arXiv:2511.17290v2

In this paper, we present a localized and culturally adapted Estonian translation of the test set from the widely used commonsense reasoning benchmark, WinoGrande. We detail the translation and adaptation process carried out by translation specialists and evaluate the performance of both