When the Gold Standard Isn't Necessarily Standard: Challenges of Evaluating the Translation of User-Generated Content

arXiv:2512.17738v2

User-generated content (UGC) is characterised by frequent use of non-standard language, from spelling errors to expressive choices such as slang, character repetitions, and emojis. This makes evaluating UGC translation challenging: what counts as a "good" translation depends on how standard the output is expected to be. To explore this, we examine the human translation guidelines of four UGC datasets and derive a taxonomy of twelve non-standard phenomena and five translation actions.