AI RESEARCH
Span-Level Machine Translation Meta-Evaluation
arXiv CS.AI
arXiv:2603.19921v1 Announce Type: cross Machine Translation (MT) and automatic MT evaluation have improved dramatically in recent years, enabling numerous novel applications. Automatic evaluation techniques have evolved from producing scalar quality scores to precisely locating translation errors and assigning them error categories and severity levels. However, it remains unclear how to reliably measure the evaluation capabilities of auto-evaluators that perform error detection, because the literature offers no established technique for doing so.