AI RESEARCH

Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation

arXiv CS.CL

arXiv:2603.15227v1 (Announce Type: new)

Machine Translation (MT) evaluation has moved beyond metrics towards specific linguistic phenomena. In the English-Chinese language pair, passive sentences are constructed and distributed differently in the two languages, and thus need special attention in MT. This paper proposes a bidirectional multi-domain dataset of passive sentences, extracted from five Chinese-English parallel corpora and annotated automatically with structure labels according to the human translation, together with a test set whose annotations are manually verified.