ICDAR 2025 Competition on End-to-End Document Image Machine Translation Towards Complex Layouts

arXiv:2603.09392v1

Document Image Machine Translation (DIMT) seeks to translate text embedded in document images from one language to another by jointly modeling textual content and page layout, bridging optical character recognition (OCR) and natural language processing (NLP). The DIMT 2025 Challenge advances research on end-to-end document image translation, a rapidly evolving area within multimodal document understanding.