AI RESEARCH

OmniFusion: Simultaneous Multilingual Multimodal Translations via Modular Fusion

arXiv CS.AI

arXiv:2512.00234v2 Announce Type: replace-cross There has been significant progress in open-source text-only translation large language models (LLMs), with better language coverage and quality. However, these models can only be used in cascaded pipelines for speech translation (ST), which perform automatic speech recognition first and then translation. This