AI RESEARCH

DIP: Efficient Large Multimodal Model Training with Dynamic Interleaved Pipeline

arXiv CS.AI

arXiv:2504.14145v2 Announce Type: replace-cross Large multimodal models (LMMs) have demonstrated excellent capabilities in both understanding and generation tasks with various modalities. While these models can accept flexible combinations of input data, their