AI RESEARCH

MM-LIMA: Less Is More for Alignment in Multi-Modal Datasets

arXiv CS.AI

arXiv:2308.12067v3 Announce Type: replace-cross Multimodal large language models are typically trained in two stages: first pre-