AI RESEARCH

MINOS: A Multimodal Evaluation Model for Bidirectional Generation Between Image and Text

arXiv CS.AI

arXiv:2506.02494v2 Announce Type: replace-cross

Evaluation is important for multimodal generation tasks, but traditional multimodal evaluation metrics suffer from several limitations. With the rapid progress of multimodal large language models (MLLMs), there is growing interest in applying them to build general evaluation systems. However, existing research often simply collects large-scale evaluation data for