AI RESEARCH

Multimodal Large Language Models as Image Classifiers

arXiv cs.CV

arXiv:2603.06578v1 Announce Type: new. The classification performance of Multimodal Large Language Models (MLLMs) depends critically on the evaluation protocol and on ground-truth quality. Studies comparing MLLMs with supervised and vision-language models report conflicting