AI RESEARCH
Multimodal Large Language Models as Image Classifiers
arXiv cs.CV
arXiv:2603.06578v1 Announce Type: new The classification performance of Multimodal Large Language Models (MLLMs) depends critically on evaluation protocol and ground truth quality. Studies comparing MLLMs with supervised and vision-language models report conflicting