Are vision-language models ready to zero-shot replace supervised classification models in agriculture?

arXiv:2512.15977v2 Announce Type: replace Vision-language models (VLMs) are increasingly proposed as general-purpose solutions for visual recognition tasks, yet their reliability for agricultural decision-making remains poorly understood. We benchmark a diverse set of open-source and closed-source VLMs on 27 agricultural image classification datasets from the AgML collection, spanning 162 classes and 248,000 images across plant disease, pest and damage, and plant and weed species identification.