RefereeBench: Are Video MLLMs Ready to be Multi-Sport Referees

arXiv:2604.15736v1 Announce Type: cross While Multimodal Large Language Models (MLLMs) excel at generic video understanding, their ability to perform specialized, rule-grounded decision-making remains insufficiently explored. In this paper, we