ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models

arXiv:2603.19466v1 Announce Type: new Effective collaboration begins with knowing when to ask for help. For example, when trying to identify an occluded object, a human would ask someone to remove the obstruction. Can MLLMs exhibit similar "proactive" behavior by requesting simple user interventions? To investigate this, we