AI RESEARCH
ObjChangeVR: Object State Change Reasoning from Continuous Egocentric Views in VR Environments
arXiv cs.CV
arXiv:2603.06648v1 Announce Type: new Recent advances in multimodal large language models (MLLMs) offer a promising approach to natural language-based scene change queries in virtual reality (VR). Prior work applying MLLMs to object state understanding has focused on egocentric videos that capture the camera wearer's interactions with objects. However, object state changes may also occur in the background without direct user interaction. Such changes lack explicit motion cues, which makes them difficult to detect. Moreover, no benchmark exists for evaluating this challenging scenario.