ProMQA-Assembly: Multimodal Procedural QA Dataset on Assembly

arXiv CS.CL

arXiv:2509.02949v2

Assistants for assembly tasks show great potential to benefit humans, ranging from helping with everyday tasks to interacting in industrial settings. However, evaluation resources for assembly activities are underexplored. To foster system development, we propose a new multimodal QA evaluation dataset on assembly activities. Our dataset, ProMQA-Assembly, consists of 646 QA pairs that require multimodal understanding of human activity videos and their instruction manuals in an online-style manner.