AI RESEARCH

On the Effectiveness of Integration Methods for Multimodal Dialogue Response Retrieval

arXiv CS.CL

arXiv:2506.11499v2 Announce Type: replace

Multimodal chatbots have become one of the major topics for dialogue systems in both the research community and industry. Recently, researchers have shed light on the multimodality of responses as well as of dialogue contexts. This work explores how a dialogue system can output responses in various modalities, such as text and image. To this end, we first formulate a multimodal dialogue response retrieval task for retrieval-based systems as the combination of three subtasks.