Beyond Words: Enhancing Desire, Emotion, and Sentiment Recognition with Non-Verbal Cues

arXiv cs.CL

arXiv:2509.15540v2 Announce Type: replace-cross

Multimodal desire understanding, which is closely related to emotion and sentiment recognition and aims to infer human intentions from visual and textual cues, is an emerging yet underexplored task in affective computing, with applications in social media analysis. Existing methods for related tasks focus predominantly on mining verbal cues and often fail to make effective use of the non-verbal cues embedded in images.