AI RESEARCH

Navigating the Emotion Tree: Hierarchical Hyperbolic RAG for Multimodal Emotion Recognition

arXiv CS.LG

ArXi:2605.18884v1 Announce Type: new Multimodal emotion recognition aims to integrate text, audio, and video sources to understand human affective states. Although multimodal large language models excel at multimodal reasoning, they typically treat emotion categories as independent labels, ignoring the rich hierarchical taxonomy of human psychology. Moreover, lacking external contextual knowledge makes them highly susceptible to over-interpreting noisy cues, further complicating fine-grained emotion classification.