AI RESEARCH

Complementarity-Supervised Spectral-Band Routing for Multimodal Emotion Recognition

arXiv cs.CV

arXiv:2603.13340v1 Announce Type: new Multimodal emotion recognition fuses cues such as text, video, and audio to understand an individual's emotional state. Prior methods face two main limitations: they rely mechanically on independent unimodal performance, thereby missing genuine complementary contributions, and their coarse-grained fusion conflicts with the fine-grained representations that emotion tasks require.