AI RESEARCH

Band Together: Untargeted Adversarial Training with Multimodal Coordination against Evasion-based Promotion Attacks

arXiv cs.LG

arXiv:2605.06238v1 (Announce Type: new)

Multimodal recommender systems exploit visual and textual signals to alleviate data sparsity, but this reliance also makes them vulnerable to evasion-based promotion attacks. Existing defenses are largely limited to single-modal settings and focus mainly on poisoning-based threats, leaving evasion-based threats underexplored. In this work, we first identify a cross-modal gradient mismatch in the multi-user promotion setting: visual and textual perturbations are optimized in inconsistent directions because distinct user groups dominate each modality.
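To make the cross-modal gradient mismatch concrete, here is a toy sketch in Python. It is not the paper's method. The per-user gradients are invented numbers, and the helper names (`aggregate`, `per_user_alignment`, `dominant_users`) and the 0.5 threshold are assumptions chosen for illustration. Because visual and textual features live in different spaces, the sketch does not compare the two perturbation directions directly. Instead it checks which users each modality's aggregated direction actually serves.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def aggregate(grads):
    # Sum per-user gradients into a single multi-user direction.
    return [sum(column) for column in zip(*grads)]

def per_user_alignment(grads):
    # Cosine between each user's gradient and the aggregated direction.
    total = aggregate(grads)
    return [cosine(g, total) for g in grads]

def dominant_users(grads, threshold=0.5):
    # Users whose own gradient is well aligned with the aggregate
    # (the threshold is an arbitrary illustrative choice).
    return [u for u, a in enumerate(per_user_alignment(grads)) if a > threshold]

# Hypothetical per-user gradients of a promotion objective with respect to
# the target item's features. Users 0-1 have large visual gradients;
# users 2-3 have large textual gradients.
visual_grads = [[4.0, 0.0], [4.0, 0.5], [-0.2, 1.0], [-0.3, 1.0]]
textual_grads = [[1.0, -0.2], [1.0, -0.3], [0.0, 4.0], [0.5, 4.0]]

visual_dominant = dominant_users(visual_grads)
textual_dominant = dominant_users(textual_grads)

print("visual direction serves users:", visual_dominant)
print("textual direction serves users:", textual_dominant)
print("overlap:", sorted(set(visual_dominant) & set(textual_dominant)))
```

In this toy setup the aggregated visual direction is aligned with one pair of users and the aggregated textual direction with the other pair, so the two perturbations pull toward different audiences. That is one simple way to read "inconsistent directions due to the dominance of distinct user groups".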