AI RESEARCH
PolySLGen: Online Multimodal Speaking-Listening Reaction Generation in Polyadic Interaction
arXiv CS.CV
arXiv:2604.08125v1 Announce Type: new Human-like multimodal reaction generation is essential for natural group interactions between humans and embodied AI. However, existing approaches are limited to single-modality or speaking-only responses in dyadic interactions, making them unsuitable for realistic social scenarios. Many also overlook nonverbal cues and the complex dynamics of polyadic interactions, both of which are critical for engagement and conversational coherence. In this work, we present PolySLGen, an online framework for Polyadic multimodal Speaking and Listening reaction Generation.