SayNext-Bench: Why Do LLMs Struggle with Next-Utterance Anticipation?

arXiv:2602.00327v2

We explore the use of large language models (LLMs) for next-utterance anticipation in human dialogue. Although recent LLMs have demonstrated the ability to engage in natural conversations with users, we show that even leading models surprisingly struggle to anticipate a human speaker's next utterance. Humans, in contrast, can readily anticipate forthcoming utterances from multi-modal cues in the context, such as gestures, gaze, and emotional tone.