AI RESEARCH

CEI: A Benchmark for Evaluating Pragmatic Reasoning in Language Models

arXiv CS.AI

arXiv:2603.09993v1 Announce Type: cross

Pragmatic reasoning, the inference of intended meaning beyond literal semantics, underpins everyday communication yet remains difficult for large language models. We present the Contextual Emotional Inference (CEI) Benchmark: 300 human-validated scenarios for evaluating how well LLMs disambiguate pragmatically complex utterances. Each scenario pairs a situational context and speaker-listener roles (with explicit power relations) with an ambiguous utterance.
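
To make the scenario structure concrete, the following is a minimal sketch of how one such item might be represented and rendered as a prompt. The class name, field names, prompt wording, and the example scenario are all hypothetical illustrations; the abstract does not specify the benchmark's actual schema or label set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One benchmark-style item. Field names are illustrative, not the CEI schema."""
    context: str         # situational context
    speaker_role: str    # who says the utterance
    listener_role: str   # who hears it
    power_relation: str  # free-form description; any fixed label set is an assumption
    utterance: str       # the ambiguous utterance to be interpreted


def build_prompt(s: Scenario) -> str:
    """Render a scenario as a plain-text prompt for a model under evaluation."""
    return (
        f"Context: {s.context}\n"
        f"Speaker: {s.speaker_role}\n"
        f"Listener: {s.listener_role}\n"
        f"Power relation: {s.power_relation}\n"
        f'Utterance: "{s.utterance}"\n'
        "What does the speaker most likely mean?"
    )


# Invented example, not taken from the benchmark.
example = Scenario(
    context="A team meeting just ended after a missed deadline.",
    speaker_role="manager",
    listener_role="junior engineer",
    power_relation="speaker has authority over listener",
    utterance="Well, that went great.",
)
prompt = build_prompt(example)
print(prompt)
```

Keeping the power relation as an explicit field, rather than folding it into the context, mirrors the abstract's description of roles "with explicit power relations" and makes it possible to vary that one factor while holding the utterance fixed.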