Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models

arXiv:2604.16593v1 Announce Type: new We present SemanticQA, an evaluation suite designed to assess language models (LMs) on semantic phrase processing tasks. The benchmark consolidates existing multiword expression (MWE) resources and reorganizes them into a unified testbed. It covers both general lexical phenomena, such as lexical collocations, and three fine-grained categories: idiomatic expressions, noun compounds, and verbal constructions.