LOGICAL-COMMONSENSEQA: A Benchmark for Logical Commonsense Reasoning

arXiv:2601.16504v3 Announce Type: replace Commonsense reasoning often involves evaluating multiple plausible interpretations rather than selecting a single atomic answer, yet most benchmarks rely on single-label evaluation, obscuring whether statements are jointly plausible, mutually exclusive, or jointly implausible. We