cs.AI, cs.CL

LOGICAL-COMMONSENSEQA: A Benchmark for Logical Commonsense Reasoning

arXiv:2601.16504v3 Announce Type: replace
Abstract: Commonsense reasoning often involves evaluating multiple plausible interpretations rather than selecting a single atomic answer, yet most benchmarks rely on single-label evaluation, obscuring whether…