cs.CL

ReTraceQA: Evaluating Reasoning Traces of Small Language Models in Commonsense Question Answering

arXiv:2510.09351v2 Announce Type: replace
Abstract: While Small Language Models (SLMs) have demonstrated promising performance on an increasingly wide array of commonsense reasoning benchmarks, current evaluation practices rely almost exclusively on t…