cs.CL

LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

arXiv:2602.12005v3 Announce Type: replace
Abstract: Language models have consistently grown to compress more world knowledge into their parameters, but the knowledge that can be pretrained into them is upper-bounded by their parameter size. Especially…