cs.AI, cs.CL

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

arXiv:2605.12438v1 Announce Type: new
Abstract: When adapting an encoder to a new domain, the standard approach is to continue training with Masked Language Modeling (MLM). We show that temporarily switching to Causal Language Modeling (CLM) followed …
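
As an illustration of the two objectives the abstract names, here is a minimal sketch of how training targets differ between MLM and CLM for the same token sequence. It is not the authors' implementation: the helper names `mlm_example` and `clm_example`, the `[MASK]` string, the 15% masking rate, and the simplified "always replace with the mask token" rule are all assumptions for illustration, and the visible abstract does not specify the training schedule.

```python
import random

MASK = "[MASK]"   # assumed mask token string
IGNORE = None     # marks positions that contribute no loss


def mlm_example(tokens, mask_prob=0.15, rng=None):
    """Masked Language Modeling: hide some tokens, predict only those.

    Simplification (assumption): a selected token is always replaced by
    MASK; common recipes also keep or randomize some selected tokens.
    """
    if rng is None:
        rng = random.Random(0)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)     # model sees the mask
            targets.append(tok)     # and must recover the original token
        else:
            inputs.append(tok)      # model sees the token unchanged
            targets.append(IGNORE)  # no loss at this position
    return inputs, targets


def clm_example(tokens):
    """Causal Language Modeling: predict each next token from its prefix."""
    # The input at position t is tokens[t]; the target is tokens[t + 1].
    return tokens[:-1], tokens[1:]


if __name__ == "__main__":
    sentence = ["the", "patient", "was", "given", "aspirin"]
    print(mlm_example(sentence, mask_prob=0.4))
    print(clm_example(sentence))
```

In this sketch, MLM produces a loss only at the masked positions while the model attends to context on both sides, whereas CLM produces a loss at every position but each prediction sees only the tokens to its left.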