Rian Touchent, Eric de la Clergerie

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

Rian Touchent, Eric de la Clergerie / May 13, 2026

arXiv:2605.12438v1 (announce type: new)
Abstract: When adapting an encoder to a new domain, the standard approach is to continue training with Masked Language Modeling (MLM). We show that temporarily switching to Causal Language Modeling (CLM) followed …
