cs.AI, cs.CL

Dual-objective Language Models: Training Efficiency Without Overfitting

arXiv:2512.14549v3 Announce Type: replace
Abstract: This paper combines autoregressive and masked-diffusion training objectives without any architectural modifications, resulting in flexible language models that outperform single-objective models. Aut…