MemDLM: Memory-Enhanced DLM Training
arXiv:2603.22241v2 Announce Type: replace
Abstract: Diffusion Language Models (DLMs) offer attractive advantages over Auto-Regressive (AR) models, such as full-attention parallel decoding and flexible generation. However, standard DLM training uses a …