Rethinking Token Prediction: Tree-Structured Diffusion Language Model
arXiv:2604.03537v1 Announce Type: new
Abstract: Discrete diffusion language models have emerged as a competitive alternative to auto-regressive language models, but training them efficiently under limited parameter and memory budgets remains challengi…