cs.CL, cs.LG

A Comparative Analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs

arXiv:2603.07475v2 Announce Type: replace
Abstract: Autoregressive (AR) language models build representations incrementally via left-to-right prediction, while diffusion language models (dLLMs) are trained through full-sequence denoising. Although rec…