cs.AI, cs.CL

A Theoretical Analysis of Why Masked Diffusion Models Mitigate the Reversal Curse

arXiv:2602.02133v2 Announce Type: replace-cross
Abstract: Autoregressive language models (ARMs) suffer from the reversal curse: after learning “$A$ is $B$,” they often fail on the reverse query “$B$ is $A$.” Masked diffusion language models (MDMs)…