Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity
arXiv:2605.14487v1 Announce Type: cross
Abstract: Autoregressive (AR) video diffusion models support real-time synthesis but suffer from error accumulation and context loss over long horizons. We discover that attention heads in AR video diffusion transfor…