cs.AI, cs.CL

WAND: Windowed Attention and Knowledge Distillation for Efficient Autoregressive Text-to-Speech Models

arXiv:2604.08558v1 Announce Type: new
Abstract: Recent decoder-only autoregressive text-to-speech (AR-TTS) models produce high-fidelity speech, but their memory and compute costs scale quadratically with sequence length due to full self-attention. In …
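The truncated abstract does not specify WAND's exact windowing scheme, but the contrast it draws is between full causal self-attention (quadratic in sequence length) and attention restricted to a local window of recent tokens. The sketch below illustrates that generic idea, not the paper's method: each query attends only to itself and the previous `window - 1` positions. The function name `windowed_causal_attention` and the tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def windowed_causal_attention(q, k, v, window: int):
    """Causal self-attention restricted to a sliding window of past tokens.

    q, k, v: (batch, heads, seq_len, head_dim)
    window:  how many positions each query may attend to, counting itself.
    NOTE: illustrative sketch, not the WAND implementation.
    """
    seq_len = q.size(-2)
    scale = q.size(-1) ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale  # (B, H, T, T)

    # Position j is visible from position i iff 0 <= i - j < window,
    # i.e. causal AND within the sliding window.
    i = torch.arange(seq_len).unsqueeze(1)  # (T, 1)
    j = torch.arange(seq_len).unsqueeze(0)  # (1, T)
    allowed = (i - j >= 0) & (i - j < window)

    scores = scores.masked_fill(~allowed, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

# Example: 2 heads, 16 tokens, window of 4 positions.
q = k = v = torch.randn(1, 2, 16, 8)
out = windowed_causal_attention(q, k, v, window=4)
print(out.shape)  # torch.Size([1, 2, 16, 8])
```

Note that this masked formulation still materializes the full T x T score matrix, so it only demonstrates the attention pattern; the memory savings the abstract alludes to come from kernels that compute attention block-locally and never form scores outside the window.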