Exploring Token-Space Manipulation in Latent Audio Tokenizers
arXiv:2605.11192v1 Announce Type: cross
Abstract: Neural audio codecs provide compact discrete representations for speech generation and manipulation. However, most codecs organize tokens as frame-level sequences, making it difficult to study or inter…