Khala: Scaling Acoustic Token Language Models Toward High-Fidelity Music Generation
arXiv:2605.01790v1 Announce Type: cross
Abstract: A common design pattern in high-quality music generation is to handle structure and fidelity in different representation spaces: a generator first models high-level structure, followed by diffusion-bas…