cs.AI, cs.CL, cs.SD, eess.AS

ControlAudio: Tackling Text-Guided, Timing-Indicated and Intelligible Audio Generation via Progressive Diffusion Modeling

arXiv:2510.08878v3 Announce Type: replace-cross
Abstract: Text-to-audio (TTA) generation with fine-grained control signals, e.g., precise timing control or intelligible speech content, has been explored in recent work. However, constrained by data sc…