cs.CV, cs.MM, cs.SD

CoSyncDiT: Cognitive Synchronous Diffusion Transformer for Movie Dubbing

arXiv:2604.12292v1 Announce Type: cross
Abstract: Movie dubbing aims to synthesize speech that preserves the vocal identity of a reference audio clip while synchronizing with the lip movements in a target video. Existing methods fail to achieve precise lip…