cs.CV, cs.LG

RefDecoder: Enhancing Visual Generation with Conditional Video Decoding

arXiv:2605.15196v1 Announce Type: cross
Abstract: Video generation powers a vast array of downstream applications. However, while latent diffusion models, the de facto standard, typically employ heavily conditioned denoising networks, their deco…