Video Generation with Predictive Latents
arXiv:2605.02134v1 Announce Type: new
Abstract: Video Variational Autoencoder (VAE) enables latent video generative modeling by mapping the visual world into compact spatiotemporal latent spaces, improving training efficiency and stability. While exis…