Less is More: Data-Efficient Adaptation for Controllable Text-to-Video Generation
arXiv:2511.17844v4 Announce Type: replace-cross
Abstract: Fine-tuning large-scale text-to-video diffusion models to add new generative controls, such as control over physical camera parameters (e.g., shutter speed or aperture), typically requires vast, …