cs.CV

MMPhysVideo: Scaling Physical Plausibility in Video Generation via Joint Multimodal Modeling

arXiv:2604.02817v1 Announce Type: new
Abstract: Despite advances in generating visually stunning content, video diffusion models (VDMs) often produce physically inconsistent results because they rely on pixel-only reconstruction. To address this, we propose MMPhy…