Render, Don’t Decode: Weight-Space World Models with Latent Structural Disentanglement
arXiv:2605.06298v1 Announce Type: cross
Abstract: Training world models on vast quantities of unlabelled videos is a critical step toward fully autonomous intelligence. However, the prevailing paradigm of encoding raw pixels into opaque latent spaces …