Rethinking Temporal Consistency in Video Object-Centric Learning: From Prediction to Correspondence
arXiv:2605.03650v1 Announce Type: cross
Abstract: The de facto approach in video object-centric learning maintains temporal consistency through learned dynamics modules that predict future object representations (slots). We demonstrate that thes…