Multimodal Diffusion Forcing for Forceful Manipulation
arXiv:2511.04812v2 Announce Type: replace-cross
Abstract: Given a dataset of expert trajectories, standard imitation learning approaches typically learn a direct mapping from observations (e.g., RGB images) to actions. However, such methods often over…