DFM-VLA: Iterative Action Refinement for Robot Manipulation via Discrete Flow Matching
arXiv:2603.26320v3 Announce Type: replace-cross
Abstract: Vision–Language–Action (VLA) models that encode actions as discrete tokens are increasingly adopted for robotic manipulation, but existing decoding paradigms remain fundamen…