Mean Flow Policy Optimization
arXiv:2604.14698v1 Announce Type: new
Abstract: Diffusion models have recently emerged as expressive policy representations for online reinforcement learning (RL). However, their iterative generative processes introduce substantial training and infere…