Flow Matching for Offline Reinforcement Learning with Discrete Actions
arXiv:2602.06138v2
Abstract: Generative policies based on diffusion models and flow matching have shown strong promise for offline reinforcement learning (RL), but their applicability remains largely confined to continuous actio…