Fairoz Nower Khan, Nabuat Zaman Nahim, Ruiquan Huang, Haibo Yang, Peizhong Ju

Flow Matching for Offline Reinforcement Learning with Discrete Actions

May 14, 2026

arXiv:2602.06138v2
Abstract: Generative policies based on diffusion models and flow matching have shown strong promise for offline reinforcement learning (RL), but their applicability remains largely confined to continuous actio…
