cs.LG

Quantile Q-Learning: Revisiting Offline Extreme Q-Learning with Quantile Regression

arXiv:2511.11973v2 Announce Type: replace
Abstract: Offline reinforcement learning (RL) enables policy learning from fixed datasets without further environment interaction, making it particularly valuable in high-risk or costly domains. Extreme $Q$-Le…