Kausthubh Manda, Raghuram Bharadwaj Diddigi

Generalisation in Multitask Fitted Q-Iteration and Offline Q-learning

Kausthubh Manda, Raghuram Bharadwaj Diddigi / April 28, 2026

arXiv:2512.20220v2 Announce Type: replace
Abstract: We study offline multitask reinforcement learning in settings where multiple tasks share a low-rank representation of their action-value functions. In this regime, a learner is provided with fixed da…
