cs.LG

Offline Constrained RLHF with Multiple Preference Oracles

arXiv:2604.00200v1 Announce Type: new
Abstract: We study offline constrained reinforcement learning from human feedback with multiple preference oracles. Motivated by applications that trade off performance against safety or fairness, we aim to maximize …