ReCast: Recasting Learning Signals for Reinforcement Learning in Generative Recommendation
arXiv:2604.22169v1 Announce Type: cross
Abstract: Generic group-based RL assumes that sampled rollout groups are already usable learning signals. We show that this assumption breaks down in sparse-hit generative recommendation, where many sampled grou…
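To illustrate the abstract's opening claim, here is a minimal sketch of one common way a sampled rollout group can fail to be a usable learning signal. The abstract is truncated above, so this is not the paper's method. It assumes a GRPO-style group-normalized advantage and a binary hit reward (1.0 if a rollout recommends the target item, else 0.0); neither is stated in the text. The function names are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Center and scale rewards within one rollout group.

    Assumption: GRPO-style normalization, (r - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def has_learning_signal(rewards):
    """Under this normalization, a group contributes a nonzero advantage
    only if its rewards are not all identical."""
    return len(set(rewards)) > 1


# Hypothetical binary hit rewards for two groups of four rollouts each.
no_hit_group = [0.0, 0.0, 0.0, 0.0]   # no rollout hits the target item
one_hit_group = [0.0, 0.0, 1.0, 0.0]  # exactly one rollout hits

no_hit_adv = group_relative_advantages(no_hit_group)
one_hit_adv = group_relative_advantages(one_hit_group)

print(has_learning_signal(no_hit_group), no_hit_adv)
print(has_learning_signal(one_hit_group), one_hit_adv)
```

Under these assumptions, a group with no hits has every advantage equal to zero and contributes nothing to the policy update, while a group with a single hit gives that rollout a positive advantage and the rest a negative one.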