
Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning

Shentong Mo / April 16, 2026

arXiv:2604.13504v1 Announce Type: cross
Abstract: Designing effective reward functions is a cornerstone of reinforcement learning (RL), yet it remains a challenging and labor-intensive process due to the inefficiencies and inconsistencies inherent in …

