K-Score: Kalman Filter as a Principled Alternative to Reward Normalization in Reinforcement Learning
arXiv:2604.23056v1 Announce Type: cross
Abstract: We propose a simple yet effective alternative to reward normalization in policy gradient reinforcement learning by integrating a 1D Kalman filter for online reward estimation. Instead of relying on fix…
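To make the idea of a 1D Kalman filter for online reward estimation concrete, here is a minimal sketch in Python. It is not the authors' formulation, since the abstract is truncated before the details. It assumes a textbook scalar Kalman filter with a random-walk model for the latent mean reward. The class name `ScalarKalmanFilter`, the function `standardized_innovation`, and the parameters `process_var`, `measurement_var`, `init_mean`, `init_var`, and `eps` are hypothetical names chosen for illustration.

```python
import math


class ScalarKalmanFilter:
    """Textbook 1D Kalman filter tracking a latent mean reward.

    Assumption (not from the paper): the latent mean follows a random walk
    with variance `process_var`, and each observed reward is the latent mean
    plus noise with variance `measurement_var`.
    """

    def __init__(self, process_var=1e-3, measurement_var=1.0,
                 init_mean=0.0, init_var=1.0):
        self.q = process_var
        self.r = measurement_var
        self.mean = init_mean
        self.var = init_var

    def update(self, reward):
        """Fold in one reward; return the innovation and its variance."""
        # Predict: the random walk leaves the mean unchanged and adds uncertainty.
        prior_var = self.var + self.q
        # Innovation: how far the reward falls from the predicted mean.
        innovation = reward - self.mean
        innovation_var = prior_var + self.r
        # Correct: the Kalman gain weights the new reward against the prior.
        gain = prior_var / innovation_var
        self.mean += gain * innovation
        self.var = (1.0 - gain) * prior_var
        return innovation, innovation_var


def standardized_innovation(kf, reward, eps=1e-8):
    """Hypothetical normalized signal: innovation divided by its predicted std."""
    innovation, innovation_var = kf.update(reward)
    return innovation / math.sqrt(innovation_var + eps)


if __name__ == "__main__":
    kf = ScalarKalmanFilter()
    for reward in [1.0, 0.5, 2.0, 1.5, 1.0]:
        signal = standardized_innovation(kf, reward)
        print(f"reward={reward:.2f} mean={kf.mean:.3f} signal={signal:.3f}")
```

In a policy gradient loop, a signal of this kind could stand in for a batch-normalized reward or advantage. Whether the paper uses the innovation, the filtered mean, or another quantity is not stated in the visible text.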