FASTER: Value-Guided Sampling for Fast RL
arXiv:2604.19730v1 Announce Type: new
Abstract: Some of the most performant reinforcement learning algorithms today can be prohibitively expensive because they rely on test-time scaling methods, such as sampling multiple action candidates and selecting the best…
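
To make the cost the abstract refers to concrete, the following is a minimal sketch of generic best-of-N action selection: draw several candidate actions, score each one, and keep the highest-scoring candidate. It is not the FASTER method, which the truncated abstract does not describe here. The names `best_of_n`, `toy_policy`, and `toy_q` are hypothetical, and scoring candidates with a Q-function is an assumption made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Action = Tuple[float, ...]
State = Sequence[float]


def best_of_n(
    state: State,
    sample_action: Callable[[State, random.Random], Action],
    q_value: Callable[[State, Action], float],
    n: int,
    rng: random.Random,
) -> Tuple[Action, float]:
    """Draw n candidate actions and return the best one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Cost grows linearly in n: n policy samples plus n value evaluations.
    candidates: List[Action] = [sample_action(state, rng) for _ in range(n)]
    scores = [q_value(state, a) for a in candidates]
    best = max(range(n), key=scores.__getitem__)
    return candidates[best], scores[best]


# Hypothetical stand-ins for a learned policy and critic (assumptions).
def toy_policy(state: State, rng: random.Random) -> Action:
    # Samples each action dimension from a standard normal distribution.
    return tuple(rng.gauss(0.0, 1.0) for _ in state)


def toy_q(state: State, action: Action) -> float:
    # Scores an action by negative squared distance to the state vector.
    return -sum((a - s) ** 2 for a, s in zip(action, state))


if __name__ == "__main__":
    state = (0.5, -0.2)
    for n in (1, 4, 32):
        # Reusing the seed makes the smaller candidate set a prefix of the larger.
        action, score = best_of_n(state, toy_policy, toy_q, n, random.Random(0))
        print(f"n={n:2d}  best score={score:.3f}")
```

Each call performs `n` policy samples and `n` value evaluations per environment step, so the inference cost of this selection scheme scales linearly with the number of candidates.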