Variance reduction for policy gradient with action-dependent factorized baselines
By OpenAI News / March 20, 2018