OpenAI Fellows Fall 2018
We’re now accepting applications for the next cohort of OpenAI Fellows, a program which offers a compensated 6-month apprenticeship in AI research at OpenAI.
We’re now accepting applications for the next cohort of OpenAI Fellows, a program which offers a compensated 6-month apprenticeship in AI research at OpenAI.
We’re releasing an analysis showing that since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.4-month doubling time (by comparison, Moore’s Law had a 2-year doubling period)[^footnote-correct…
The full implementation is available in lilianweng/deep-reinforcement-learning-gym
In the previous two posts, I have introduced the algorithms of many deep reinforcement learning models. Now it is the time to get our hands dirty and practice how to im…
We’re proposing an AI safety technique which trains agents to debate topics with one another, using a human to judge who wins.
We’re releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test…
UPDATE: Unfortunately my Pull-Request to Keras that changed the behaviour of the Batch Normalization layer was not accepted. You can read the details here. For those of you who are brave enough to mess with custom implementations, you can find the code…
[Updated on 2018-06-30: add two new policy gradient methods, SAC and D4PG.]
[Updated on 2018-09-30: add a new policy gradient method, TD3.]
[Updated on 2019-02-09: add SAC with automatically adjusted temperature].
[Updated on 2019-06-26: Thanks to …
We’re launching a transfer learning contest that measures a reinforcement learning algorithm’s ability to generalize from previous experience.