Safety Gym
We’re releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.
[Updated on 2020-01-09: add a new section on Contrastive Predictive Coding].
[Updated on 2020-04-13: add a “Momentum Contrast” section on MoCo, SimCLR and CURL.]
[Updated on 2020-07-08: add a “Bisimulation” section on DeepMDP…
DISCLAIMER: NO INVESTMENT OR LEGAL ADVICE. The Content is for informational purposes only; you should not construe any such information or other material as legal, tax, investment, financial, or other advice. Investing involves risk; please consult a fin…
As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models rele…
Detailed derivations and open-source code to analyze the receptive fields of convnets.
The way we drive is changing. From futuristic Hollywood movies to science fiction novels, the idea of driverless cars has been around for some time now, but it has never felt as close to becoming reality as it does today. The UK’s De…
Rather than hardcoding forward prediction, we try to get agents to learn that they need to predict the future.
GitHub
Redirecting to learningtopredict.github.io, where the article resides.
We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automat…
We are now accepting applications for our third class of OpenAI Scholars.