Safety & Alignment

Safety Gym

We’re releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.

Uncategorised

Self-Supervised Representation Learning

[Updated on 2020-01-09: add a new section on Contrastive Predictive Coding].

[Updated on 2020-04-13: add a “Momentum Contrast” section on MoCo, SimCLR and CURL.]

[Updated on 2020-07-08: add a “Bisimulation” section on DeepMDP…

finance, leverage

Robinhood, Leverage, and Lemonade

DISCLAIMER: NO INVESTMENT OR LEGAL ADVICE. The Content is for informational purposes only, you should not construe any such information or other material as legal, tax, investment, financial, or other advice. Investing involves risk, please consult a fin…

Research

GPT-2: 1.5B release

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models rele…

Research

Solving Rubik’s Cube with a robot hand

We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automat…
