Research

Deep double descent

We show that the double descent phenomenon occurs in CNNs, ResNets, and transformers: performance first improves, then gets worse, and then improves again with increasing model size, data size, or training time. This effect is often avoided through car…

Research

Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

Safety & Alignment

Safety Gym

We’re releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.

Uncategorised

Self-Supervised Representation Learning

[Updated on 2020-01-09: add a new section on Contrastive Predictive Coding.]

[Updated on 2020-04-13: add a “Momentum Contrast” section on MoCo, SimCLR and CURL.]

[Updated on 2020-07-08: add a “Bisimulation” section on DeepMDP…

finance, leverage

Robinhood, Leverage, and Lemonade

DISCLAIMER: NO INVESTMENT OR LEGAL ADVICE. The Content is for informational purposes only; you should not construe any such information or other material as legal, tax, investment, financial, or other advice. Investing involves risk; please consult a fin…

Research

GPT-2: 1.5B release

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models rele…
