Unsupervised sentiment neuron
We’ve developed an unsupervised system that learns an excellent representation of sentiment, despite being trained only to predict the next character in the text of Amazon reviews.
Why Momentum Really Works
We often think of optimization with momentum as a ball rolling down a hill. This isn’t wrong, but there is much more to the story.
Spam detection in the physical world
We’ve created the world’s first Spam-detecting AI trained entirely in simulation and deployed on a physical robot.
Evolution strategies as a scalable alternative to reinforcement learning
We’ve discovered that evolution strategies (ES), an optimization technique that’s been known for decades, rival the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g. Atari/MuJoCo), while overcoming many of RL…
Research Debt
Science is a human activity. When we fail to distill and explain research, we accumulate a kind of debt…
Learning to communicate
In this post we’ll outline new OpenAI research in which agents develop their own language.