Learning to summarize with human feedback
We’ve applied reinforcement learning from human feedback to train language models that are better at summarization.
August 2020 gwern.net newsletter
with an essay on sidenotes; links on human competence, efficient-computing/hardware-overhangs; no reviews.
Thread: Differentiable Self-organizing Systems
A collection of articles and comments with the goal of understanding how to design robust and general purpose self-organizing systems.
Self-classifying MNIST Digits
Training an end-to-end differentiable, self-organising cellular automaton for classifying MNIST digits.
Interpretable Machine Learning
In this blog post, I briefly review Christoph Molnar’s *Interpretable Machine Learning* book. Then I write about two classic generalized…
July 2020 gwern.net newsletter
Links on the Uighurs, authoritarianism, negative emissions, AI overhang; 1 movie & 2 anime reviews
Neural Architecture Search
Although most popular and successful model architectures are designed by human experts, that doesn’t mean we have explored the entire network architecture space and settled on the best option. We would have a better chance to find the optim…
Datumbox Machine Learning Framework v0.8.2 released
The Datumbox Framework v0.8.2 has been released! Download it now from GitHub or the Maven Central Repository. What is new? Version 0.8.2 is a limited incremental release that focuses on resolving bugs and updating the dependencies of the framework. Her…