Datumbox Machine Learning Framework v0.8.1 released
Datumbox v0.8.1 has been released! Download it now from GitHub or the Maven Central Repository. What is new? The main focus of version 0.8.1 is to resolve various bugs, update the dependencies and improve the code architecture of the framework. Here are…
From GAN to WGAN
[Updated on 2018-09-30: thanks to Yoonju, we have this post translated in Korean!]
[Updated on 2019-04-18: this post is also available on arXiv.]
Generative adversarial networks (GANs) have shown great results in many generative tasks to replicate the r…
OpenAI Baselines: ACKTR & A2C
We’re releasing two new OpenAI Baselines implementations: ACKTR and A2C. A2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C) which we’ve found gives equal performance. ACKTR is a more sample-efficient reinforcement …
More on Dota 2
Our Dota 2 result shows that self-play can catapult the performance of machine learning systems from far below human level to superhuman, given sufficient compute. In the span of a month, our system went from barely matching a high-ranked player to bea…
Gathering human feedback
RL-Teacher is an open-source implementation of our interface to train AIs via occasional human feedback rather than hand-crafted reward functions. The underlying technique was developed as a step towards safe AI systems, but also applies to reinforceme…
How to Explain the Prediction of a Machine Learning Model?
Machine learning models have started penetrating critical areas like health care, justice systems, and the financial industry. Thus, to figure out how the models make decisions and make sure the decision-making process is aligned with the ethical r…
Better exploration with parameter noise
We’ve found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it’s worth trying on any problem.
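As a rough illustration of the idea in this excerpt, here is a minimal sketch of parameter-space noise with an adaptive scale. It is not the post's implementation: the toy linear policy, the helper names (`perturb`, `action_distance`, `adapt_sigma`), the scaling `factor`, and the `target` distance are all assumptions made for the example. The sketch adds Gaussian noise to the policy weights, measures how far the perturbed policy's actions drift from the unperturbed ones, and grows or shrinks the noise scale accordingly.

```python
import random


def linear_policy(weights, observation):
    """Toy deterministic policy (assumed): action = dot(weights, observation)."""
    return sum(w * x for w, x in zip(weights, observation))


def perturb(weights, sigma, rng):
    """Return a copy of the weights with Gaussian noise of std `sigma` added."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def action_distance(weights, noisy_weights, observations):
    """Root-mean-square gap between the actions of the two policies."""
    squared = [
        (linear_policy(weights, o) - linear_policy(noisy_weights, o)) ** 2
        for o in observations
    ]
    return (sum(squared) / len(squared)) ** 0.5


def adapt_sigma(sigma, distance, target, factor=1.01):
    """Grow sigma if the perturbed policy acts too similarly, else shrink it."""
    return sigma * factor if distance < target else sigma / factor


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.5, -0.2, 0.1]          # hypothetical policy parameters
    observations = [[rng.uniform(-1, 1) for _ in weights] for _ in range(32)]
    sigma, target = 0.1, 0.2            # hypothetical starting scale and target

    for step in range(5):
        noisy = perturb(weights, sigma, rng)   # explore with the noisy policy
        d = action_distance(weights, noisy, observations)
        sigma = adapt_sigma(sigma, d, target)
        print(f"step={step} distance={d:.4f} sigma={sigma:.4f}")
```

Measuring distance in action space rather than in parameter space keeps the amount of exploration comparable even as the weights change scale during training, which is one plausible reason to adapt the noise this way.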
Predict Stock Prices Using RNN: Part 2
In this Part 2 tutorial, I would like to continue the topic of stock price prediction and to endow the recurrent neural network that I built in Part 1 with the capability of responding to multiple stocks. In order to distinguish the patterns assoc…