Provide.ai - We Provide AI To Companies - Page 5039

Gotta Learn Fast: A new benchmark for generalization in RL

OpenAI News / April 10, 2018

Policy Gradient Algorithms

Posts on Lil'Log / April 8, 2018

[Updated on 2018-06-30: add two new policy gradient methods, SAC and D4PG.]

[Updated on 2018-09-30: add a new policy gradient method, TD3.]

[Updated on 2019-02-09: add SAC with automatically adjusted temperature.]

[Updated on 2019-06-26: Thanks to …

Retro Contest

OpenAI News / April 5, 2018

We’re launching a transfer learning contest that measures a reinforcement learning algorithm’s ability to generalize from previous experience.

World Models

大トロ / March 27, 2018

Can agents learn inside of their own dreams?

Redirecting to worldmodels.github.io, where the article resides.

Variance reduction for policy gradient with action-dependent factorized baselines

OpenAI News / March 20, 2018

Improving GANs using optimal transport

OpenAI News / March 15, 2018

Report from the OpenAI hackathon

OpenAI News / March 15, 2018

On March 3rd, we hosted our first hackathon with 100 members of the artificial intelligence community.

On first-order meta-learning algorithms

OpenAI News / March 8, 2018

Reptile: A scalable meta-learning algorithm

OpenAI News / March 7, 2018

We’ve developed a simple meta-learning algorithm called Reptile which works by repeatedly sampling a task, performing stochastic gradient descent on it, and updating the initial parameters towards the final parameters learned on that task. Reptile is t…
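
The loop described in that excerpt (sample a task, run SGD on it, move the initial parameters towards the result) can be sketched in a few lines. This is only an illustrative toy, not OpenAI's implementation: the task family (random lines `y = slope * x + intercept`), the two-parameter linear model, the function names `sgd_on_task` and `reptile`, and every hyperparameter are assumptions chosen to keep the example small.

```python
import random


def sgd_on_task(params, task, rng, inner_steps=5, inner_lr=0.02):
    """Run a few SGD steps on one task, starting from `params`.

    Hypothetical toy setup: a task is a line (slope, intercept) and the
    model is y_hat = w * x + b, trained on one sampled point per step.
    """
    w, b = params
    slope, intercept = task
    for _ in range(inner_steps):
        x = rng.uniform(-1.0, 1.0)
        y = slope * x + intercept
        err = (w * x + b) - y      # loss is 0.5 * err**2
        w -= inner_lr * err * x    # d(loss)/dw = err * x
        b -= inner_lr * err        # d(loss)/db = err
    return (w, b)


def reptile(num_iterations=2000, meta_lr=0.1, seed=0):
    """Reptile-style outer loop over the assumed toy task distribution."""
    rng = random.Random(seed)
    theta = (0.0, 0.0)  # initial parameters being meta-learned
    for _ in range(num_iterations):
        # 1. Sample a task (assumed distribution: slope in [1, 3], intercept in [-1, 1]).
        task = (rng.uniform(1.0, 3.0), rng.uniform(-1.0, 1.0))
        # 2. Perform SGD on that task, starting from the current initialization.
        phi = sgd_on_task(theta, task, rng)
        # 3. Move the initialization a fraction of the way towards the adapted parameters.
        theta = tuple(t + meta_lr * (p - t) for t, p in zip(theta, phi))
    return theta


if __name__ == "__main__":
    print(reptile())
```

In this toy setting the learned initialization should drift towards the middle of the assumed task distribution (slope near 2, intercept near 0), which is the kind of starting point from which a few SGD steps adapt quickly to any single sampled task.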

The Building Blocks of Interpretability

Distill / March 6, 2018

Interpretability techniques are normally studied in isolation. We explore the powerful interfaces that arise when you combine them — and the rich structure of this combinatorial space.