Natasha Jaques
Emergent Complexity
In the ZONE: Measuring difficulty and progression in curriculum generation
Past work on curriculum generation in RL has focused on training a teacher agent to generate tasks for a student agent that accelerate student learning and improve generalization. In this work, we create a mathematical framework that formalizes these concepts and subsumes prior work, taking inspiration from the psychological concept of the Zone of Proximal Development. We propose two new techniques based on rejection sampling and maximizing the student’s gradient norm that improve curriculum learning.
R. E. Wang
,
J. Mu
,
D. Arumugam
,
Natasha Jaques
,
N. Goodman
2022
In
Preprint
Cite
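As a rough sketch of the rejection-sampling idea described above: tasks are resampled until one lands in the student's "zone", i.e. is neither trivially easy nor impossibly hard. The helper names and the success-rate band are illustrative assumptions, not taken from the paper.

```python
def in_zone(success_rate, lo=0.2, hi=0.8):
    # Hypothetical band: a task is "in the zone" when the student
    # sometimes, but not always, solves it.
    return lo <= success_rate <= hi

def sample_curriculum_task(propose_task, estimate_success, max_tries=100):
    # Rejection sampling: keep proposing tasks until one falls inside
    # the zone, then hand it to the student.
    for _ in range(max_tries):
        task = propose_task()
        if in_zone(estimate_success(task)):
            return task
    return task  # fall back to the last proposal
```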
Environment Generation for Zero-Shot Compositional Reinforcement Learning
We analyze and improve upon PAIRED in the case of learning to generate challenging compositional tasks. We apply our improved algorithm to the complex task of training RL agents to navigate websites, and find that it is able to generate a challenging curriculum of novel sites. We achieve a 4x improvement over the strongest web navigation baselines, and deploy our model to navigate real-world websites.
I. Gur
,
Natasha Jaques
,
K. Malta
,
M. Tiwari
,
H. Lee
,
A. Faust
2021
In
Neural Information Processing Systems (NeurIPS)
PDF
Cite
Code
Explore and Control with Adversarial Surprise
Adversarial Surprise creates a competitive game between an Explore policy and a Control policy, which compete to maximize and minimize, respectively, the amount of entropy an RL agent experiences. We show both theoretically and empirically that this technique fully explores the state space of partially-observed, stochastic environments.
A. Fickinger
*
,
Natasha Jaques
*
,
S. Parajuli
,
M. Chang
,
N. Rhinehart
,
G. Berseth
,
S. Russell
,
S. Levine
2021
In
ICML Unsupervised Reinforcement Learning workshop
PDF
Cite
Code
Project
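A minimal sketch of the minimax objective in the abstract above, using an empirical entropy estimate over discrete observations. The function names and the per-phase reward bookkeeping are our own illustrative assumptions, not the paper's implementation.

```python
import math
from collections import Counter

def empirical_entropy(observations):
    # Shannon entropy (in nats) of a sequence of discrete observations.
    n = len(observations)
    return -sum(c / n * math.log(c / n) for c in Counter(observations).values())

def phase_rewards(observations):
    # In the alternating game, the Explore policy is rewarded for raising
    # the entropy of the observations it induced, and the Control policy
    # for lowering it.
    h = empirical_entropy(observations)
    return {"explore": h, "control": -h}
```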
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design
PAIRED trains an agent to generate environments that maximize regret between a pair of learning agents. This creates feasible yet challenging environments, which exploit weaknesses in the agents to make them more robust. PAIRED significantly improves generalization to novel tasks.
Michael Dennis
*
,
Natasha Jaques
*
,
Eugene Vinitsky
,
Alexandre Bayen
,
Stuart Russell
,
Andrew Critch
,
Sergey Levine
2020
In
Neural Information Processing Systems (NeurIPS)
Oral (top 1% of submissions)
PDF
Cite
Code
Poster
Slides
Videos
NeurIPS Oral
Science article
Google AI Blog
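The regret objective described above can be sketched as the gap between the antagonist's best rollout and the protagonist's average return on the same generated environment; the environment adversary is trained to maximize this quantity. The helper below is an illustrative estimator, with names of our choosing.

```python
def paired_regret(antagonist_returns, protagonist_returns):
    # Regret estimate from paired rollouts on one generated environment:
    # best antagonist episode minus the protagonist's mean return.
    # Maximizing this favors environments that are solvable (the antagonist
    # scores well) yet still hard for the protagonist.
    best_antagonist = max(antagonist_returns)
    mean_protagonist = sum(protagonist_returns) / len(protagonist_returns)
    return best_antagonist - mean_protagonist
```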