Natasha Jaques
In the ZONE: Measuring difficulty and progression in curriculum generation
Past work on curriculum generation in RL has focused on training a teacher agent to generate tasks for a student agent that accelerate student learning and improve generalization. In this work, we create a mathematical framework that formalizes these concepts and subsumes prior work, taking inspiration from the psychological concept of the Zone of Proximal Development. We propose two new techniques based on rejection sampling and maximizing the student’s gradient norm that improve curriculum learning.
R. E. Wang, J. Mu, D. Arumugam, Natasha Jaques, N. Goodman (2022). Preprint.
Cite
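Of the two proposed techniques, rejection sampling lends itself to a compact sketch. A minimal illustration, with all details invented for this sketch rather than taken from the paper: the success-rate proxy, the 0.3-0.7 acceptance band, and the function names are assumptions.

```python
import random

def sample_in_zone(propose_task, success_rate, low=0.3, high=0.7, max_tries=100):
    """Rejection-sample candidate tasks, keeping only those whose estimated
    student success rate falls in a target band -- a stand-in for the Zone of
    Proximal Development (too-easy and too-hard tasks are both rejected)."""
    task = None
    for _ in range(max_tries):
        task = propose_task()
        if low <= success_rate(task) <= high:
            return task
    return task  # give up and return the last proposal

# Toy check: tasks are difficulty levels 0-9 and success falls off linearly,
# so only difficulties 3-7 land inside the band.
random.seed(0)
task = sample_in_zone(lambda: random.randint(0, 9), lambda d: 1.0 - d / 10)
```

The gradient-norm technique would replace `success_rate` with a measure of how much the student's parameters would move on the task; that variant is not sketched here.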
Environment Generation for Zero-Shot Compositional Reinforcement Learning
We analyze and improve upon PAIRED in the case of learning to generate challenging compositional tasks. We apply our improved algorithm to the complex task of training RL agents to navigate websites, and find that it is able to generate a challenging curriculum of novel sites. We achieve a 4x improvement over the strongest web navigation baselines, and deploy our model to navigate real-world websites.
I. Gur, Natasha Jaques, K. Malta, M. Tiwari, H. Lee, A. Faust (2021). In Neural Information Processing Systems (NeurIPS).
PDF · Cite · Code
PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
PsiPhi-Learning learns successor representations for the policies of other agents and the ego agent, using a shared underlying state representation. Learning from other agents helps the agent take better actions at inference time, and learning from RL experience improves modeling of other agents.
A. Filos, C. Lyle, Y. Gal, S. Levine, Natasha Jaques*, G. Farquhar* (2021). In International Conference on Machine Learning (ICML). Oral (top 3% of submissions).
PDF · Cite · Project · ICML talk
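The successor-feature factorization the method builds on is easy to show in miniature: if the reward is linear in cumulants, r(s, a) = φ(s, a)·w, then the action value factorizes as Q(s, a) = ψ(s, a)·w. All numbers below are invented for illustration; in the paper, w is recovered from demonstrations via inverse temporal difference learning, which is not shown.

```python
import numpy as np

# psi(s, a): discounted expected future cumulants under the policy,
# one row per action (made-up values).
psi = np.array([[1.0, 0.5],    # psi(s, a0)
                [0.2, 2.0]])   # psi(s, a1)
w = np.array([0.1, 1.0])       # reward weights: r(s, a) = phi(s, a) . w

q = psi @ w                    # Q(s, a) for each action
best_action = int(np.argmax(q))
```

Because ψ and w are decoupled, the same ψ can be re-scored under a new w, which is what makes sharing representations across agents useful here.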
Emergent Social Learning via Multi-agent Reinforcement Learning
Model-free RL agents fail to learn from experts present in multi-agent environments. By adding a model-based auxiliary loss, we induce social learning, which allows agents to learn how to learn from experts. When deployed to novel environments with new experts, they use social learning to determine how to solve the task, and generalize better than agents trained alone with RL or imitation learning.
Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques (2021). In International Conference on Machine Learning (ICML); NeurIPS Cooperative AI Workshop (Best Paper).
PDF · Cite · Code · Poster · Slides · Cooperative AI talk · ICML talk
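The recipe in the abstract reduces to adding one term to the training objective. A minimal sketch, where the squared-error form of the auxiliary loss and the weight `beta` are illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def total_loss(policy_loss, predicted_next_obs, true_next_obs, beta=0.1):
    """Policy loss plus a model-based auxiliary loss (here, next-observation
    prediction error), the device used to induce social learning: predicting
    dynamics well requires attending to what other agents do."""
    aux = float(np.mean((predicted_next_obs - true_next_obs) ** 2))
    return policy_loss + beta * aux

loss = total_loss(1.0, np.zeros(2), np.ones(2))
```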
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design
PAIRED trains an agent to generate environments that maximize regret between a pair of learning agents. This creates feasible yet challenging environments, which exploit weaknesses in the agents to make them more robust. PAIRED significantly improves generalization to novel tasks.
Michael Dennis*, Natasha Jaques*, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine (2020). In Neural Information Processing Systems (NeurIPS). Oral (top 1% of submissions).
PDF · Cite · Code · Poster · Slides · Videos · NeurIPS Oral · Science article · Google AI Blog
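The regret objective is simple enough to sketch directly. The candidate environments and their returns below are invented numbers, and in the full algorithm all three agents learn concurrently rather than picking from a fixed menu:

```python
def regret(antagonist_return, protagonist_return):
    """PAIRED's adversary maximizes regret: the gap between the returns of a
    pair of learners. Unsolvable environments yield ~0 return for both agents
    (near-zero regret), so the adversary is steered toward environments that
    are feasible yet still challenging for the protagonist."""
    return antagonist_return - protagonist_return

# Invented (antagonist, protagonist) returns for three candidate environments.
candidates = {"maze": (0.9, 0.2), "impossible": (0.0, 0.0), "trivial": (1.0, 1.0)}
best_env = max(candidates, key=lambda k: regret(*candidates[k]))
```

Note how both the impossible and the trivial environment score zero regret, which is exactly the pressure toward "feasible yet challenging" described above.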
Social and Affective Machine Learning
My PhD Thesis spans both Social Reinforcement Learning and Affective Computing, investigating how affective and social intelligence can enhance machine learning algorithms, and how machine learning can enhance our ability to predict and understand human affective and social phenomena.
Natasha Jaques (2019). PhD thesis, Massachusetts Institute of Technology.
PDF · Cite · Thesis Defense · CV News write-up
Multimodal Autoencoder: A Deep Learning Approach to Filling in Missing Sensor Data and Enabling Better Mood Prediction
Predicting signals like stress and health depends on collecting noisy data from a number of modalities, e.g. smartphone data, or physiological data from a wrist-worn sensor. Our method can continue making accurate predictions even when a modality goes missing; for example, if the person forgets to wear their sensor.
Natasha Jaques, S. Taylor, A. Sano, R. Picard (2017). In International Conference on Affective Computing and Intelligent Interaction (ACII).
PDF · Cite · Code · Slides
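The inference-time pattern, though not the paper's architecture, can be shown on toy data: learn to reconstruct the full feature vector from a masked one, then use that map to fill in a missing modality. Here a least-squares fit stands in for the autoencoder, and the "sensor" modality is exactly 2x the "smartphone" modality so that it is recoverable; all of that is assumption for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(200, 1))
full = np.hstack([a, 2 * a])      # full input: [smartphone A, sensor B]
masked = full.copy()
masked[:, 1] = 0.0                # simulate the sensor missing during training

# Least-squares stand-in for the autoencoder: masked input -> full input.
W, *_ = np.linalg.lstsq(masked, full, rcond=None)

# At test time the person forgot their wrist sensor, so B is missing:
filled = np.array([[1.5, 0.0]]) @ W
```

Training with randomly masked modalities (modality dropout) is what lets the real model keep predicting when one input stream disappears.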
Personalized Multitask Learning for Predicting Tomorrow's Mood, Stress, and Health
Traditional, one-size-fits-all machine learning models fail to account for individual differences in predicting wellbeing outcomes like stress, mood, and health. Instead, we personalize models to the individual using multi-task learning (MTL), employing hierarchical Bayes, kernel-based and deep neural network MTL models to improve prediction accuracy by 13-23%.
Natasha Jaques*, S. Taylor*, E. Nosakhare, A. Sano, R. Picard (2017). In IEEE Transactions on Affective Computing (TAFFC) (Best Paper); NeurIPS Machine Learning for Healthcare (ML4HC) Workshop (Best Paper).
Cite · Code · Video · ML4HC Best Paper · TAFFC Journal Best Paper
Predicting Tomorrow’s Mood, Health, and Stress Level using Personalized Multitask Learning and Domain Adaptation
Modeling measures like mood, stress, and health using a monolithic machine learning model leads to low prediction accuracy. Instead, we develop personalized regression models using multi-task learning and Gaussian Processes, leading to dramatic improvements in next-day predictions.
Natasha Jaques, O. Rudovic, S. Taylor, A. Sano, R. Picard (2017). In Proceedings of Machine Learning Research.
PDF · Cite · Slides
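The personalization idea shared by the two papers above can be reduced to its simplest instance: treat each person as a task and shrink per-person estimates toward the population mean, in a hierarchical-Bayes flavor. The fixed `shrinkage` weight is a made-up constant for this sketch; the papers learn the sharing structure (hierarchical Bayes, kernels, Gaussian Processes, or deep MTL networks) instead.

```python
import numpy as np

def personalized_estimates(per_person_scores, shrinkage=0.5):
    """Blend each person's own mean with the pooled population mean, so that
    people with little or noisy data borrow strength from everyone else."""
    pooled = float(np.mean(np.concatenate(per_person_scores)))
    return [shrinkage * float(np.mean(s)) + (1 - shrinkage) * pooled
            for s in per_person_scores]

estimates = personalized_estimates([np.array([0.0, 0.0]), np.array([2.0, 2.0])])
```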
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
To combine supervised learning on data with reinforcement learning, we pre-train a supervised data prior, and penalize KL-divergence from this model using RL training. This enables effective learning of complex sequence-modeling problems for which we wish to match the data while optimizing external metrics like drug effectiveness. The approach produces compelling results in the disparate domains of music generation and drug discovery.
Natasha Jaques, S. Gu, D. Bahdanau, J. M. Hernandez-Lobato, R. E. Turner, D. Eck (2017). In International Conference on Machine Learning (ICML).
PDF · Cite · Code · ICML talk · Generated music · Magenta blog · MIT Tech Review article
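One common per-step form of the KL-control idea described above: reward the external metric, but penalize the policy for assigning its action a much higher log-probability than the pre-trained data prior does. The weight `c` and this exact decomposition are simplified relative to the paper:

```python
import math

def kl_regularized_reward(task_reward, log_pi, log_prior, c=1.0):
    """r(s, a) = r_task(s, a) - c * (log pi(a|s) - log p_prior(a|s)).
    Staying close to the prior costs nothing; drifting away is penalized
    in proportion to the log-probability gap."""
    return task_reward - c * (log_pi - log_prior)

# The policy puts probability 0.5 on an action the prior gives only 0.25,
# so a penalty of c * ln(2) is subtracted from the task reward of 1.0.
r = kl_regularized_reward(1.0, math.log(0.5), math.log(0.25))
```

Summed over a trajectory, this penalty is an expected KL-divergence from the prior, which is what keeps the fine-tuned generator from collapsing onto reward-hacking sequences that no longer resemble the data.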