Institute for Pure and Applied Mathematics (IPAM) Mathematics of Intelligence workshop


Social learning helps humans and animals rapidly adapt to new circumstances, and drives the emergence of complex learned behaviors. This talk focuses on how Reinforcement Learning (RL) agents can benefit from social learning in multi-agent environments. First, I will demonstrate how multi-agent training can be a useful tool for improving learning and generalization. I will present PAIRED, in which an adversary learns to construct training environments to maximize regret between a pair of learners, leading to the generation of a complex curriculum of environments that improves the learner’s zero-shot transfer to unknown, single-agent test tasks. Second, I will explore social learning in naturalistic multi-agent environments, in which other agents may have relevant knowledge but are not explicitly interested in teaching the RL agent (analogous to the autonomous driving setting). I will first show that traditional model-free RL algorithms do not benefit from social learning in such contexts, and cannot discover the optimal policy even when nearby agents are visibly following it. I will present a technique for enabling social learning, and demonstrate that agents that successfully engage in social learning can generalize better to new environments. I will then introduce an improved method for social learning, PsiPhi-learning, which leverages successor features to create a feedback loop in which individual RL experience improves the ability to model other agents, and modeling other agents improves the ability to take individual actions in RL. PsiPhi-learning improves over both traditional RL techniques and recent imitation learning techniques, flexibly benefiting from other agents when their behavior is relevant to the task at hand.

Natasha Jaques

My research is focused on Social Reinforcement Learning: developing algorithms that use insights from social learning to improve AI agents’ learning, generalization, coordination, and human-AI interaction.