Natasha Jaques
Offline RL
Human-Centric Dialog Training via Offline Reinforcement Learning
We train dialog models on interactive data from conversations with real humans, using a novel offline RL technique based on KL-control. Rather than relying on manual ratings, we learn from implicit signals such as sentiment, and show that this results in better performance.
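As a rough illustration of what "KL-control" generally means, the sketch below penalizes each step's reward by how far the trained policy's log-probability drifts from a fixed prior model's log-probability for the same action. This is one common textbook form of a KL-regularized reward, not necessarily the formulation used in the paper; the function name `kl_penalized_rewards`, the weight `beta`, and the toy numbers are all hypothetical.

```python
import math


def kl_penalized_rewards(rewards, policy_logprobs, prior_logprobs, beta=0.1):
    """Return per-step rewards with a KL-style penalty subtracted.

    Assumption (not from the source): the penalty for each step is
    beta * (log pi(a|s) - log p(a|s)), a single-sample estimate of the
    KL divergence between the policy pi and a fixed prior p when the
    action a was sampled from pi.
    """
    if not (len(rewards) == len(policy_logprobs) == len(prior_logprobs)):
        raise ValueError("all inputs must have the same length")
    return [
        r - beta * (log_pi - log_p)
        for r, log_pi, log_p in zip(rewards, policy_logprobs, prior_logprobs)
    ]


# Toy example: an implicit signal (e.g. a sentiment score) arrives only at
# the last step, while the policy is more confident than the prior throughout.
rewards = [0.0, 0.0, 1.0]
policy_logprobs = [math.log(0.5), math.log(0.5), math.log(0.5)]
prior_logprobs = [math.log(0.25), math.log(0.25), math.log(0.25)]
shaped = kl_penalized_rewards(rewards, policy_logprobs, prior_logprobs, beta=0.1)
```

With identical policy and prior log-probabilities the penalty vanishes and the rewards pass through unchanged; a larger `beta` keeps the policy closer to the prior, which is the usual motivation for this kind of regularization when learning from a fixed batch of data.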
Natasha Jaques*, J. H. Shen*, A. Ghandeharioun, C. Ferguson, A. Lapedriza, N. Jones, S. Gu, R. Picard
In Empirical Methods in Natural Language Processing (EMNLP), 2020