arxiv:2202.13675

A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-Oriented Dialogue Policy Learning

Published on Feb 28, 2022
Abstract

AI-generated summary: The paper reviews recent RL advancements and challenges in dialogue policy learning for task-oriented dialogue systems.

Dialogue Policy Learning is a key component in a task-oriented dialogue system (TDS) that decides the next action of the system given the dialogue state at each turn. Reinforcement Learning (RL) is commonly chosen to learn the dialogue policy, regarding the user as the environment and the system as the agent. Many benchmark datasets and algorithms have been created to facilitate the development and evaluation of dialogue policy based on RL. In this paper, we survey recent advances and challenges in dialogue policy from the perspective of RL. More specifically, we identify the major problems and summarize corresponding solutions for RL-based dialogue policy learning. In addition, we provide a comprehensive survey of applying RL to dialogue policy learning by categorizing recent methods into the basic elements of RL. We believe this survey can shed light on future research in dialogue management.
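
To make the framing in the abstract concrete (the user acts as the RL environment, the system as the agent, and the policy maps the current dialogue state to the next system action), here is a minimal, hypothetical Python sketch. The toy user simulator, action set, reward scheme, and all names in it are illustrative assumptions, not methods from the paper.

```python
import random

# Toy illustration of the RL framing: the user (simulator) plays the role of
# the environment, the dialogue policy plays the role of the agent, and the
# policy chooses the next system action from the current dialogue state.

SYSTEM_ACTIONS = ["request_slot", "inform_slot", "confirm", "close"]  # assumed action set

class ToyUserSimulator:
    """Stands in for the environment: reacts to a system action and returns
    a new dialogue state, a reward, and a done flag."""
    def __init__(self, max_turns=10):
        self.max_turns = max_turns
        self.turn = 0

    def reset(self):
        self.turn = 0
        return {"turn": 0, "slots_filled": 0}

    def step(self, action):
        self.turn += 1
        slots = min(3, self.turn)  # pretend slots get filled as the dialogue proceeds
        done = action == "close" or self.turn >= self.max_turns
        # Assumed reward shape: task-success bonus when closing with all slots
        # filled, plus a small per-turn penalty to encourage short dialogues.
        reward = (10.0 if action == "close" and slots == 3 else 0.0) - 1.0
        return {"turn": self.turn, "slots_filled": slots}, reward, done

class RandomPolicy:
    """Stands in for the agent: decides the next system action given the state."""
    def act(self, state):
        return random.choice(SYSTEM_ACTIONS)

def run_episode(policy, env):
    state, total, done = env.reset(), 0.0, False
    while not done:
        action = policy.act(state)
        state, reward, done = env.step(action)
        total += reward
    return total

if __name__ == "__main__":
    print("episode return:", run_episode(RandomPolicy(), ToyUserSimulator()))
```

In the setting the survey covers, the random policy above would be replaced by a learned one (for example, a Q-network or policy network optimized from such episode returns), while the interaction loop keeps the same state-action-reward structure.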
