Unsupervised Learning of KB Queries in Task Oriented Dialogs
Task-oriented dialog (TOD) systems converse with users to accomplish a specific task. This task requires the system to query a knowledge base (KB) and use the retrieved results to fulfil user needs. Predicting KB queries correctly is crucial: an incorrect query can lead to severe under-performance. KB queries are usually annotated in real-world datasets and are learnt using supervised approaches to achieve acceptable task completion. This need for query annotations prevents TOD systems from easily adapting to new domains. In this paper, we propose a novel problem of learning end-to-end TOD systems using dialogs that do not contain KB query annotations. Our approach first learns to predict the KB queries using reinforcement learning (RL) and then learns the end-to-end system using the predicted queries. However, predicting the correct query in TOD systems is uniquely plagued by correlated attributes: due to data bias, certain attributes always occur together in the KB. This prevents the RL system from generalising, and accuracy suffers as a result. We propose Correlated Attributes Resilient RL (CARRL), a modification to the RL gradient estimation, which mitigates the problem of correlated attributes and predicts KB queries better than existing weakly supervised approaches. Finally, we compare the performance of our end-to-end system trained using predicted queries to a system trained using annotated gold queries.
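To make the correlated-attributes problem concrete, the following is a minimal sketch and not the paper's implementation. The toy KB, the helper names `run_query` and `reward`, and the reward definition (1.0 when every entity mentioned in the later system response is retrieved) are all assumptions introduced for illustration; the abstract does not specify the reward or the CARRL gradient modification, so neither is reproduced here.

```python
from typing import Dict, List

# Hypothetical toy KB with a data bias: every "italian" restaurant
# also happens to be in the "north", so the two attributes are correlated.
KB: List[Dict[str, str]] = [
    {"name": "roma", "cuisine": "italian", "area": "north"},
    {"name": "napoli", "cuisine": "italian", "area": "north"},
    {"name": "bombay", "cuisine": "indian", "area": "south"},
]


def run_query(constraints: Dict[str, str]) -> List[str]:
    """Return the names of KB rows that match every attribute=value constraint."""
    return [
        row["name"]
        for row in KB
        if all(row.get(attr) == value for attr, value in constraints.items())
    ]


def reward(constraints: Dict[str, str], entities_in_response: List[str]) -> float:
    """Assumed weak-supervision reward (not from the paper): 1.0 if the query
    retrieves every entity the system response goes on to mention, else 0.0."""
    retrieved = set(run_query(constraints))
    return 1.0 if set(entities_in_response) <= retrieved else 0.0


# Suppose the user asked only for italian food and the response mentions "roma".
candidate_queries = {
    "intended": {"cuisine": "italian"},
    "spurious": {"area": "north"},
    "over_specified": {"cuisine": "italian", "area": "north"},
}
rewards = {
    label: reward(query, ["roma"]) for label, query in candidate_queries.items()
}

for label, value in rewards.items():
    print(label, value)
```

Under these assumptions all three candidate queries retrieve the same rows and earn the same reward, so a reward signal of this kind cannot tell the intended query from the spurious ones. This is the kind of ambiguity that the abstract says prevents a plain RL learner from generalising.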