Leveraging User Engagement Signals For Entity Labeling in a Virtual Assistant

09/18/2019
by Deepak Muralidharan, et al.

Personal assistant AI systems such as Siri, Cortana, and Alexa have become widely used as a means to accomplish tasks through natural language commands. However, components in these systems generally rely on supervised machine learning algorithms that require large amounts of hand-annotated training data, which is expensive and time-consuming to collect. The ability to incorporate unsupervised, weakly supervised, or distantly supervised data holds significant promise in overcoming this bottleneck. In this paper, we describe a framework that leverages user engagement signals (user behaviors that demonstrate a positive or negative response to content) to automatically create granular entity labels for training data augmentation. Strategies such as multi-task learning and validation using an external knowledge base are employed to incorporate the engagement-annotated data and to boost the model's accuracy on a sequence labeling task. Our results show that learning from data automatically labeled by user engagement signals achieves significant accuracy gains in a production deep learning system, when measured both on the sequence labeling task and on user-facing results produced by the system end-to-end. We believe this is the first use of user engagement signals to help generate training data for a sequence labeling task on a large scale, and that the approach can be applied in practical settings to speed up new feature deployment when little human-annotated data is available.
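
To make the idea of engagement-based labeling concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each logged interaction carries a candidate entity span, a proposed entity type, and a boolean engagement flag; the `Interaction` class, the `weak_label` function, the `"artist"` type, and the toy knowledge base are all hypothetical names introduced for illustration. The sketch keeps only positively engaged interactions whose entity surface form also appears in the knowledge base, and emits BIO tags for a sequence labeler.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Interaction:
    """One logged request (hypothetical schema, assumed for this sketch)."""
    tokens: List[str]        # tokenized user utterance
    span: Tuple[int, int]    # [start, end) token indices of the candidate entity
    entity_type: str         # proposed granular label, e.g. "artist"
    engaged: bool            # True if the user responded positively to the result


def weak_label(
    interaction: Interaction,
    knowledge_base: Dict[str, Set[str]],
) -> Optional[List[str]]:
    """Return BIO tags for the utterance, or None if the example is discarded."""
    # Assumption: only positive engagement is trusted as a labeling signal here.
    if not interaction.engaged:
        return None

    start, end = interaction.span
    if not 0 <= start < end <= len(interaction.tokens):
        return None  # malformed span

    # Validate the surface form against the external knowledge base.
    surface = " ".join(interaction.tokens[start:end]).lower()
    if surface not in knowledge_base.get(interaction.entity_type, set()):
        return None

    tags = ["O"] * len(interaction.tokens)
    tags[start] = f"B-{interaction.entity_type}"
    for i in range(start + 1, end):
        tags[i] = f"I-{interaction.entity_type}"
    return tags


# Toy usage with made-up data.
kb = {"artist": {"taylor swift"}}
example = Interaction(["play", "taylor", "swift"], (1, 3), "artist", engaged=True)
print(weak_label(example, kb))
```

Discarding negatively engaged interactions is a simplification made for this sketch; the abstract notes that engagement signals can indicate either a positive or a negative response, and does not say how negative signals are used.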


