A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets

by   Zvi Kons, et al.

Intent classifiers are vital to the successful operation of virtual agent systems. This is especially so in voice activated systems where the data can be noisy with many ambiguous directions for user intents. Before operation begins, these classifiers are generally lacking in real-world training data. Active learning is a common approach used to help label large amounts of collected user input. However, this approach requires many hours of manual labeling work. We present the Nearest Neighbors Scores Improvement (NNSI) algorithm for automatic data selection and labeling. The NNSI reduces the need for manual labeling by automatically selecting highly-ambiguous samples and labeling them with high accuracy. This is done by integrating the classifier's output from a semantically similar group of text samples. The labeled samples can then be added to the training set to improve the accuracy of the classifier. We demonstrated the use of NNSI on two large-scale, real-life voice conversation systems. Evaluation of our results showed that our method was able to select and label useful samples with high accuracy. Adding these new samples to the training data significantly improved the classifiers and reduced error rates by up to 10


page 1

page 2

page 3

page 4


Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification

Privacy policies are statements that notify users of the services' data ...

Semi-supervised Interactive Intent Labeling

Building the Natural Language Understanding (NLU) modules of task-orient...

Intent Mining from past conversations for Conversational Agent

Conversational systems are of primary interest in the AI community. Chat...

HERALD: An Annotation Efficient Method to Detect User Disengagement in Social Conversations

Open-domain dialog systems have a user-centric goal: to provide humans w...

Know What Not To Know: Users' Perception of Abstaining Classifiers

Machine learning systems can help humans to make decisions by providing ...

Grounding Predicates through Actions

Symbols representing abstract states such as "dish in dishwasher" or "cu...

Data Augmentation for Intent Classification with Off-the-shelf Large Language Models

Data augmentation is a widely employed technique to alleviate the proble...

Please sign up or login with your details

Forgot password? Click here to reset