Wakeword Detection under Distribution Shifts

We propose a novel approach for semi-supervised learning (SSL) designed to overcome distribution shifts between training and real-world data arising in the keyword spotting (KWS) task. Shifts away from the training data distribution are a key challenge for real-world KWS tasks: when a new model is deployed on device, the gating of the accepted data undergoes a shift in distribution, which makes timely updates via subsequent deployments hard. Despite the shift, we assume that the marginal distributions on labels do not change. We utilize a modified teacher/student training framework, where labeled training data is augmented with unlabeled data. Note that the teacher does not have access to the new distribution either. To train effectively with a mix of human- and teacher-labeled data, we develop a teacher labeling strategy based on confidence heuristics to reduce entropy on the label distribution from the teacher model; the data is then sampled to match the marginal distribution on the labels. Large-scale experimental results show that a convolutional neural network (CNN) trained on far-field audio, and evaluated on far-field audio drawn from a different distribution, obtains a 14.3 [...] false discovery rate (FDR) at equal false reject rate (FRR), while yielding a 5 [...]. [...] distribution shift from far-field to near-field audio with a smaller fully connected network (FCN), our approach achieves a 52 [...] at equal FRR, while yielding a 20 [...] distribution.
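
The teacher labeling step described above can be illustrated with a minimal sketch. The abstract does not specify the confidence heuristics, so everything here is an assumption made for illustration: the function names `confident_pseudo_labels` and `sample_to_match_prior`, the thresholds `low` and `high`, and the simple subsampling rule are hypothetical and are not the authors' implementation. The first function keeps only unlabeled examples on which the teacher is confident, which lowers the entropy of the teacher's label distribution; the second subsamples the result so that the fraction of positives matches an assumed label marginal.

```python
import random


def confident_pseudo_labels(scores, low=0.1, high=0.9):
    """Keep only examples the (hypothetical) teacher is confident about.

    scores: teacher posteriors P(wakeword | audio), each in [0, 1].
    Returns (index, label) pairs; examples with low < score < high are dropped.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    labeled = []
    for i, p in enumerate(scores):
        if p >= high:
            labeled.append((i, 1))  # confident positive
        elif p <= low:
            labeled.append((i, 0))  # confident negative
    return labeled


def sample_to_match_prior(labeled, pos_fraction, seed=0):
    """Subsample pseudo-labeled data so positives make up about pos_fraction."""
    if not 0.0 < pos_fraction < 1.0:
        raise ValueError("pos_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    pos = [x for x in labeled if x[1] == 1]
    neg = [x for x in labeled if x[1] == 0]
    # Largest total size for which both classes have enough examples.
    n_total = int(min(len(pos) / pos_fraction, len(neg) / (1.0 - pos_fraction)))
    n_pos = min(round(n_total * pos_fraction), len(pos))
    n_neg = min(n_total - n_pos, len(neg))
    sampled = rng.sample(pos, n_pos) + rng.sample(neg, n_neg)
    rng.shuffle(sampled)
    return sampled


# Toy teacher scores for eight unlabeled utterances.
scores = [0.95, 0.5, 0.02, 0.99, 0.05, 0.08, 0.6, 0.01]
labeled = confident_pseudo_labels(scores)
balanced = sample_to_match_prior(labeled, pos_fraction=0.5)
```

In this toy run the two ambiguous utterances (scores 0.5 and 0.6) are discarded, and the remaining confident examples are subsampled to an assumed 50/50 label marginal before being mixed with human-labeled data for student training.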


