Uncovering Drift in Textual Data: An Unsupervised Method for Detecting and Mitigating Drift in Machine Learning Models

09/07/2023
by Saeed Khaki, et al.

Drift in machine learning is the phenomenon in which the statistical properties of the data, or of the context in which a model operates, change over time and degrade the model's performance. Continuous monitoring of model performance is therefore crucial for catching regressions before they affect users. Supervised drift detection methods, however, require human annotation, which lengthens the time needed to detect and mitigate drift. Our proposed unsupervised drift detection method follows a two-step process. In the first step, we encode a sample of production data as the target distribution and the model training data as the reference distribution. In the second step, we apply a kernel-based statistical test that uses the maximum mean discrepancy (MMD) distance to compare the reference and target distributions and estimate any potential drift. Our method also identifies the subset of production data that is the root cause of the drift. Models retrained on these identified high-drift samples show improved performance on online customer experience quality metrics.
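
As a rough illustration of the second step, the following is a minimal sketch of a kernel two-sample test based on MMD, not the authors' implementation. It assumes the reference and target texts have already been encoded into fixed-length vectors by some encoder (the abstract does not say which one), and it makes several choices that are assumptions of this sketch: an RBF kernel, a median-heuristic bandwidth, the biased MMD estimator, and a permutation test for the p-value. The `witness_scores` helper is likewise a hypothetical way to rank production samples by how strongly they contribute to the discrepancy; the paper's actual root-cause procedure may differ.

```python
import numpy as np


def sq_dists(a, b):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    d = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by rounding


def median_gamma(z):
    # Median heuristic (an assumption of this sketch): gamma = 1 / median squared distance.
    d = sq_dists(z, z)
    med = np.median(d[np.triu_indices_from(d, k=1)])
    return 1.0 / med if med > 0 else 1.0


def rbf_kernel(a, b, gamma):
    return np.exp(-gamma * sq_dists(a, b))


def mmd2_from_kernel(k, n):
    # Biased estimate of squared MMD; the first n rows/columns of k are the reference sample.
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def mmd_drift_test(reference, target, n_permutations=200, seed=0):
    # Returns (squared MMD estimate, permutation p-value, gamma used).
    rng = np.random.default_rng(seed)
    z = np.vstack([reference, target])
    n = len(reference)
    gamma = median_gamma(z)
    k = rbf_kernel(z, z, gamma)
    observed = mmd2_from_kernel(k, n)
    exceed = 0
    for _ in range(n_permutations):
        p = rng.permutation(len(z))  # shuffle sample labels under the no-drift hypothesis
        if mmd2_from_kernel(k[np.ix_(p, p)], n) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value, gamma


def witness_scores(reference, target, gamma):
    # Hypothetical per-sample drift score: empirical MMD witness function evaluated
    # at each target point. Larger values mean the point lies where the target
    # sample is dense and the reference sample is sparse.
    return rbf_kernel(target, target, gamma).mean(1) - rbf_kernel(target, reference, gamma).mean(1)


if __name__ == "__main__":
    # Synthetic stand-ins for encoded text; real embeddings would come from an encoder.
    rng = np.random.default_rng(1)
    reference = rng.normal(0.0, 1.0, size=(100, 8))
    target = rng.normal(1.0, 1.0, size=(100, 8))
    mmd2, p_value, gamma = mmd_drift_test(reference, target)
    print(f"MMD^2 = {mmd2:.4f}, p-value = {p_value:.4f}")
    top = np.argsort(witness_scores(reference, target, gamma))[::-1][:10]
    print("indices of highest-scoring target samples:", top)
```

A small p-value suggests that the production sample is distributed differently from the training data. The highest-scoring target samples are then natural candidates for inspection or for inclusion in a retraining set, in the spirit of the high-drift samples described above.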


