On the Nature and Types of Anomalies: A Review

07/30/2020
by Ralph Foorthuis, et al.

Anomalies are occurrences in a dataset that are in some way unusual and do not fit the general patterns. The concept of the anomaly is generally ill-defined and perceived as vague and domain-dependent. Moreover, no comprehensive and concrete overviews of the different types of anomalies have hitherto been published. By means of an extensive literature review, this study therefore offers the first theoretically principled and domain-independent typology of data anomalies, and presents a full overview of anomaly types and subtypes. To concretely define the concept of the anomaly and its different manifestations, the typology employs four dimensions: data type, cardinality of relationship, data structure and data distribution. These fundamental and data-centric dimensions naturally yield 3 broad groups, 9 basic types and 61 subtypes of anomalies. The typology facilitates the evaluation of the functional capabilities of anomaly detection algorithms, contributes to explainable data science, and provides insights into relevant topics such as local versus global anomalies.

research
07/04/2021

A Typology of Data Anomalies

Anomalies are cases that are in some way unusual and do not appear to fi...
research
08/27/2020

The Impact of Discretization Method on the Detection of Six Types of Anomalies in Datasets

Anomaly detection is the process of identifying cases, or groups of case...
research
04/25/2020

Urban Anomaly Analytics: Description, Detection, and Prediction

Urban anomalies may result in loss of life or property if not handled pr...
research
08/02/2023

LLMs Understand Glass-Box Models, Discover Surprises, and Suggest Repairs

We show that large language models (LLMs) are remarkably good at working...
research
04/03/2020

Using Large-Scale Anomaly Detection on Code to Improve Kotlin Compiler

In this work, we apply anomaly detection to source code and bytecode to ...
research
08/16/2020

SECODA: Segmentation- and Combination-Based Detection of Anomalies

This study introduces SECODA, a novel general-purpose unsupervised non-p...
research
12/06/2018

Climate Anomalies vs Air Pollution: Carbon Emissions and Anomaly Networks

This project aims to shed light on how man-made carbon emissions are aff...
