The Semantic Adjacency Criterion in Time Intervals Mining

01/11/2021
by Alexander Shknevsky, et al.

Frequent temporal patterns discovered in time-interval-based multivariate data, although syntactically correct, might be non-transparent: for some pattern instances, there might exist intervals for the same entity that contradict the pattern's usual meaning. We conjecture that non-transparent patterns are also less useful as classification or prediction features. We propose a new pruning constraint for the frequent temporal-pattern discovery process, the Semantic Adjacency Criterion (SAC), which exploits domain knowledge to filter out patterns that contain potentially semantically contradictory components. We have defined three SAC versions, embedded them in a frequent-temporal-pattern discovery framework, and tested their effect in three medical domains. Previously, we had informally presented the SAC principle and showed that using it to prune patterns enhances the repeatability of their discovery in the same clinical domain. Here, we formally define the semantics of the three SAC variations, and compare the use of the set of pruned patterns to the use of the complete set of discovered patterns, as features for classification and prediction tasks in three different medical domains. We induced four classifiers for each task, using four machine-learning methods: Random Forests, Naive Bayes, SVM, and Logistic Regression. The features were the frequent temporal patterns discovered in each data set. SAC-based temporal-pattern discovery reduced the number of discovered patterns by up to 97% and the corresponding computational effort by up to 98%, while the classification and prediction performance of the reduced, SAC-based feature set was as good as when using the complete set. Using SAC can thus significantly reduce the number of discovered frequent interval-based temporal patterns, and the corresponding computational effort, without losing classification or prediction performance.
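The abstract does not give the formal SAC definitions, so the following is only an illustrative sketch of the underlying idea of a "non-transparent" pattern instance, not the authors' criterion. It assumes a hypothetical `Interval` record (concept, abstracted value, start, end) and a hypothetical helper `is_transparent`, which rejects a pattern instance when the same entity's record contains, between two consecutive pattern components, an interval of one of their concepts with a different value.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    """Hypothetical symbolic time interval for one entity (e.g. one patient)."""
    concept: str  # e.g. "glucose"
    value: str    # abstracted state, e.g. "high"
    start: int
    end: int


def is_transparent(instance: List[Interval], record: List[Interval]) -> bool:
    """Assumed check, not the paper's formal SAC.

    Returns False if, in the gap between two consecutive components of the
    pattern instance, the entity's record holds an interval of one of those
    components' concepts but with a different value.
    """
    ordered = sorted(instance, key=lambda iv: iv.start)
    for first, second in zip(ordered, ordered[1:]):
        for other in record:
            if other in instance:
                continue
            lies_between = other.start >= first.end and other.end <= second.start
            expected = {iv.value for iv in (first, second)
                        if iv.concept == other.concept}
            # 'expected' is empty when the concept is unrelated to the pair.
            if lies_between and expected and other.value not in expected:
                return False
    return True


# Pattern instance: "glucose high" before "hemoglobin low".
glucose_high = Interval("glucose", "high", 0, 5)
glucose_low = Interval("glucose", "low", 6, 8)
hemoglobin_low = Interval("hemoglobin", "low", 10, 15)
instance = [glucose_high, hemoglobin_low]

# The intervening "glucose low" interval contradicts the pattern's usual reading.
print(is_transparent(instance, [glucose_high, glucose_low, hemoglobin_low]))  # False
print(is_transparent(instance, [glucose_high, hemoglobin_low]))               # True
```

A pruning step in this spirit would keep a candidate pattern only if enough of its instances pass such a check. How that is decided, and how the three SAC versions differ, is specified in the full text rather than in the abstract.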


