Finding Good Itemsets by Packing Data

02/06/2019
by Nikolaj Tatti, et al.

The problem of selecting small groups of itemsets that represent the data well has recently gained a lot of attention. We approach the problem by searching for the itemsets that compress the data efficiently. As a compression technique we use decision trees combined with a refined version of MDL. More formally, assuming that the items are ordered, we create a decision tree for each item that may depend only on the previous items. Our approach allows us to find complex interactions between the attributes, not just co-occurrences of 1s. Further, we present a link between the itemsets and the decision trees and use this link to export the itemsets from the decision trees. In this paper we present two algorithms. The first is a simple greedy approach that builds a family of itemsets directly from the data. The second, given a collection of candidate itemsets, selects a small subset of these itemsets. Our experiments show that these approaches result in compact, high-quality descriptions of the data.
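The core idea of encoding each item with a tree over earlier items can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes depth-1 trees (stumps) only, uses the empirical entropy as a stand-in for the refined MDL score, and charges a hypothetical model cost of log2(number of rows) bits per split. The names `entropy_bits`, `column_cost`, `pack`, and `split_penalty` are invented for this illustration.

```python
import math


def entropy_bits(ones, total):
    """Bits to encode `total` binary values containing `ones` 1s.

    Uses the plug-in entropy, a simplification of a real MDL code length.
    """
    if total == 0 or ones == 0 or ones == total:
        return 0.0
    p = ones / total
    return -total * (p * math.log2(p) + (1 - p) * math.log2(1 - p))


def column_cost(rows, target, split=None):
    """Cost of column `target`, optionally after splitting on item `split`."""
    if split is None:
        return entropy_bits(sum(r[target] for r in rows), len(rows))
    cost = 0.0
    for value in (0, 1):
        branch = [r for r in rows if r[split] == value]
        cost += entropy_bits(sum(r[target] for r in branch), len(branch))
    return cost


def pack(rows, split_penalty=None):
    """Pick, for each item, the cheapest stump over earlier items only.

    Returns the total cost in bits and the itemsets read off the chosen
    splits, here simply the pair (split item, target item).
    """
    if split_penalty is None:
        # Assumed model cost per split; not taken from the paper.
        split_penalty = math.log2(len(rows))
    total, itemsets = 0.0, []
    for target in range(len(rows[0])):
        best_cost, best_split = column_cost(rows, target), None
        for split in range(target):  # only items earlier in the order
            cost = column_cost(rows, target, split) + split_penalty
            if cost < best_cost:
                best_cost, best_split = cost, split
        total += best_cost
        if best_split is not None:
            itemsets.append((best_split, target))
    return total, itemsets


# Toy data: item 1 copies item 0, item 2 is unrelated to both.
rows = [(1, 1, 0), (1, 1, 1), (0, 0, 0), (0, 0, 1),
        (1, 1, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0)]
total_bits, found = pack(rows)
print(found)
```

On this toy data only the split of item 1 on item 0 pays for its penalty, so the sketch reports the single itemset `(0, 1)`. Because a split branches on both values of the earlier item, a stump can also capture patterns in which one item excludes another, which is the kind of interaction beyond co-occurring 1s that the abstract mentions.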


