Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

by José L. Balcázar, et al.

Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by bounds on some other parameter such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must also obey that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss actually correspond to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. Finally, we explore an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises.
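To make the two parameters named above concrete, the following minimal Python sketch computes support and confidence over a small transaction dataset. The function names `support` and `confidence`, the toy `transactions` list, and the choice to return `None` for an antecedent that never occurs are all assumptions made for illustration; none of them come from the paper.

```python
from fractions import Fraction


def support(dataset, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in dataset if itemset <= t)
    return Fraction(hits, len(dataset))


def confidence(dataset, antecedent, consequent):
    """Empirical conditional probability of the consequent given the antecedent.

    Returns None when no transaction contains the antecedent
    (an illustrative convention, not one taken from the paper).
    """
    ante = frozenset(antecedent)
    both = ante | frozenset(consequent)
    n_ante = sum(1 for t in dataset if ante <= t)
    if n_ante == 0:
        return None
    n_both = sum(1 for t in dataset if both <= t)
    return Fraction(n_both, n_ante)


# Hypothetical toy dataset: each transaction is the set of items it contains.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
]

print(support(transactions, {"a", "b"}))            # 3/5
print(confidence(transactions, {"a"}, {"b"}))       # 3/4
print(confidence(transactions, {"a"}, {"b", "c"}))  # 1/2
```

The last two lines hint at the simplest kind of redundancy. Every transaction that contains a, b, and c also contains a and b, so in any dataset the confidence of a → bc is at most that of a → b. Whenever a → bc meets a confidence threshold, a → b therefore meets it too, which matches the "any dataset in which this first rule holds must also obey that second rule" pattern described in the abstract.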



