## Annotated Bibliography on Association Rule Mining by Michael Hahsler

> Research on Association Rules. Contains 100 references.

Latest update: Fri Jan 6 10:34:45 2017

This page contains my annotated bibliography on Association Rules.
The annotation is kept very concise and only serves as a guide to
what paper one should read to find information on a certain aspect
of association rule mining. The bibliography is not complete or
comprehensive, but it should cover the milestones in mainstream
research efforts. The bibliography is organized as follows:

- **Mining association rules (using support and confidence):** Algorithms which directly find frequent itemsets.
- **Alternative interest measures (besides confidence):** Other measures of interest to remedy shortcomings of the support-confidence framework.
- **Mining rules without or with a variable support threshold:** Techniques which deal with the shortcomings of support (e.g., the rare item problem).
- **Constraint-based mining:** Algorithms which use additional constraints (e.g., presence of an item or max number of rules) to reduce the search space.
- **Mining sequential, generalized, quantitative or causal rules:** Deals with special types of rules.
- **Concise representations of frequent itemsets (closed, maximal, etc.):** Focuses on closed and maximal frequent itemsets, which are typically orders of magnitude fewer than all frequent itemsets. However, all frequent itemsets can be derived from them, so algorithms that mine closed and maximal frequent itemsets are often more efficient.
- **Using association rules for classification:** Association rules are used to discover classification rules or to build classifiers.
- **Evolution of association rules over time:** Methods to find changes in data by analyzing changes in association rules found in the data (typically by splitting the data set into several parts and comparing the association rules found in the individual parts).
- **Theoretic considerations and sampling:** Discusses theoretic issues of rule mining (e.g., properties of lattices and measures, sampling, clustering).
- **Evaluation and efficient implementation of rule mining algorithms:** Deals with the comparison of algorithms, generation of synthetic data and the implementation issues of mining algorithms.

## Mining association rules (using support and confidence)

- [AIS93]
- R. Agrawal, T. Imielinski, and A. Swami. Mining
association rules between sets of items in large databases. In
*Proceedings of the ACM SIGMOD International Conference on Management of Data*, pages 207--216, Washington D.C., May 1993. Introduces association rules and the SUPPORT-CONFIDENCE framework and an algorithm to mine large itemsets. The algorithm is sometimes called AIS after the authors' initials.

- [MTV94]
- Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo.
Efficient algorithms for discovering association rules. In
Usama M. Fayyad and Ramasamy Uthurusamy, editors,
*AAAI Workshop on Knowledge Discovery in Databases (KDD-94)*, pages 181--192, Seattle, Washington, 1994. AAAI Press. Develops improvements to candidate generation similar to those in APRIORI. Itemsets with sufficient support are called covering sets. The paper also introduces sampling from the database and gives bounds for the resulting estimate of support.

- [AS94]
- Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for
mining association rules in large databases. In Jorge B.
Bocca, Matthias Jarke, and Carlo Zaniolo, editors,
*Proceedings of the 20th International Conference on Very Large Data Bases, VLDB*, pages 487--499, Santiago, Chile, September 1994. Introduces the APRIORI algorithm (the best-known algorithm; it uses a breadth-first search strategy to count the support of itemsets). The algorithm uses an improved candidate generation function which exploits the downward closure property of support and makes it more efficient than AIS. The paper also presents an algorithm to generate synthetic transaction data. Such synthetic transaction data are widely used for the evaluation and comparison of new algorithms.

- [SON95]
- Ashok Savasere, Edward Omiecinski, and Shamkant Navathe. An
efficient algorithm for mining association rules in large
databases. In
*Proceedings of the 21st VLDB Conference*, pages 432--443, Zurich, Switzerland, 1995. Introduction of the PARTITION algorithm. The database is scanned only twice. For the first scan the DB is partitioned and in each partition support is counted. Then the counts are merged to generate potential large itemsets. In the second scan the potential large itemsets are counted to find the actual large itemsets.

- [Toi96]
- Hannu Toivonen. Sampling large databases for association rules.
In
*VLDB '96: Proceedings of the 22th International Conference on Very Large Data Bases*, pages 134--145, San Francisco, CA, USA, 1996. Morgan Kaufmann Publishers Inc. Finds frequent itemsets in a random sample of a database (that fits into main memory) and then verifies the found frequent itemsets in the database.

- [HGN00]
- Jochen Hipp, Ulrich Güntzer, and Gholamreza Nakhaeizadeh.
Algorithms for association rule mining -- A general survey and
comparison.
*SIGKDD Explorations*, 2(2):1--58, 2000. Describes the fundamentals of association rule mining and presents a systematization of existing algorithms.

- [Zak00]
- Mohammed J. Zaki. Scalable algorithms for association
mining.
*IEEE Transactions on Knowledge and Data Engineering*, 12(3):372--390, May/June 2000. Introduces six new algorithms combining several features (database format, the decomposition technique, and the search procedure). Includes Eclat (Equivalence CLAss Transformation), MaxEclat, Clique, MaxClique, TopDown, and AprClique. ECLAT is a well-known depth-first search algorithm using set intersection.

- [OLP+03]
- Salvatore Orlando, Claudio Lucchese, Paolo Palmerini, Raffaele
Perego, and Fabrizio Silvestri. kdci: a multi-strategy algorithm
for mining frequent sets. In Bart Goethals and Mohammed J.
Zaki, editors,
*FIMI'03: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations*, November 2003. Introduces the kDCI algorithm.

- [HPYM04]
- Jiawei Han, Jian Pei, Yiwen Yin, and Runying Mao. Mining
frequent patterns without candidate generation.
*Data Mining and Knowledge Discovery*, 8:53--87, 2004. Describes the data mining method FP-growth (frequent pattern growth) which uses an extended prefix-tree (FP-tree) structure to store the database in a compressed form. FP-growth adopts a divide-and-conquer approach to decompose both the mining tasks and the databases. It uses a pattern fragment growth method to avoid the costly process of candidate generation and testing.

- [CGL04]
- Frans Coenen, Graham Goulbourne, and Paul Leng. Tree structures
for mining association rules.
*Data Mining and Knowledge Discovery*, 8:25--51, 2004. Describes how to compute PARTIAL SUPPORT COUNTS in one DB-pass and how to store them in an enumeration tree (P-Tree).

- [HCXY07]
- J. Han, H. Cheng, D. Xin, and X. Yan.
Frequent pattern mining: Current status and future directions.
*Data Mining and Knowledge Discovery*, 14(1), 2007. Gives a complete overview of the state of the art in frequent pattern mining and identifies future research directions.
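
The level-wise search summarized in the [AS94] entry can be illustrated with a small sketch. This is a minimal toy version that assumes the data fit in memory, not the authors' implementation; the names `apriori` and `confidence` and the grocery transactions are made up for illustration. It shows the breadth-first search and how the downward closure property of support prunes candidates before they are counted.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    tsets = [frozenset(t) for t in transactions]
    n = len(tsets)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in tsets) / n

    # Level 1: frequent single items.
    level = {}
    for item in sorted({i for t in tsets for i in t}):
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s

    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (downward closure): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: keep the candidates that reach minimum support.
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
    return frequent


def confidence(frequent, lhs, rhs):
    """Confidence of lhs -> rhs; both lhs and lhs | rhs must be frequent."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    return frequent[lhs | rhs] / frequent[lhs]


# Hypothetical toy data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=0.5)
```

In this toy data the candidate {bread, milk, butter} is never counted, because its subset {milk, butter} is infrequent.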

## Alternative interest measures (besides confidence)

- [PS91]
- G. Piatetsky-Shapiro. Discovery, analysis, and
presentation of strong rules. In G. Piatetsky-Shapiro and W.J.
Frawley, editors,
*Knowledge Discovery in Databases*. AAAI/MIT Press, Cambridge, MA, 1991. Introduces the measure LEVERAGE, the simplest function that satisfies the author's principles for rule-interest functions (0 if the variables are statistically independent; monotonically increasing if the variables occur more often together; monotonically decreasing if one of the variables alone occurs more often).

- [BMUT97]
- Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, and Shalom
Tsur. Dynamic itemset counting and implication rules for market
basket data. In
*SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data*, pages 255--264, Tucson, Arizona, USA, May 1997. Introduces CONVICTION (as an improvement to confidence based on implication rules) and INTEREST (later called LIFT).

- [AY98]
- C. C. Aggarwal and P. S. Yu. A new framework for
itemset generation. In
*PODS 98, Symposium on Principles of Database Systems*, pages 18--24, Seattle, WA, USA, 1998. Points out weaknesses of the large frequent itemset method using support (spuriousness, dense datasets) and that lift gives only values close to one for items which are very frequent, even if they are perfectly positively correlated. COLLECTIVE STRENGTH is introduced. Collective strength uses the violation rate for an itemset, which is the fraction of transactions which contain some, but not all, items of the itemset. The violation rate is compared to the expected violation rate under independence. Collective strength is downward closed.

- [LHM99]
- Bing Liu, Wynne Hsu, and Yiming Ma. Pruning and summarizing the
discovered associations. In
*Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD-99)*, pages 125--134. ACM Press, 1999. Removes insignificant rules using the chi-square test to test for correlation between the antecedent and the consequent of a rule. Also DIRECTION SETTING (DS) RULES are introduced. A DS rule has a positively correlated antecedent and consequent and is not built from a rule with a shorter antecedent which is a DS rule. Normally, only a small and concise fraction of rules are DS rules.

- [BD01]
- Dario Bruzzese and Cristina Davino. Pruning of discovered
association rules.
*Computational Statistics*, 16:387--398, 2001. The authors construct several statistical tests to evaluate the significance of discovered associations.

- [BH03]
- Brock Barber and Howard J. Hamilton. Extracting share
frequent itemsets with infrequent subsets.
*Data Mining and Knowledge Discovery*, 7:153--185, 2003. ITEMSET SHARE is the fraction of some measure (e.g., sales, profit) contributed by the items in the set. An itemset is share frequent if it exceeds a threshold. Share frequency is not downward closed! The article presents several algorithms and heuristics to mine share frequent itemsets.

- [TKS04]
- Pang-Ning Tan, Vipin Kumar, and Jaideep Srivastava. Selecting
the right objective measure for association analysis.
*Information Systems*, 29(4):293--313, 2004. Compares the properties of 21 objective measures (of interest). In general, the measures do not agree with each other. However, the authors show that if support-based pruning or table standardization (of the contingency tables) is used, the measures become highly correlated.

- [Sch05]
- Tobias Scheffer. Finding association rules that trade support
optimally against confidence.
*Intelligent Data Analysis*, 9(4):381--395, 2005. Introduces predictive accuracy, which is the expected value of the confidence of a rule with respect to the process underlying the database. The author shows how predictive accuracy can be calculated from confidence and support measured on a data set using a Bayesian frequency correction (very simplified: confidence is discounted for rules with low support). Also an algorithm is presented which finds the top n most predictive association rules (redundant rules with a 0 predictive accuracy improvement are removed) and shows how to estimate the prior distribution needed for the correction.

- [BGBG05]
- Julien Blanchard, Fabrice Guillet, Henri Briand, and Regis
Gras. Assessing rule interestingness with a probabilistic measure
of deviation from equilibrium. In
*Proceedings of the 11th international symposium on Applied Stochastic Models and Data Analysis ASMDA-2005*, pages 191--200. ENST, 2005. Presents a statistical test for the deviation from the equilibrium of a rule. The equilibrium for rule a -> b is defined as: the number of transactions which contain a and b together is equal to the number of transactions which contain a and not b.

- [GH06]
- Liqiang Geng and Howard J. Hamilton. Interestingness
measures for data mining: A survey.
*ACM Computing Surveys*, 38(3):9, 2006.

- [Li06]
- Jiuyong Li. On optimal rule discovery.
*IEEE Transactions on Knowledge and Data Engineering*, 18(4):460--471, 2006. An optimal rule set (with respect to a metric of interestingness) contains all rules except those with no greater interestingness than one of its more general rules. An optimal rule set is a subset of a nonredundant rule set. The authors present an algorithm called ORD to find an optimal rule set. Classifiers built on optimal class association rules are at least as accurate as those built from CBA and C4.5 rules.

- [HH07]
- Michael Hahsler and Kurt Hornik. New probabilistic interest
measures for association rules.
*Intelligent Data Analysis*, 11(5):437--455, 2007. Presents a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. Uses such data and a real-world grocery database to explore the behavior of confidence and lift, two popular interest measures used for rule mining. Also introduces the new probabilistic measures hyper-lift and hyper-confidence.

- [WCH10]
- Tianyi Wu, Yuguo Chen, and Jiawei Han. Re-examination of
interestingness measures in pattern mining: a unified framework.
*Data Mining and Knowledge Discovery*, January 2010. Re-examines a set of null-invariant interestingness measures (AllConf, Coherence, Cosine, Kulc, MaxConf) and shows that they can be expressed as the generalized mathematical mean, leading to a total ordering of them. Also proposes a new measure called Imbalance Ratio.
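
Several of the measures named in this section can be computed directly from three supports. The sketch below uses the commonly cited definitions of lift, leverage and conviction; the function name `rule_measures` and the example numbers are assumptions made for illustration, not taken from any of the papers. The second call illustrates the observation attributed to [AY98] that lift stays close to one for very frequent items even when they always occur together.

```python
def rule_measures(supp_a, supp_b, supp_ab):
    """Interest measures for a rule A -> B, given relative supports in (0, 1]."""
    conf = supp_ab / supp_a
    # Lift: ratio of observed to expected co-occurrence under independence.
    lift = supp_ab / (supp_a * supp_b)
    # Leverage: difference between observed and expected co-occurrence.
    leverage = supp_ab - supp_a * supp_b
    # Conviction: (1 - supp(B)) / (1 - conf); unbounded when the rule never fails.
    conviction = float("inf") if conf == 1 else (1 - supp_b) / (1 - conf)
    return {
        "confidence": conf,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Statistically independent items: lift 1, leverage 0, conviction 1.
independent = rule_measures(supp_a=0.5, supp_b=0.4, supp_ab=0.2)

# Two very frequent items that always occur together: lift is only about 1.05.
always_together = rule_measures(supp_a=0.95, supp_b=0.95, supp_ab=0.95)
```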

## Mining rules without or with a variable minimum support threshold

- [BMS97]
- Sergey Brin, Rajeev Motwani, and Craig Silverstein. Beyond
market baskets: Generalizing association rules to correlations. In
*SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data*, pages 265--276, Tucson, Arizona, USA, May 1997. Proposes to use the chi-square test for correlation. For an itemset of length l, the test is carried out on an l-dimensional contingency table. Cells with low counts and multiple testing are problems.

- [SBM98]
- Craig Silverstein, Sergey Brin, and Rajeev Motwani. Beyond
market baskets: Generalizing association rules to dependence rules.
*Data Mining and Knowledge Discovery*, 2:39--68, 1998. Journal version of Brin et al. (1997).

- [LHM99]
- Bing Liu, Wynne Hsu, and Yiming Ma. Mining association rules
with multiple minimum supports. In
*Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD-99)*, pages 337--341. ACM Press, 1999. Adapts APRIORI to work with different minimum support thresholds assigned to different items (minimum item supports, MIS). To preserve the downward closure property of support, items are sorted by their MIS values.

- [LZD+99]
- Jinyan Li, Xiuzhen Zhang, Guozhu Dong, Kotagiri Ramamohanarao,
and Qun Sun. Efficient mining of high confidence association rules
without support thresholds. In J. Zytkow and J. Rauch,
editors,
*Principles of Data Mining and Knowledge Discovery PKDD'99, LNAI 1704, Prague, Czech Republic*, pages 406--411. Springer-Verlag, 1999. Uses JUMPING EMERGING PATTERNS to mine a border for top rules (rules with 100% confidence) for a given consequent. The drawbacks are that only one consequent is mined at a time and that finding rules with other than 100% confidence is difficult.

- [AEMT00]
- Khalil M. Ahmed, Nagwa M. El-Makky, and Yousry Taha.
A note on "Beyond market baskets: Generalizing association rules to
correlations".
*SIGKDD Explorations*, 1(2):46--48, 2000. A reply to Brin et al. (1997). The authors state that the chi-square test tests the whole contingency table, but for larger than 2x2 tables we want to test dependence for single cells.

- [WHC01]
- Ke Wang, Yu He, and David W. Cheung. Mining
confident rules without support requirement. In
*Proceedings of the tenth international conference on Information and knowledge management*, pages 89--96, New York, NY, 2001. ACM Press. The paper shows that for data with categorical attributes a UNIVERSAL-EXISTENTIAL UPWARD CLOSURE exists for confidence. With this property, algorithms with confidence-based pruning that use level-wise (from k to k-1) candidate generation are possible. The paper also discusses a disk-based implementation.

- [SK01]
- Masakazu Seno and George Karypis. Lpminer: An algorithm for
finding frequent itemsets using length decreasing support
constraint. In Nick Cercone, Tsau Young Lin, and Xindong Wu,
editors,
*Proceedings of the 2001 IEEE International Conference on Data Mining, 29 November -- 2 December 2001, San Jose, California, USA*, pages 505--512. IEEE Computer Society, 2001. To find longer frequent itemsets, the minimal support requirement decreases as a function of the itemset length. An algorithm based on the FP-tree is presented and a property called small valid extension (SVE) is introduced which makes mining efficient in the absence of downward closure.

- [DP01]
- William DuMouchel and Daryl Pregibon. Empirical Bayes screening
for multi-item associations. In F. Provost and
R. Srikant, editors,
*Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (KDD-01)*, pages 67--76. ACM Press, 2001. Searches for unusually frequent itemsets using statistical methods. First, the authors propose stratification of the data to avoid finding spurious associations within strata. Then the deviation of the observed frequency over a baseline frequency (based on independence) is used. Since the deviation is unreliable for low counts, an empirical Bayes model (its 95% confidence limit) is used to produce a posterior distribution of the true ratio of actual to baseline frequencies. The Bayes model gives ratios close to the observed ratios for large samples and reduces (shrinks) the ratio if the sample size gets small (to smooth away noise). For multi-item associations log-linear models are proposed to find higher order associations which cannot be explained by pairwise associations.

- [CDF+01]
- Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis,
Piotr Indyk, Rajeev Motwani, Jeffrey D. Ullman, and Cheng
Yang. Finding interesting associations without support pruning.
*IEEE Transactions on Knowledge and Data Engineering*, 13(1):64--78, 2001. Uses similarity measures between hashed values of rows in a transaction database. The approach in the paper was only shown for associations between two items.

- [TMF03]
- Feng Tao, Fionn Murtagh, and Mohsen Farid. Weighted association
rule mining using weighted support and significance framework. In
*Proceedings of The Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003)*, Washington, DC, 2003. ACM Press. Uses attributes of the items (e.g., price, page dwelling time) to WEIGHT SUPPORT. A support and significance framework is presented which possesses a weighted downward closure property important for pruning the search space.

- [Omi03]
- Edward R. Omiecinski. Alternative interest measures for
mining associations in databases.
*IEEE Transactions on Knowledge and Data Engineering*, 15(1):57--69, Jan/Feb 2003. Omiecinski introduced several alternatives to support. The first measure, ANY-CONFIDENCE, is defined as the confidence of the rule with the largest confidence which can be generated from an itemset. The author states that although finding all itemsets with a set any-confidence would enable us to find all rules with a given minimum confidence, any-confidence cannot be used efficiently as a measure of interestingness since confidence is not downward closed. The second introduced measure is ALL-CONFIDENCE. This measure is defined as the smallest confidence of all rules which can be produced from an itemset, i.e., all rules produced from an itemset will have a confidence greater than or equal to its all-confidence value. BOND, the last measure, is defined as the ratio of the number of transactions which contain all items of an itemset to the number of transactions which contain at least one of these items. Omiecinski showed that bond and all-confidence are downward closed and, therefore, can be used for efficient mining algorithms.

- [XTK03]
- Hui Xiong, Pang-Ning Tan, and Vipin Kumar. Mining strong
affinity association patterns in data sets with skewed support
distribution. In Bart Goethals and Mohammed J. Zaki, editors,
*Proceedings of the IEEE International Conference on Data Mining, November 19--22, 2003, Melbourne, Florida*, pages 387--394, November 2003. Support-based pruning strategies are not effective for data sets with skewed support distributions. The authors propose the concept of hyperclique pattern, which uses an objective measure called h-confidence (equal to all-confidence by Omiecinski, 2003) to identify strong affinity patterns. The generation of so-called cross-support patterns (patterns with items with substantially different support) is avoided by h-confidence's cross-support property.

- [SK05]
- Masakazu Seno and George Karypis. Finding frequent itemsets
using length-decreasing support constraint.
*Data Mining and Knowledge Discovery*, 10:197--228, 2005. See Seno and Karypis 2001.

- [Hah06]
- Michael Hahsler. A model-based frequency constraint for mining
associations from transaction data.
*Data Mining and Knowledge Discovery*, 13(2):137--166, September 2006. Develops a novel model-based frequency constraint as an alternative to a single, user-specified minimum support. The constraint utilizes knowledge of the process generating transaction data by applying a simple stochastic mixture model (the NB model) and uses a user-specified precision threshold to find local frequency thresholds for groups of itemsets (NB-frequent itemsets). The new constraint provides improvements over a single minimum support threshold, and the precision threshold is more robust and easier for the user to set and interpret.
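
The three measures described in the [Omi03] entry follow directly from their verbal definitions. The sketch below is a brute-force illustration that enumerates every rule that can be generated from an itemset; the function names and the toy transactions are assumptions, and it presumes that every item of the itemset occurs in at least one transaction (otherwise a division by zero would occur).

```python
from itertools import combinations


def count(transactions, itemset):
    """Number of transactions that contain all items of the itemset."""
    return sum(itemset <= t for t in transactions)


def rule_confidences(transactions, itemset):
    """Confidences of all rules lhs -> (itemset - lhs) for non-empty proper lhs."""
    itemset = frozenset(itemset)
    full = count(transactions, itemset)
    return [
        full / count(transactions, frozenset(lhs))
        for r in range(1, len(itemset))
        for lhs in combinations(itemset, r)
    ]


def any_confidence(transactions, itemset):
    # Largest confidence among the rules generated from the itemset.
    return max(rule_confidences(transactions, itemset))


def all_confidence(transactions, itemset):
    # Smallest confidence among the rules generated from the itemset.
    return min(rule_confidences(transactions, itemset))


def bond(transactions, itemset):
    # Transactions with all items divided by transactions with at least one item.
    itemset = frozenset(itemset)
    at_least_one = sum(1 for t in transactions if itemset & t)
    return count(transactions, itemset) / at_least_one


# Hypothetical toy data.
transactions = [
    frozenset(t)
    for t in ({"a", "b"}, {"a", "b", "c"}, {"a"}, {"c"}, {"a", "c"})
]
```

For the itemset {a, b} in this toy data, the rule b -> a always holds while a -> b holds in half of the transactions containing a, so any-confidence and all-confidence differ.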

## Constraint-based mining

- [KMR+94]
- Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu
Toivonen, and A. Inkeri Verkamo. Finding interesting rules
from large sets of discovered association rules. In Nabil R.
Adam, Bharat K. Bhargava, and Yelena Yesha, editors,
*Third International Conference on Information and Knowledge Management (CIKM'94)*, pages 401--407. ACM Press, 1994. Introduces the use of rule templates.

- [SVA97]
- Ramakrishnan Srikant, Quoc Vu, and Rakesh Agrawal. Mining
association rules with item constraints. In David Heckerman, Heikki
Mannila, Daryl Pregibon, and Ramasamy Uthurusamy, editors,
*Proceedings of the 3rd International Conference Knowledge Discovery and Data Mining (KDD-97)*, pages 67--73. AAAI Press, 1997. Integrates BOOLEAN CONSTRAINTS on items (absence, presence) into the mining algorithm to reduce the search space. Algorithms are discussed.

- [NLHP98]
- Raymond T. Ng, Laks V.S. Lakshmanan, Jiawei Han, and
Alex Pang. Exploratory mining and pruning optimizations of
constrained associations rules. In
*Proceedings of the ACM SIGMOD Conference, Seattle, WA*, pages 13--24, 1998. Characterizes various constraints (contains, minimum, maximum, count, sum, avg) according to anti-monotonicity and succinctness. Anti-monotonicity is the property which allows iterative pruning (generate and test candidates) used e.g., on support by Apriori. Succinctness is a property that enables us to generate only those itemsets which satisfy the constraint without the need to test them.

- [BAG00]
- R. Bayardo, R. Agrawal, and D. Gunopulos.
Constraint-based rule mining in large, dense databases.
*Data Mining and Knowledge Discovery*, 4(2/3):217--240, 2000. Introduces the MINIMUM IMPROVEMENT constraint for confidence (mine only rules with a confidence which is minimp greater than the confidence of any of its proper subset-rules). DenseMiner, an algorithm that enforces minimum support, minimum confidence and minimum improvement already during a breadth-first search for all rules for a given consequent C, is presented.

- [PHL01]
- Jian Pei, Jiawei Han, and Laks V.S. Lakshmanan. Mining
frequent itemsets with convertible constraints. In
*Proceedings of the 17th International Conference on Data Engineering, April 2--6, 2001, Heidelberg, Germany*, pages 433--442, 2001. Develops a technique of how constraints on avg, median and sum can be converted so that they can be used already during the search phase of the FP-growth algorithm. The constraints are classified into constraints that are: convertible anti-monotone, convertible monotone and strongly convertible.

- [WZ05]
- Geoffrey I. Webb and Songmao S. Zhang.
k-optimal-rule-discovery.
*Data Mining and Knowledge Discovery*, 10(1):39--79, 2005. Develops GRD (based on the OPUS search strategy) which discovers all rules satisfying a set of constraints (max. number of rules, min support, min confidence, max coverage, max leverage) in a depth-first search. (An early draft of the paper was called: Beyond association rules: Generalized rule discovery)

- [BGMP05]
- Francesco Bonchi, Fosca Giannotti, Alessio Mazzanti, and Dino
Pedreschi. ExAnte: A preprocessing method for frequent-pattern
mining.
*IEEE Intelligent Systems*, 20(3):25--31, 2005. Reduces the database size before mining by iteratively applying mu-reduction and alpha-reduction. Mu-reduction removes transactions which do not meet monotone constraints. Alpha-reduction removes infrequent items from the transactions.

- [BL06]
- Francesco Bonchi and Claudio Lucchese. On condensed
representations of constrained frequent patterns.
*Knowledge and Information Systems*, 9(2):180--201, 2006. Presents an algorithm to efficiently mine closed and constrained frequent itemsets.
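
The role of anti-monotonicity mentioned in the [NLHP98] entry can be shown with a small sketch. It assumes a hypothetical constraint "total price of the itemset is at most a budget" with non-negative prices, which is anti-monotone: once an itemset violates it, every superset violates it too, so the itemset never needs to be extended. The function name, prices and budget are made up for illustration, and support counting is left out to keep the example short.

```python
def itemsets_within_budget(prices, budget):
    """Enumerate all itemsets whose total price is <= budget, level by level.

    Assumes non-negative prices, which makes the sum constraint anti-monotone.
    """
    items = sorted(prices)
    # Level 1: single items that satisfy the constraint.
    level = [(i,) for i in items if prices[i] <= budget]
    result = list(level)
    while level:
        next_level = []
        for itemset in level:
            # Extend only with lexicographically larger items to avoid duplicates.
            for i in items:
                if i > itemset[-1]:
                    candidate = itemset + (i,)
                    if sum(prices[x] for x in candidate) <= budget:
                        next_level.append(candidate)
        # Violating candidates are dropped and therefore never extended.
        result.extend(next_level)
        level = next_level
    return result


# Hypothetical prices and budget.
prices = {"a": 1, "b": 2, "c": 4}
within_budget = itemsets_within_budget(prices, budget=3)
```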

## Mining sequential, generalized, quantitative or causal rules

- [SA95]
- Ramakrishnan Srikant and Rakesh Agrawal. Mining generalized
association rules. In
*Proceedings of the 21st VLDB Conference, Zurich, Switzerland*, 1995. Generalized association rules use a taxonomy (is-a hierarchy) on items. The paper introduces R-interesting rules as rules with a support which is R-times higher than the support of its closest ancestor (a rule with at least one item generalized). Algorithms that use R-interesting in addition to support and confidence are presented and evaluated.

- [AS95]
- Rakesh Agrawal and Ramakrishnan Srikant. Mining sequential
patterns. In Philip S. Yu and Arbee S. P. Chen, editors,
*Eleventh International Conference on Data Engineering*, pages 3--14, Taipei, Taiwan, 1995. IEEE Computer Society Press. Introduces mining sequential patterns. A sequential pattern is a maximal sequence that exceeds minimum support (a minimum number of customers). The algorithms AprioriSome and AprioriAll (based on Apriori) are presented.

- [FMMT96]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, and
Takeshi Tokuyama. Mining optimized association rules for numeric
attributes. In
*PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems*, pages 182--191. ACM Press, 1996. Finds appropriate ranges for quantitative attributes automatically by maximizing the support on the condition that the confidence ratio is at least a given threshold value or by maximizing the confidence ratio on the condition that the support is at least a given threshold number. The paper also introduces the measure gain: gain(R) = sup(R) - minConf * sup(lhs(R)) = sup(lhs(R)) * (conf(R) - minConf).

- [MTV97]
- Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo.
Discovery of frequent episodes in event sequences.
*Data Mining and Knowledge Discovery*, 1(3):259--289, 1997.

- [LSW97]
- Brian Lent, Arun N. Swami, and Jennifer Widom. Clustering
association rules. In
*Proceedings of the Thirteenth International Conference on Data Engineering, April 7--11, 1997 Birmingham U.K.*, pages 220--231. IEEE Computer Society, 1997. Joins adjacent intervals for quantitative association rules to produce more general rules.

- [SBMU00]
- Craig Silverstein, Sergey Brin, Rajeev Motwani, and
Jeffrey D. Ullman. Scalable techniques for mining causal
structures.
*Data Mining and Knowledge Discovery*, 4(2/3):163--192, 2000. Explores the applicability of constraint-based causal discovery (known from Bayesian learning) to discover causal relationships in market basket data.

- [Ada01]
- Jean-Marc Adamo.
*Data Mining for Association Rules and Sequential Patterns*. Springer, New York, 2001. Introduction to association rules and mining sequential patterns.

- [LNWJ01]
- Yingjiu Li, Peng Ning, X. Sean Wang, and Sushil Jajodia.
Generating market basket data with temporal information. In
*ACM KDD Workshop on Temporal Data Mining*, August 2001. [ bib | find paper in Google Scholar | find paper in Google ]Develop a generator for synthetic data with temporal patterns based on the generator by Agrawal and Srikant (1994).

- [AL03]
- Y. Aumann and Y. Lindell. Statistical theory for
quantitative association rules.
*Journal of Intelligent Information Systems*, 20(3):255--283, 2003. [ bib | find paper in Google Scholar | find paper in Google ]Defines QUANTITATIVE ASSOCIATION RULES using statistical measures (e.g., mean and variance) of continuous data. Also algorithms are discussed.

- [HCXY07]
- J. Han, H. Cheng, D. Xin, and X. Yan.
Frequent pattern mining: Current status and future directions.
*Data Mining and Knowledge Discovery*, 14(1), 2007. [ bib | find paper in Google Scholar | find paper in Google ]Complete overview of the state of the art in frequent pattern mining; also identifies future research directions.

## Concise representations of frequent itemsets (closed, maximal, etc.)

- [MT96]
- Heikki Mannila and Hannu Toivonen. Multiple uses of frequent
sets and condensed representations. In
*Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96)*, pages 189--194. AAAI Press, 1996. [ bib | find paper in Google Scholar | find paper in Google ]Introduces general rules with disjunctions and negations in the antecedent and the consequent. The confidence of any such rules can be approximated by using the support of frequent itemsets only (applying the inclusion-exclusion principle). Using the negative border, an error bound for the estimates can be calculated. The authors also show that frequent itemsets with a support of epsilon are a concise representation (epsilon-adequate representation) which can approximate the confidence of any itemset with an error of at most epsilon.
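
As a worked illustration of the inclusion-exclusion step mentioned in the [MT96] annotation, the identities below show how supports involving a negation or a disjunction reduce to supports of ordinary itemsets. The notation is assumed here and not taken from the paper: sup(X) is the relative support of itemset X, and A ∪ B is the itemset containing the items of both A and B.

$$
\begin{aligned}
\mathrm{sup}(A \wedge \neg B) &= \mathrm{sup}(A) - \mathrm{sup}(A \cup B) \\
\mathrm{sup}(A \vee B) &= \mathrm{sup}(A) + \mathrm{sup}(B) - \mathrm{sup}(A \cup B) \\
\mathrm{conf}(A \Rightarrow \neg B) &= \frac{\mathrm{sup}(A) - \mathrm{sup}(A \cup B)}{\mathrm{sup}(A)} = 1 - \mathrm{conf}(A \Rightarrow B)
\end{aligned}
$$

When some of the itemsets on the right-hand side are not frequent, their supports are unknown and the result becomes an approximation, which is where the error bound comes in.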

- [ZPOL97]
- Mohammed J. Zaki, Srinivasan Parthasarathy, Mitsunori
Ogihara, and Wei Li. New algorithms for fast discovery of
association rules. Technical Report 651, Computer Science
Department, University of Rochester, Rochester, NY 14627, July
1997. [ bib |
find paper in Google Scholar |
find paper in Google ]
Quickly identify MAXIMAL FREQUENT ITEMSETS (a frequent itemset is maximal if it is not a proper subset of any other frequent itemset) using different database layout schemes (regular, inverted) and clustering techniques (equivalence class ECLAT, max. clique). See also Zaki 2000, Scalable Algorithms for Association Mining.

- [PBTL99b]
- Nicolas Pasquier, Yves Bastide, Rafik Taouil, and Lotfi Lakhal.
Efficient mining of association rules using closed itemset
lattices.
*Information Systems*, 24(1):25--46, 1999. [ bib | find paper in Google Scholar | find paper in Google ]Present the CLOSE algorithm to mine frequent closed itemsets.

- [PBTL99a]
- Nicolas Pasquier, Yves Bastide, Rafik Taouil, and Lotfi Lakhal.
Discovering frequent closed itemsets for association rules. In
*Proceeding of the 7th International Conference on Database Theory, Lecture Notes In Computer Science (LNCS 1540)*, pages 398--416. Springer, 1999. [ bib | find paper in Google Scholar | find paper in Google ]Introduces CLOSED ITEMSETS. An itemset X is closed if no proper superset of X is contained in every transaction in which X is contained. Equivalently, there exists no proper superset of X with the same support count as X.
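
The definitions of closed itemsets (this entry) and maximal frequent itemsets (the [ZPOL97] entry above) can be illustrated with a brute-force sketch. This is not one of the algorithms from the cited papers: the toy transactions, the threshold `min_count`, and the helper `count` are assumptions for illustration, and full enumeration is only feasible for tiny item universes.

```python
from itertools import combinations

# Hypothetical toy database and minimum support count.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a"},
]
min_count = 2

def count(itemset, db):
    """Support count: number of transactions containing the itemset."""
    return sum(1 for t in db if itemset <= t)

# Enumerate every non-empty itemset and keep the frequent ones.
items = sorted(set().union(*transactions))
frequent = {}
for k in range(1, len(items) + 1):
    for combo in combinations(items, k):
        s = frozenset(combo)
        c = count(s, transactions)
        if c >= min_count:
            frequent[s] = c

# Closed: no proper superset has the same support count.
# (A superset with equal count is itself frequent, so checking
# only frequent supersets is sufficient here.)
closed = {x for x, c in frequent.items()
          if not any(x < y and frequent[y] == c for y in frequent)}

# Maximal: no proper superset is frequent at all.
maximal = {x for x in frequent if not any(x < y for y in frequent)}
```

In this example all seven non-empty itemsets are frequent, three of them are closed, and only one is maximal, which shows how much smaller the concise representations can be.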

- [PHM00]
- Jian Pei, Jiawei Han, and Runying Mao. CLOSET: An efficient
algorithm for mining frequent closed itemsets. In
*ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery*, 2000. [ bib | find paper in Google Scholar | find paper in Google ]Introduces the algorithm CLOSET which mines frequent closed itemsets using FP-growth (a depth-first search using support counting).

- [BTP+00]
- Y. Bastide, R. Taouil, N. Pasquier,
G. Stumme, and L. Lakhal. Mining frequent patterns with
counting inference.
*SIGKDD Explorations*, 2(2):66--75, 2000. [ bib | find paper in Google Scholar | find paper in Google ]Proposes the algorithm PASCAL (an APRIORI optimization) to mine frequent and closed itemsets. This approach uses frequent key-patterns to infer counts of frequent non-key patterns.

- [AAP00]
- Rakesh C. Agrawal, Charu C. Aggarwal, and
V. V. V. Prasad. Depth first generation of long patterns.
In
*Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000)*, pages 108--118, 2000. [ bib | find paper in Google Scholar | find paper in Google ]Introduces the algorithm DepthProject which builds a lexicographic tree in a depth first order.

- [BCG01]
- Douglas Burdick, Manuel Calimlim, and Johannes Gehrke. MAFIA: A
maximal frequent itemset algorithm for transactional databases. In
*Proceedings of the 17th International Conference on Data Engineering*, pages 443--452, Washington, DC, 2001. IEEE Computer Society. [ bib | find paper in Google Scholar | find paper in Google ]MAFIA (MAximal Frequent Itemset Algorithm) finds maximal itemsets using a depth-first traversal of the itemset lattice, a compressed vertical bitmap representation of the database, additional pruning techniques (Parent Equivalence Pruning, Frequent Head Union Tail pruning) and dynamic reordering. The authors claim that MAFIA outperforms DepthProject (Agrawal et al., 2000) by a factor of 3 to 5 on average.

- [ZH02]
- Mohammed J. Zaki and Ching-Jui Hsiao. CHARM: An efficient
algorithm for closed itemset mining. In
*Proceedings of the Second SIAM International Conference on Data Mining*, Arlington, VA, 2002. SIAM. [ bib | find paper in Google Scholar | find paper in Google ]The algorithm CHARM enumerates all frequent closed itemsets and uses a number of improvements: (a) It uses an IT-tree (itemset-tidset tree based on equivalence classes) to search the itemset space and the transaction space simultaneously. (b) It uses a fast hash-based elimination of non-closed itemsets. (c) It uses diffsets, which represent the database in a compact way that should fit into main memory. (d) It uses efficient intersection operations. The performance testing shows that CHARM can provide significant improvement over algorithms such as Apriori, Close, Pascal, Mafia, and Closet.

- [CG02]
- Toon Calders and Bart Goethals. Mining all non-derivable
frequent itemsets. In Tapio Elomaa, Heikki Mannila, and Hannu
Toivonen, editors,
*Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery*, volume 2431 of *Lecture Notes in Computer Science*, pages 74--85. Springer-Verlag, 2002. [ bib | find paper in Google Scholar | find paper in Google ]Introduce NON-DERIVABLE ITEMSETS (NDIs). The support of all frequent NDIs allows for computing the support of all frequent itemsets using deduction rules based on the inclusion-exclusion principle.

- [BBR03]
- Jean-Francois Boulicaut, Artur Bykowski, and Christophe
Rigotti. Free-sets: A condensed representation of boolean data for
the approximation of frequency queries.
*Data Mining and Knowledge Discovery*, 7(1):5--22, 2003. [ bib | find paper in Google Scholar | find paper in Google ]Presents a new epsilon-adequate representation for frequent itemsets called frequent FREE-SETS. An itemset is a free-set if it has no subset with (almost) the same support, and thus the items in the itemset cannot be used to form a (nearly) exact rule.

- [Zak04]
- Mohammed Zaki. Mining non-redundant association rules.
*Data Mining and Knowledge Discovery*, 9:223--248, 2004. [ bib | find paper in Google Scholar | find paper in Google ]Compares frequent itemsets and frequent closed itemsets and shows that frequent closed itemsets can be used to generate NON-REDUNDANT association rules. Non-redundant rules are a set of rules with the most general rules (smallest antecedent and consequent) without loss of information.

- [ZH05]
- Mohammed Zaki and Ching-Jui Hsiao. Efficient algorithms for
mining closed itemsets and their lattice structure.
*IEEE Transactions on Knowledge and Data Engineering*, 17(4):462--478, 2005. [ bib | find paper in Google Scholar | find paper in Google ]Describes the algorithm CHARM.

- [GZ05]
- Karam Gouda and Mohammed J. Zaki. GenMax: An efficient
algorithm for mining maximal frequent itemsets.
*Data Mining and Knowledge Discovery*, 11:1--20, 2005. [ bib | find paper in Google Scholar | find paper in Google ]Presents a backtrack search based algorithm for mining maximal frequent itemsets. Uses: progressive focusing for maximality checking, and diffset propagation for frequency computation.

- [BL06]
- Francesco Bonchi and Claudio Lucchese. On condensed
representations of constrained frequent patterns.
*Knowledge and Information Systems*, 9(2):180--201, 2006. [ bib | find paper in Google Scholar | find paper in Google ]Presents an algorithm to efficiently mine closed and constrained frequent itemsets.

- [CRB06]
- Toon Calders, Christophe Rigotti, and Jean-Francois Boulicaut.
A survey on condensed representations for frequent sets. In
Jean-Francois Boulicaut, Luc Raedt, and Heikki Mannila, editors,
*Constraint-Based Mining and Inductive Databases: European Workshop on Inductive Databases and Constraint Based Mining, Hinterzarten, Germany, March 11-13, 2004, Revised Selected Papers*, volume 3848 of *Lecture Notes in Computer Science*, pages 64--80, February 2006. [ bib | find paper in Google Scholar | find paper in Google ]

- [HCXY07]
- J. Han, H. Cheng, D. Xin, and X. Yan.
Frequent pattern mining: Current status and future directions.
*Data Mining and Knowledge Discovery*, 14(1), 2007. [ bib | find paper in Google Scholar | find paper in Google ]Complete overview of the state of the art in frequent pattern mining; also identifies future research directions.

## Using association rules for classification

- [LHM98]
- Bing Liu, Wynne Hsu, and Yiming Ma. Integrating classification
and association rule mining. In
*Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD-98)*, pages 80--86. AAAI Press, 1998. [ bib | find paper in Google Scholar | find paper in Google ]Mines only the subset of association rules with the classification class attribute in the right-hand side (CARs). From these CARs a classifier is built by using the rules with the highest confidence to cover the whole database. The presented algorithm is called Classification Based on Associations (CBA). In experiments the resulting classifiers are more accurate than C4.5.

- [Fre00]
- Alex A. Freitas. Understanding the crucial differences
between classification and discovery of association rules -- a
position paper.
*SIGKDD Explorations*, 2(1):65--69, 2000. [ bib | find paper in Google Scholar | find paper in Google ]

- [Li06]
- Jiuyong Li. On optimal rule discovery.
*IEEE Transactions on Knowledge and Data Engineering*, 18(4):460--471, 2006. [ bib | find paper in Google Scholar | find paper in Google ]An optimal rule set (with respect to a metric of interestingness) contains all rules except those with no greater interestingness than one of its more general rules. An optimal rule set is a subset of a nonredundant rule set. The authors present an algorithm called ORD to find an optimal rule set. Classifiers built on optimal class association rules are at least as accurate as those built from CBA and C4.5 rules.

- [JHZ10]
- Mojdeh Jalali-Heravi and Osmar R. Zaïane. A study on
interestingness measures for associative classifiers. In
*Proceedings of the 2010 ACM Symposium on Applied Computing*, SAC '10, pages 1039--1046. ACM, 2010. [ bib | find paper in Google Scholar | find paper in Google ]Compares associative classifiers using 53 different objective measures for association rules.

## Evolution of association rules over time

- [SKK01]
- Hee Seok Song, Soung Hie Kim, and Jae Kyeong
Kim. A methodology for detecting the change of customer behavior
based on association rule mining. In
*Proceedings of the Pacific Asia Conference on Information System*, pages 871--885. PACIS, 2001. [ bib | find paper in Google Scholar | find paper in Google ]Develops a methodology to detect changes of customer behavior automatically by comparing association rules between different time snapshots of data. Defines emerging pattern, unexpected change and the added/perished rule based on similarity and difference measures for rule matching.

## Theoretic considerations, sampling and clustering

- [MTV94]
- Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo.
Efficient algorithms for discovering association rules. In
Usama M. Fayyad and Ramasamy Uthurusamy, editors,
*AAAI Workshop on Knowledge Discovery in Databases (KDD-94)*, pages 181--192, Seattle, Washington, 1994. AAAI Press. [ bib | find paper in Google Scholar | find paper in Google ]Develop similar improvements to the candidate generation as APRIORI. Itemsets with sufficient support are called covering sets. The paper also introduces sampling from the database and gives bounds for the resulting estimate of support.

- [Toi96]
- Hannu Toivonen. Sampling large databases for association rules.
In
*VLDB '96: Proceedings of the 22nd International Conference on Very Large Data Bases*, pages 134--145, San Francisco, CA, USA, 1996. Morgan Kaufmann Publishers Inc. [ bib | find paper in Google Scholar | find paper in Google ]Find frequent itemsets in a random sample of a database (that fits into main memory) and then verify the found frequent itemsets in the database.
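
The sample-then-verify idea in this annotation can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm from the paper: the function names, the `slack` parameter (a lowered threshold on the sample to reduce misses), the brute-force miner, and the synthetic data are all hypothetical, and the paper's handling of itemsets missed by the sample is omitted.

```python
import random
from itertools import combinations

def rel_support(itemset, db):
    """Fraction of transactions in db that contain the itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)

def frequent_itemsets(db, min_support, max_size=3):
    """Brute-force enumeration up to max_size; for toy data only."""
    items = sorted(set().union(*db))
    return {frozenset(combo)
            for k in range(1, max_size + 1)
            for combo in combinations(items, k)
            if rel_support(frozenset(combo), db) >= min_support}

def sample_then_verify(db, min_support, sample_size, slack, rng):
    """Mine a random sample, then confirm candidates on the full database."""
    sample = rng.sample(db, sample_size)
    # Lower the threshold on the sample so fewer true itemsets are missed.
    candidates = frequent_itemsets(sample, max(min_support - slack, 0.0))
    # One pass over the full database keeps only truly frequent itemsets.
    return {s for s in candidates if rel_support(s, db) >= min_support}

# Hypothetical synthetic database with a fixed seed for repeatability.
rng = random.Random(0)
db = [frozenset(rng.sample(list("abcde"), rng.randint(1, 4)))
      for _ in range(200)]

found = sample_then_verify(db, 0.3, 50, 0.05, rng)
exact = frequent_itemsets(db, 0.3)
```

Because every candidate is verified against the full database, `found` never contains a false positive; it can only miss itemsets that did not appear frequent in the sample.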

- [ZPLO97]
- Mohammed Javeed Zaki, Srinivasan Parthasarathy, Wei Li,
and Mitsunori Ogihara. Evaluation of sampling for data mining of
association rules. In
*Proceedings of the 7th International Workshop on Research Issues in Data Engineering (RIDE '97) High Performance Database Management for Large-Scale Applications*, pages 42--50. IEEE Computer Society, 1997. [ bib | find paper in Google Scholar | find paper in Google ]Evaluates random sampling with replacement as presented in Mannila et al. 1994 using several datasets. The experiments show that Chernoff bounds overestimate the needed sample size and that sampling seems an effective tool for practical purposes.

- [LSW97]
- Brian Lent, Arun N. Swami, and Jennifer Widom. Clustering
association rules. In
*Proceedings of the Thirteenth International Conference on Data Engineering, April 7--11, 1997 Birmingham U.K.*, pages 220--231. IEEE Computer Society, 1997. [ bib | find paper in Google Scholar | find paper in Google ]Join adjacent intervals for quantitative association rules to produce more general rules.

- [ZO98]
- M. J. Zaki and M. Ogihara. Theoretical foundation of
association rules. In
*SIGMOD'98 Workshop on Research Issues in Data Mining and Knowledge Discovery (SIGMOD-DMKD'98), Seattle, Friday, June 5, 1998*, 1998. [ bib | find paper in Google Scholar | find paper in Google ]Presents the lattice-theoretic foundations of mining associations based on FORMAL CONCEPT ANALYSIS and shows that frequent itemsets are determined by the set of frequent concepts. The paper studies the generation of a minimal set of rules (called a base) from which all other association rules can be inferred. The paper also presents some complexity considerations using the connection between frequent itemsets and maximal bipartite cliques. It is shown that for very sparse databases association rule algorithms should scale linearly in the number of items.

- [MS98]
- Nimrod Megiddo and Ramakrishnan Srikant. Discovering predictive
association rules. In Rakesh Agrawal, Paul E. Stolorz, and
Gregory Piatetsky-Shapiro, editors,
*Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98)*, pages 274--278. AAAI Press, 1998. [ bib | find paper in Google Scholar | find paper in Google ]Introduces several STATISTICAL TESTS: a test of whether the observed support count is significantly greater than a support threshold, and a chi-square test of independence (see also Brin et al. 1997). Also deals with the Bonferroni effect (multiple-comparison problem) by finding an upper bound of the number of tested hypotheses and proposing a resampling procedure using an independence model. The paper introduces confidence intervals for support and confidence. Finally, the authors find that the support-confidence framework does a good job of eliminating statistically insignificant rules (on market basket data).

- [LHM99]
- Bing Liu, Wynne Hsu, and Yiming Ma. Pruning and summarizing the
discovered associations. In
*Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD-99)*, pages 125--134. ACM Press, 1999. [ bib | find paper in Google Scholar | find paper in Google ]Remove insignificant rules using the chi-square test to test for correlation between the antecedent and the consequent of a rule. Also DIRECTION SETTING (DS) RULES are introduced. A DS rule has a positively correlated antecedent and consequent and is not built from a rule with a shorter antecedent which is a DS rule. Normally, only a small and concise fraction of rules are DS rules.
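
The chi-square test used for pruning in this annotation can be sketched for a single rule X => Y. This is a generic Pearson chi-square computation on a 2x2 contingency table, not the pruning procedure from the paper: the function name, its arguments, and the example counts are assumptions for illustration.

```python
# Critical value of the chi-square distribution with 1 degree of
# freedom at the 5% significance level.
CRITICAL_95 = 3.841

def chi_square_2x2(n, n_x, n_y, n_xy):
    """Pearson chi-square statistic for independence of X and Y.

    n    -- number of transactions
    n_x  -- transactions containing the antecedent X
    n_y  -- transactions containing the consequent Y
    n_xy -- transactions containing both X and Y
    Assumes 0 < n_x < n and 0 < n_y < n so that no expected count is zero.
    """
    observed = [
        [n_xy, n_x - n_xy],                  # X present: Y, not Y
        [n_y - n_xy, n - n_x - n_y + n_xy],  # X absent:  Y, not Y
    ]
    row = [n_x, n - n_x]
    col = [n_y, n - n_y]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n   # count under independence
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

# Hypothetical counts: X and Y co-occur more often than expected.
stat_correlated = chi_square_2x2(n=100, n_x=40, n_y=50, n_xy=30)
# Hypothetical counts: co-occurrence exactly as expected under independence.
stat_independent = chi_square_2x2(n=100, n_x=40, n_y=50, n_xy=20)

keep_rule = stat_correlated > CRITICAL_95
```

The statistic alone does not give the direction of the correlation; comparing the observed co-occurrence count with its expected value tells whether it is positive or negative.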

- [BA99]
- Robert J. Bayardo Jr. and Rakesh Agrawal. Mining the most
interesting rules. In
Shows that for all rules with the same antecedent, the best (optimal, most interesting) rules according to measures such as confidence, support, gain, chi-square value, gini, entropy gain, laplace, lift, and conviction all must reside along a support/confidence border. The paper also shows that many measures are monotone functions of support and confidence.

- [DP01]
- William DuMouchel and Daryl Pregibon. Empirical Bayes screening
for multi-item associations. In F. Provost and
R. Srikant, editors,
*Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (KDD-01)*, pages 67--76. ACM Press, 2001. [ bib | find paper in Google Scholar | find paper in Google ]Search for unusually frequent itemsets using statistical methods. First, the authors propose stratification of the data to avoid finding spurious associations within strata. Then the deviation of the observed frequency over a baseline frequency (based on independence) is used. Since the deviation is unreliable for low counts, an empirical Bayes model (its 95% confidence limit) is used to produce a posterior distribution of the true ratio of actual to baseline frequencies. The Bayes model gives ratios close to the observed ratios for large samples and reduces (shrinks) the ratio if the sample size gets small (to smooth away noise). For multi-item associations log-linear models are proposed to find higher order associations which cannot be explained by pairwise associations.

- [BP01]
- Stephen D. Bay and Michael J. Pazzani. Detecting
group differences: Mining contrast sets.
*Data Mining and Knowledge Discovery*, 5(3):213--246, 2001. [ bib | find paper in Google Scholar | find paper in Google ]Finds sets with substantially different support in different groups. Uses interest based pruning and statistical surprise for filtering (summarizing) contrast sets. The search error is controlled using different (Bonferroni) correction for sets of different size.

- [STB+02]
- Gerd Stumme, Rafik Taouil, Yves Bastide, Nicolas Pasquier, and
Lotfi Lakhal. Computing iceberg concept lattices with titanic.
*Data & Knowledge Engineering*, 42(2):189--222, 2002. [ bib | find paper in Google Scholar | find paper in Google ]The paper shows how iceberg concept lattices can be used as a condensed method to represent and visualize frequent (closed) itemsets. Iceberg concept lattices only show the top-most part of a concept lattice (known from Formal Concept Analysis). To compute iceberg concept lattices the algorithm TITANIC is presented which computes closed sets (a closure system) in a level-wise approach using weights (e.g., support), equivalence classes and key sets (minimal sets in an equivalence class). TITANIC is compared experimentally to Next-Closure and performs better. PASCAL (Bastide et al. 2000) is a modified version of TITANIC to mine all frequent itemsets.

- [APY02]
- Charu C. Aggarwal, Cecilia Magdalena Procopiuc, and
Philip S. Yu. Finding localized associations in market basket
data.
*Knowledge and Data Engineering*, 14(1):51--62, 2002. [ bib | find paper in Google Scholar | find paper in Google ]Proposes to cluster transactions using a similarity measure based on the new affinity measure (measures similarity between pairs of items). Then mine association rules in the identified clusters.

- [SLTN03]
- Sam Y. Sung, Zhao Li, Chew L. Tan, and Peter A.
Ng. Forecasting association rules using existing data sets.
*IEEE Transactions on Knowledge and Data Engineering*, 15(6):1448--1459, Nov/Dec 2003. [ bib | find paper in Google Scholar | find paper in Google ]Resample datasets proportional to background attributes (e.g., distribution of customers' sex) to forecast rules in a new situation (e.g., a new store at a new location).

- [RMZ03]
- Ganesh Ramesh, William A. Maniatty, and Mohammed J.
Zaki. Feasible itemset distributions in data mining: theory and
application. In
*Symposium on Principles of Database Systems, PODS 2003*, San Diego, CA, USA, 2003. ACM Press. [ bib | find paper in Google Scholar | find paper in Google ]Studies the length distributions of frequent and frequent maximal itemsets (the number of frequent itemsets with the same length). The length distribution determines an algorithm's performance and is important for generating realistic synthetic datasets.

- [PMS03]
- Dmitry Pavlov, Heikki Mannila, and Padhraic Smyth. Beyond
independence: Probabilistic models for query approximation on
binary transaction data.
*IEEE Transactions on Knowledge and Data Engineering*, 15(6):1409--1421, 2003. [ bib | find paper in Google Scholar | find paper in Google ]Investigates the use of probabilistic models (independence model, pair-wise interactions stored in a Chow-Liu Tree, mixtures of independence models, itemset inclusion-exclusion model, and the maximum entropy method) for the problem of generating fast approximate answers to queries for large sparse binary data sets.

- [HSM03]
- Jaakko Hollmén, Jouni K. Seppänen, and Heikki Mannila.
Mixture models and frequent sets: Combining global and local
methods for 0--1 data. In
*SIAM International Conference on Data Mining (SDM'03)*, San Francisco, May 2003. [ bib | find paper in Google Scholar | find paper in Google ]Clusters binary data first using the EM algorithm (looks like LCA; Cadez et al. (2001) seem to do the same to find profiles). Then the authors mine frequent itemsets in each cluster. Finally, they use the maximum entropy technique to obtain local models from the frequent itemsets and combine these models to approximate the joint distribution.

- [Yan04]
- Guizhen Yang. The complexity of mining maximal frequent
itemsets and maximal frequent patterns. In
*Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining*, Seattle, WA, USA, 2004. ACM Press. [ bib | find paper in Google Scholar | find paper in Google ]Shows that enumerating all maximal frequent itemsets is NP-hard and the associated counting problem is #P-complete.

- [Sch05]
- Tobias Scheffer. Finding association rules that trade support
optimally against confidence.
*Intelligent Data Analysis*, 9(4):381--395, 2005. [ bib | find paper in Google Scholar | find paper in Google ]Introduces predictive accuracy which is the expected value of the confidence of a rule with respect to the process underlying the database. The author shows how predictive accuracy can be calculated from confidence and support measured on a data set using a Bayesian frequency correction (very simplified: confidence is discounted for rules with low support). Also an algorithm is presented which finds the top n most predictive association rules (redundant rules with a 0 predictive accuracy improvement are removed) and shows how to estimate the prior distribution needed for the correction.

- [Web06]
- Geoffrey I. Webb. Discovering significant rules. In
*KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining*, pages 434--443, New York, NY, USA, 2006. ACM Press. [ bib | DOI | find paper in Google Scholar | find paper in Google ]Compares two approaches (the well-known Bonferroni adjustment and a new evaluation using holdout data) to control the experimentwise risk of false discoveries for statistical hypothesis tests. Experimental results indicate that neither of the two approaches dominates the other.

## Evaluation and efficient implementation of rule mining algorithms

- [KP88]
- Ron Kohavi and Foster Provost. Glossary of terms.
*Machine Learning*, 30(2--3):271--274, 1988. [ bib | find paper in Google Scholar | find paper in Google ]Definition of measures from Machine Learning. These are especially interesting for comparison and evaluation of association rule algorithms.

- [AIS93]
- Rakesh Agrawal, Tomasz Imielinski, and Arun Swami. Database
mining: A performance perspective.
*IEEE Transactions on Knowledge and Data Engineering*, 5(6):914--925, 1993. [ bib | find paper in Google Scholar | find paper in Google ]Places association rule mining together with classification and sequence mining into the context of rule discovery in database mining. The authors describe basic operations and an algorithm to discover classification rules. For the evaluation they generate artificial survey data using different classification functions.

- [AS94]
- Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for
mining association rules in large databases. In Jorge B.
Bocca, Matthias Jarke, and Carlo Zaniolo, editors,
*Proceedings of the 20th International Conference on Very Large Data Bases, VLDB*, pages 487--499, Santiago, Chile, September 1994. [ bib | find paper in Google Scholar | find paper in Google ]Introduction of the APRIORI algorithm (the best-known algorithm; it uses a breadth-first search strategy to count the support of itemsets). The algorithm uses an improved candidate generation function which exploits the downward closure property of support and makes it more efficient than AIS. Also an algorithm to generate synthetic transaction data is presented. Such synthetic transaction data are widely used for the evaluation and comparison of new algorithms.
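
The breadth-first search and the use of downward closure described in this annotation can be sketched compactly. This is a minimal illustration rather than the implementation from the paper: the function name `apriori`, the use of absolute support counts, and the toy database are assumptions, and the join step simply unites pairs of frequent (k-1)-itemsets instead of using the paper's prefix-based join.

```python
from itertools import combinations

def apriori(db, min_count):
    """Level-wise (breadth-first) frequent itemset mining.

    Returns a dict mapping each frequent itemset (frozenset) to its
    support count.
    """
    # Level 1: count single items in one pass.
    counts = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (downward closure): a k-itemset can only be frequent
        # if all of its (k-1)-subsets are frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev
                             for s in combinations(c, k - 1))}
        # Count step: one pass over the database for this level.
        counts = {c: sum(1 for t in db if c <= t) for c in candidates}
        level = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(level)
        k += 1
    return frequent

# Hypothetical toy database.
db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a"}]
result_low = apriori(db, min_count=2)
result_high = apriori(db, min_count=3)
```

The prune step is what keeps the candidate set small: with `min_count=3` the item c is dropped at level 1, so no candidate containing c is ever counted.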

- [KBF+00]
- Ron Kohavi, Carla Brodley, Brian Frasca, Llew Mason, and Zijian
Zheng. KDD-Cup 2000 organizers' report: Peeling the onion.
*SIGKDD Explorations*, 2(2):86--98, 2000. [ bib | find paper in Google Scholar | find paper in Google ]Also introduces some freely available data sets for algorithm performance evaluation.

- [ZKM01]
- Zijian Zheng, Ron Kohavi, and Llew Mason. Real world
performance of association rule algorithms. In F. Provost and
R. Srikant, editors,
*Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (KDD-01)*, pages 401--406. ACM Press, 2001. [ bib | find paper in Google Scholar | find paper in Google ]Compares the performance of association rule algorithms (APRIORI, CHARM, FP-growth, CLOSET, MagnumOpus) using one IBM-Artificial dataset and three real-world e-commerce datasets. It shows that some improvements demonstrated on artificial datasets do not carry over to real-world datasets.

- [CSM01]
- Igor V. Cadez, Padhraic Smyth, and Heikki Mannila.
Probabilistic modeling of transaction data with applications to
profiling, visualization, and prediction. In F. Provost and
R. Srikant, editors,
The authors construct a model (profile with weights) for each individual's behavior as a mixture of several components. This mixture provides the probabilities for a multinomial probability model (each item has a constant probability to be chosen for a transaction). Finally, the authors compare several estimation methods and model variants empirically using store choice data.

- [LNWJ01]
- Yingjiu Li, Peng Ning, X. Sean Wang, and Sushil Jajodia.
Generating market basket data with temporal information. In
*ACM KDD Workshop on Temporal Data Mining*, August 2001. [ bib | find paper in Google Scholar | find paper in Google ]Develop a generator for synthetic data with temporal patterns based on the generator by Agrawal and Srikant (1994).

- [BK02]
- Christian Borgelt and Rudolf Kruse. Induction of association
rules: Apriori implementation. In
*15th Conference on Computational Statistics (Compstat 2002)*, Heidelberg, Germany, 2002. Physica Verlag. [ bib | find paper in Google Scholar | find paper in Google ]An efficient implementation of APRIORI.

- [OLP+03]
- Salvatore Orlando, Claudio Lucchese, Paolo Palmerini, Raffaele
Perego, and Fabrizio Silvestri. kdci: a multi-strategy algorithm
for mining frequent sets. In Bart Goethals and Mohammed J.
Zaki, editors,
*FIMI'03: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations*, November 2003. [ bib | find paper in Google Scholar | find paper in Google ]Introduces the kDCI algorithm.

- [Bor03]
- Christian Borgelt. Efficient implementations of apriori and
eclat. In Bart Goethals and Mohammed J. Zaki, editors,
*Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations*, Melbourne, FL, USA, November 2003. [ bib | find paper in Google Scholar | find paper in Google ]Discusses the efficient implementation of APRIORI (with prefix tree) and ECLAT.

- [GZ04]
- Bart Goethals and Mohammed J. Zaki. Advances in frequent
itemset mining implementations: Report on FIMI'03.
*SIGKDD Explorations*, 6(1):109--117, 2004. [ bib | find paper in Google Scholar | find paper in Google ]This paper reports on the performance of different frequent itemset mining implementations on several real-world and artificial databases. The authors conclude that the latest algorithms (patricia, kdci, fpclose, fpmax*) outperform older ones but that currently no tested algorithm gracefully scales up to very large databases with millions of transactions.

- [CLA04]
- Frans Coenen, Paul Leng, and Shakil Ahmed. Data structures for
association rule mining: T-trees and P-trees.
*IEEE Transactions on Knowledge and Data Engineering*, 16(6):774--778, 2004. [ bib | find paper in Google Scholar | find paper in Google ]Describes two new structures for association rule mining: T-trees (total support trees) and P-trees (partial support trees). The T-tree is a data structure (a compressed set enumeration tree) to store itemsets. The P-tree is a compressed way to represent a database in memory for mining.

- [JSL+05]
- Daniel R. Jeske, Behrokh Samadi, Pengyue J. Lin, Lan
Ye, Sean Cox, Rui Xiao, Ted Younglove, Minh Ly, Douglas Holt, and
Ryan Rich. Generation of synthetic data sets for evaluating the
accuracy of knowledge discovery systems. In
*Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining*, pages 756--762, New York, NY, USA, 2005. ACM Press. [ bib | DOI | find paper in Google Scholar | find paper in Google ]Generate synthetic data (e.g., credit card transaction data) for accuracy evaluation using semantic graphs.

- [HH07]
- Michael Hahsler and Kurt Hornik. New probabilistic interest
measures for association rules.
*Intelligent Data Analysis*, 11(5):437--455, 2007. [ bib | find paper in Google Scholar | find paper in Google ]Presents a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. Uses such data and a real-world grocery database to explore the behavior of confidence and lift, two popular interest measures used for rule mining. Also introduces the new probabilistic measures hyper-lift and hyper-confidence.