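Q.9 Explain dynamic quantitative association rule mining for multidimensional association rules.
Ans. Quantitative association rules are multidimensional association rules in which the numeric attributes are dynamically discretized during the mining process. Here we focus on rules with two quantitative attributes on the left-hand side and one categorical attribute on the right-hand side, that is, rules of the form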
Aquan1 ^ Aquan2 → Acat
where Aquan1 and Aquan2 are tests on quantitative attribute intervals, and Acat tests a categorical attribute from the task-relevant data. Such rules have been referred to as two-dimensional quantitative association rules.
Let's look at an approach used in a system called ARCS (Association Rule Clustering System). This approach maps pairs of quantitative attributes onto a 2-D grid for tuples satisfying a given categorical attribute condition. The grid is then searched for clusters of points, from which the association rules are generated. The following steps are involved in ARCS.
Binning – Just think about how big a 2-D grid would be if we plotted age and income as axes, where each possible value of age was assigned a unique position on one axis and, similarly, each possible value of income was assigned a unique position on the other axis! To keep grids down to a manageable size, we instead partition the ranges of the quantitative attributes into intervals. These intervals are dynamic in that they may later be further combined during the mining process. The partitioning process is referred to as binning.
Three common binning strategies are as follows –
(i) Equal-width Binning – where the interval size of each bin is the same.
(ii) Equal-frequency Binning – where each bin has approximately the same number of tuples assigned to it.
(iii) Clustering-based Binning – where clustering is performed on the quantitative attribute to group neighboring points into the same bin.
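The first two strategies can be illustrated with a minimal sketch. The function names and the sample ages below are hypothetical, chosen only for illustration; they are not part of ARCS.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins whose intervals have the same size."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)  # all values identical: a single bin
    # The maximum value would land at index k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly the same number of tuples."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # Split the sorted ranks into k consecutive chunks of near-equal size.
        bins[i] = rank * k // len(values)
    return bins


ages = [22, 25, 31, 34, 35, 38, 47, 52, 61]  # made-up sample data
width_bins = equal_width_bins(ages, 3)
freq_bins = equal_frequency_bins(ages, 3)
print(width_bins)
print(freq_bins)
```

With equal-width binning the bins can hold very different numbers of tuples when the data are skewed, whereas equal-frequency binning evens out the counts at the cost of unequal interval sizes.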
Finding Frequent Predicate Sets – Once the 2-D array containing the count distribution for each category is set up, it can be scanned to find the frequent predicate sets (those satisfying minimum support) that also satisfy minimum confidence. Strong association rules can then be generated from these predicate sets, using a rule generation algorithm.
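The scan described above can be sketched roughly as follows. This is an illustration under assumptions, not the ARCS implementation: the function name strong_cells, the thresholds, and the sample tuples are all hypothetical. Each tuple is assumed to be already binned as (age bin, income bin, category).

```python
from collections import Counter


def strong_cells(tuples, target, min_support, min_confidence):
    """Return the grid cells (age_bin, income_bin) whose rule 'cell -> target' is strong."""
    cell_total = Counter()   # number of tuples in each grid cell, any category
    cell_target = Counter()  # number of tuples in each grid cell with the target category
    for age_bin, income_bin, category in tuples:
        cell_total[(age_bin, income_bin)] += 1
        if category == target:
            cell_target[(age_bin, income_bin)] += 1

    n = len(tuples)
    strong = []
    for cell, hits in cell_target.items():
        support = hits / n                    # fraction of all tuples in cell with target
        confidence = hits / cell_total[cell]  # fraction of the cell's tuples with target
        if support >= min_support and confidence >= min_confidence:
            strong.append(cell)
    return sorted(strong)


# Made-up, pre-binned sample data.
data = [
    (34, "31K...40K", "HDTV"), (34, "31K...40K", "HDTV"),
    (35, "31K...40K", "HDTV"), (35, "31K...40K", "laptop"),
    (34, "41K...50K", "HDTV"),
    (36, "41K...50K", "laptop"), (36, "41K...50K", "laptop"),
    (36, "41K...50K", "HDTV"),
]
cells = strong_cells(data, "HDTV", min_support=0.125, min_confidence=0.5)
print(cells)
```

In this sample the cell (36, "41K...50K") meets the support threshold but is dropped because only one of its three tuples buys an HDTV, so its confidence falls below 0.5.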
Clustering the Association Rules – An example of a 2-D quantitative association rule is
age(X, "30...39") ^ income(X, "42K...48K") → buys(X, "HDTV")
Fig. b shows a 2-D grid for 2-D quantitative association rules predicting the condition buys(X, "HDTV") on the rule right-hand side, given the quantitative attributes age and income. The four Xs on the grid correspond to the following rules:
age(X, 34) ^ income(X, "31K...40K") → buys(X, "HDTV")
age(X, 35) ^ income(X, "31K...40K") → buys(X, "HDTV")
age(X, 34) ^ income(X, "41K...50K") → buys(X, "HDTV")
age(X, 35) ^ income(X, "41K...50K") → buys(X, "HDTV")
These rules are quite "close" to one another, forming a rule cluster on the grid. The four rules can be combined, or "clustered", to form the following simpler rule, which subsumes and replaces the four rules above –
age(X, "34...35") ^ income(X, "31K...50K") → buys(X, "HDTV")
ARCS employs a clustering algorithm for this purpose. The algorithm scans the grid, searching for rectangular clusters of rules. In this way, bins of the quantitative attributes occurring within a rule cluster may be further combined, and hence further dynamic discretization of the quantitative attributes occurs.
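To make the idea of merging adjacent grid cells concrete, here is a simple greedy sketch. It is not the clustering algorithm ARCS actually uses; it only shows how a filled rectangle of cells collapses into one rule. The function names, the use of integer indices for income bins, and the income_ranges mapping are assumptions made for illustration.

```python
def greedy_rectangles(cells):
    """Cover a set of (x, y) grid cells with fully filled axis-aligned rectangles.

    Returns a list of (x_min, x_max, y_min, y_max) tuples, bounds inclusive.
    """
    remaining = set(cells)
    rectangles = []
    while remaining:
        x0, y0 = min(remaining)  # start from the smallest remaining cell
        # Grow along y while the next cell in the starting column is present.
        y1 = y0
        while (x0, y1 + 1) in remaining:
            y1 += 1
        # Grow along x while the whole neighbouring column segment is present.
        x1 = x0
        while all((x1 + 1, y) in remaining for y in range(y0, y1 + 1)):
            x1 += 1
        rectangles.append((x0, x1, y0, y1))
        remaining -= {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}
    return rectangles


def rule_for(rect, income_ranges, item):
    """Format one rectangle as a merged rule in the notation used above."""
    x0, x1, y0, y1 = rect
    low = income_ranges[y0][0]   # lower bound of the first income bin
    high = income_ranges[y1][1]  # upper bound of the last income bin
    return f'age(X, "{x0}...{x1}") ^ income(X, "{low}K...{high}K") → buys(X, "{item}")'


# Hypothetical income bin indices: bin 3 is 31K...40K, bin 4 is 41K...50K.
income_ranges = {3: (31, 40), 4: (41, 50)}
# The four rules from the example, as (age, income bin index) cells.
rule_cells = {(34, 3), (35, 3), (34, 4), (35, 4)}

rects = greedy_rectangles(rule_cells)
merged_rule = rule_for(rects[0], income_ranges, "HDTV")
print(merged_rule)
```

A greedy pass like this always covers every cell but does not guarantee the fewest or largest rectangles; it is meant only to show how merging cells combines the underlying bins.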