Data Mining

Q2 What are support and confidence?

Support and Confidence

Several objective measures of pattern interestingness exist. These are based on the structure of discovered patterns and the statistics underlying them. An objective measure for association rules of the form X ⇒ Y is rule support, representing the percentage of transactions from a transaction database that the given rule satisfies. This is taken to be the probability P(X ∪ Y), where X ∪ Y indicates that a transaction contains both X and Y, that is, every item in the union of itemsets X and Y (not that it contains either X or Y). Another objective measure for association rules is confidence, which assesses the degree of certainty of the detected association. This is taken to be the conditional probability P(Y|X), that is, the probability that a transaction containing X also contains Y. More formally, support and confidence are defined as

support(X ⇒ Y) = P(X ∪ Y),

confidence(X ⇒ Y) = P(Y|X).
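
As a rough illustration of these two definitions, the following Python sketch estimates support and confidence by counting transactions. The function names (count_containing, rule_support, rule_confidence) and the small grocery dataset are assumptions made up for this example; they do not come from any particular library or from the source text.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def rule_support(transactions, x, y):
    """Estimate P(X ∪ Y): fraction of transactions containing all of X and Y."""
    if not transactions:
        return 0.0
    return count_containing(transactions, set(x) | set(y)) / len(transactions)


def rule_confidence(transactions, x, y):
    """Estimate P(Y|X): among transactions containing X, the fraction also containing Y."""
    n_x = count_containing(transactions, x)
    if n_x == 0:
        return 0.0  # convention chosen for this sketch when X never occurs
    return count_containing(transactions, set(x) | set(y)) / n_x


# Hypothetical toy database of four transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule: bread => milk
print(rule_support(transactions, {"bread"}, {"milk"}))     # 2 of 4 transactions
print(rule_confidence(transactions, {"bread"}, {"milk"}))  # 2 of the 3 containing bread
```

Here bread and milk appear together in 2 of the 4 transactions, so the support is 50%, while 2 of the 3 transactions containing bread also contain milk, so the confidence is about 67%.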

In general, each interestingness measure is associated with a threshold, which may be controlled by the user. For example, rules that do not satisfy a confidence threshold of, say, 50% can be considered uninteresting. Rules below the threshold likely reflect noise, exceptions, or minority cases and are probably of less value.
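
The thresholding idea can be sketched as a simple filter over candidate rules. This is only a minimal illustration: the helper names, the candidate rules, and the toy transactions are hypothetical, and the 50% confidence threshold is simply the example value mentioned above.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def filter_rules(transactions, candidate_rules, min_support=0.0, min_confidence=0.5):
    """Keep rules X => Y whose support and confidence meet the given thresholds."""
    n = len(transactions)
    kept = []
    for x, y in candidate_rules:
        n_x = count_containing(transactions, x)
        n_xy = count_containing(transactions, x | y)
        if n == 0 or n_x == 0:
            continue  # neither measure can be estimated for this rule
        support = n_xy / n
        confidence = n_xy / n_x
        if support >= min_support and confidence >= min_confidence:
            kept.append((x, y, support, confidence))
    return kept


# Hypothetical toy database and candidate rules.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
candidate_rules = [
    (frozenset({"bread"}), frozenset({"milk"})),    # confidence 2/3
    (frozenset({"milk"}), frozenset({"butter"})),   # confidence 1/3
    (frozenset({"butter"}), frozenset({"bread"})),  # confidence 2/2
]

interesting = filter_rules(transactions, candidate_rules, min_confidence=0.5)
for x, y, support, confidence in interesting:
    print(sorted(x), "=>", sorted(y), f"support={support:.2f}", f"confidence={confidence:.2f}")
```

With these made-up numbers, the rule milk ⇒ butter falls below the 50% confidence threshold and is discarded, while the other two rules are kept. Both thresholds are parameters so that the user can adjust them.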

