Apriori

The Apriori algorithm uses three metrics to identify associated items. Suppose I made three purchases at a local store:

  1. {A, B, C}
  2. {B, C}
  3. {A, B}

Terminology
Transaction: each row above is called a transaction
Itemset: {A,B,C} is an itemset
Rule: {A} => {B}; when {A} is in the basket, it implies that {B} will also be in the basket

By looking at the list, we can see that {A,B} and {B,C} appear together frequently (each in two of the three transactions). To quantify the strength of such relationships, three metrics are used:

Support = σ(A ∪ B) ÷ N: the number of transactions containing both A and B, divided by the total number of transactions N (i.e. how often you see A & B together). Here A ∪ B denotes the combined itemset, and σ(X) counts the transactions containing X.
Confidence = P(B | A) = σ(A ∪ B) ÷ σ(A): how often you see B when A is already in the basket. The notation P(B | A) is called conditional probability and is read as "the probability of B given A".
Lift = P(B | A) ÷ P(B): the chance of B given A versus the chance of B appearing on its own. In layman's terms, if A & B always appear together, then B should appear more often when A is in the basket than it does alone. A value of 1 means A does not affect how often B appears; they are independent. A value > 1 means A makes B more likely, and the larger the value, the stronger the effect. A value < 1 means A & B are negatively associated: when A appears, B is less likely to appear.

Example
Rule: {A} -> {B}
Support = 2 (A & B appear together 2 times) ÷ 3 (three transactions in total) ≈ 0.67
Confidence = 2 (A & B appear together 2 times) ÷ 2 (A appears 2 times) = 1
Lift = Confidence ÷ P(B) = (2 ÷ 2) ÷ (3 ÷ 3) = 1
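These numbers are easy to check programmatically. Below is a minimal Python sketch (the transaction data comes from the example above; the helper names `count` and `metrics` are my own, not from any library) that computes all three metrics for the rule {A} -> {B}:

```python
# The three example transactions from above, represented as Python sets.
transactions = [{"A", "B", "C"}, {"B", "C"}, {"A", "B"}]
N = len(transactions)

def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def metrics(antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = antecedent | consequent
    support = count(both) / N
    confidence = count(both) / count(antecedent)
    lift = confidence / (count(consequent) / N)
    return support, confidence, lift

print(metrics({"A"}, {"B"}))  # (0.666..., 1.0, 1.0)
```

Note that the lift comes out to exactly 1 here: B appears in every transaction, so knowing that A is in the basket tells us nothing extra about B.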

The Apriori algorithm can generate many rules. Let's say there are d unique items across the transactions; then the total number of possible rules equals $3^d - 2^{d+1} + 1$

For itemset {A,B,C}: $\text{total number of rules} = 3^3 - 2^4 + 1 = 27 - 16 + 1 = 12$

It is a good exercise to write them out (a brute-force check follows the list):
A -> B
B -> A
A -> C
C -> A
A -> BC
BC -> A
B -> C
C -> B
B -> AC
AC -> B
AB -> C
C -> AB
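The enumeration below (a self-contained sketch, not from any library) confirms both the list above and the formula: every rule is a way of splitting an itemset of two or more items into a non-empty antecedent and a non-empty consequent.

```python
from itertools import combinations

items = ["A", "B", "C"]
rules = []
# Every rule splits an itemset of size >= 2 into a
# non-empty antecedent and a non-empty consequent.
for size in range(2, len(items) + 1):
    for itemset in combinations(items, size):
        for k in range(1, size):
            for antecedent in combinations(itemset, k):
                consequent = tuple(i for i in itemset if i not in antecedent)
                rules.append((antecedent, consequent))

print(len(rules))        # 12
print(3**3 - 2**4 + 1)   # 12, the formula agrees
```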

As the number of items grows, the number of rules grows much faster. It would be too time-consuming to calculate all of the rules without filtering out meaningless ones, so minsup (minimum support) and minconf (minimum confidence) thresholds are used to prune the search.

To interpret Apriori results quickly, choose a minsup and minconf to filter out weak rules, then order the remaining rules from highest lift to lowest and discard any with lift <= 1. Not all of the surviving rules make sense; they still need to be interpreted by a human.
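For real datasets this workflow is available off the shelf, for example in the third-party mlxtend library. A sketch, assuming mlxtend is installed (the minsup/minconf values are arbitrary choices for the toy data above):

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [["A", "B", "C"], ["B", "C"], ["A", "B"]]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = te.fit(transactions).transform(transactions)
df = pd.DataFrame(onehot, columns=te.columns_)

# Apply the minsup and minconf filters, then sort by lift and keep lift > 1.
frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
rules = rules[rules["lift"] > 1].sort_values("lift", ascending=False)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```

On this tiny example no rule survives the lift > 1 filter (B is in every basket, so every lift is at most 1); on realistic data the surviving rules are the candidates worth human inspection.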

References:
https://annalyzin.wordpress.com/2016/04/01/association-rules-and-the-apriori-algorithm/
https://www-users.cs.umn.edu/~kumar001/dmbook/ch6.pdf

 