The Apriori algorithm uses **three metrics** to identify associated items. Suppose I made three purchases at a local store:

- {A, B, C}
- {B, C}
- {A, B}

**Terminology**

- **Transaction**: each row above is called a transaction
- **Itemset**: {A,B,C} is an itemset
- **Rule**: {A} => {B}; when {A} is in the basket, it implies that {B} will also be in the basket

By looking at the list, we can see that the pairs {A,B} and {B,C} appear together more often than {A,C}. To quantify the strength of these relationships, three metrics are used:

- **Support** = count(A ∪ B) ÷ N, the number of transactions containing both A and B divided by the total number of transactions N (i.e. how often do you see A and B together).
- **Confidence** = P(B | A) = count(A ∪ B) ÷ count(A), i.e. how often do you see B when A is already in the basket. The notation P(B | A) is called conditional probability and is read as "the probability of B given A".
- **Lift** = P(B | A) ÷ P(B), the chance of B given A versus the chance of B appearing on its own. In layman's terms, if A and B tend to appear together, then B should show up more often in baskets containing A than it does in the baskets overall. A value of 1 means A does not affect how often B appears; they are independent. A value > 1 means A has a positive effect on B, and the larger the value, the stronger the effect. A value < 1 means that the more often A appears, the less likely B is to appear with it.
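
To make the definitions concrete, here is a minimal Python sketch that computes the three metrics from transactions represented as sets. The function names are my own illustrations, not part of any library:

```python
# A minimal sketch of the three metrics, using plain Python sets.
# The function names are illustrative, not from any particular library.

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """P(consequent | antecedent) = support of both together / support of the antecedent."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))

def lift(antecedent, consequent, transactions):
    """Confidence divided by the baseline probability of the consequent."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))
```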

**Example**

Rule: {A} -> {B}

Support = 2 (A & B appear together 2 times) ÷ 3 (three transactions in total) = 2/3

Confidence = 2 (A & B appear together 2 times) ÷ 2 (A appears 2 times) = 1

Lift = Confidence ÷ P(B) = (2 ÷ 2) ÷ (3 ÷ 3, B appears in all 3 transactions) = 1, so A does not make B any more likely
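
The same arithmetic can be checked by direct counting. The snippet below is a small self-contained sketch using the three transactions listed at the top of the post:

```python
# Checking the {A} -> {B} numbers by direct counting.
transactions = [{"A", "B", "C"}, {"B", "C"}, {"A", "B"}]
n = len(transactions)                                        # 3

count_ab = sum(1 for t in transactions if {"A", "B"} <= t)   # 2
count_a = sum(1 for t in transactions if "A" in t)           # 2
count_b = sum(1 for t in transactions if "B" in t)           # 3

support = count_ab / n                  # 2/3 ≈ 0.67
confidence = count_ab / count_a         # 2/2 = 1.0
lift = confidence / (count_b / n)       # 1.0 / 1.0 = 1.0
print(support, confidence, lift)
```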

The Apriori algorithm can generate many rules. Let's say there are `d` unique items in the transactions; then the total number of possible rules is $3^d - 2^{d+1} + 1$.

For itemset {A,B,C}: $\text{total number of rules} = 3^3 - 2^4 + 1 = 27 - 16 + 1 = 12$

It is a good exercise to write out all twelve:

- A -> B
- B -> A
- A -> C
- C -> A
- B -> C
- C -> B
- A -> BC
- BC -> A
- B -> AC
- AC -> B
- C -> AB
- AB -> C
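
These twelve rules can also be enumerated programmatically. The sketch below uses only the standard library (the helper name `all_rules` is my own) to split {A, B, C} into every pair of disjoint, non-empty antecedent and consequent sets, and checks the count against the formula above:

```python
from itertools import combinations

def all_rules(items):
    """Enumerate every rule X -> Y where X and Y are disjoint, non-empty subsets of `items`."""
    items = set(items)
    rules = []
    # Choose the antecedent X as any non-empty subset,
    # then the consequent Y as any non-empty subset of the remaining items.
    for i in range(1, len(items) + 1):
        for antecedent in combinations(sorted(items), i):
            rest = items - set(antecedent)
            for j in range(1, len(rest) + 1):
                for consequent in combinations(sorted(rest), j):
                    rules.append((antecedent, consequent))
    return rules

rules = all_rules({"A", "B", "C"})
d = 3
print(len(rules))                            # 12
print(len(rules) == 3**d - 2**(d + 1) + 1)   # True
```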

As the number of items grows, the number of rules grows much more quickly. It would be too time-consuming to calculate all of the rules without filtering out *meaningless* data. Thus, `minsup` (minimum support) and `minconf` (minimum confidence) thresholds are used.

To interpret the Apriori results quickly, choose a `minsup` and a `minconf` to filter out useless data, then order the rules from highest `Lift` to lowest and discard any with `Lift <= 1`. Not all of the remaining rules make sense; they still need to be interpreted by a human.
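
As a practical illustration, here is a sketch of that workflow using the `mlxtend` library, assuming it is installed; the exact API can differ slightly between versions, and the 0.5 / 0.7 thresholds are arbitrary choices for this toy dataset rather than recommendations:

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [["A", "B", "C"], ["B", "C"], ["A", "B"]]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

# minsup: keep itemsets appearing in at least 50% of transactions.
frequent = apriori(onehot, min_support=0.5, use_colnames=True)

# minconf: keep rules with confidence >= 0.7, then rank by lift (highest first).
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
rules = rules.sort_values("lift", ascending=False)

# On this toy dataset every surviving rule has lift == 1 (B is in every basket),
# so a `lift > 1` filter would leave nothing; on real data it prunes weak rules.
strong = rules[rules["lift"] > 1]

print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```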

**Reference**:

https://annalyzin.wordpress.com/2016/04/01/association-rules-and-the-apriori-algorithm/

https://www-users.cs.umn.edu/~kumar001/dmbook/ch6.pdf