Apriori Algorithm

Tags:

I've heard about the Apriori algorithm several times before but never got the time or the opportunity to dig into it, can anyone explain to me in a simple way the workings of this algorithm? Also, a basic example would make it a lot easier for me to understand.

643

asked Aug 08 '09 08:08

Alix Axel

3 Answers

Apriori Algorithm

It is a candidate-generation-and-test approach for frequent pattern mining in datasets. There are two things you have to remember.

Apriori Pruning Principle - If any itemset is infrequent, then its superset should not be generated/tested.

Apriori Property - A given (k+1)-itemset is a candidate (k+1)-itemset only if everyone of its k-itemset subsets are frequent.

Now, here is the apriori algorithm in 4 steps.

Initially, scan the database/dataset once to get the frequent 1-itemset.
Generate length k+1 candidate itemsets from length k frequent itemsets.
Test the candidates against the database/dataset.
Terminate when no frequent or candidate set can be generated.

Solved Example

Suppose there is a transaction database as follows with 4 transactions including their transaction IDs and items bought with them. Assume the minimum support - min_sup is 2. The term support is the number of transactions in which a certain itemset is present/included.

Transaction DB

Click to copy

tid  | items
-------------
10   | A,C,D
20   | B,C,E
30   | A,B,C,E
40   | B,E

Now, let's create the candidate 1-itemsets by the 1st scan of the DB. It is simply called as the set of C_1 as follows.

Click to copy

itemset | sup
-------------
  {A}   |  2
  {B}   |  3
  {C}   |  3
  {D}   |  1
  {E}   |  3

If we test this with min_sup, we can see {D} does not satisfy the min_sup of 2. So, it will not be included in the frequent 1-itemset, which we simply call as the set of L_1 as follows.

Click to copy

itemset | sup
-------------
  {A}   |  2
  {B}   |  3
  {C}   |  3
  {E}   |  3

Now, let's scan the DB for the 2nd time, and generate candidate 2-itemsets, which we simply call as the set of C_2 as follows.

Click to copy

itemset | sup
-------------
  {A,B} |  1
  {A,C} |  2
  {A,E} |  1
  {B,C} |  2
  {B,E} |  3
  {C,E} |  2

As you can see, {A,B} and {A,E} itemsets do not satisfy the min_sup of 2 and hence they will not be included in the frequent 2-itemset, L_2

Click to copy

itemset | sup
-------------
  {A,C} |  2
  {B,C} |  2
  {B,E} |  3
  {C,E} |  2

Now let's do a 3rd scan of the DB and get candidate 3-itemsets, C_3 as follows.

Click to copy

itemset  | sup
-------------
 {A,B,C} |  1
 {A,B,E} |  1
 {A,C,E} |  1
 {B,C,E} |  2

You can see that, {A,B,C}, {A,B,E} and {A,C,E} does not satisfy min_sup of 2. So they will not be included in frequent 3-itemset, L_3 as follows.

Click to copy

itemset  | sup
-------------
 {B,C,E} |  2

Now, finally, we can calculate the support (supp), confidence (conf) and lift (interestingness value) values of the Association/Correlation Rules that can be generated by the itemset {B,C,E} as follows.

Click to copy

         Rule       | supp  |  conf   |  lift
    -------------------------------------------
     B ->  C & E    |  50%  |  66.67% |  1.33
     E ->  B & C    |  50%  |  66.67% |  1.33
     C ->  E & B    |  50%  |  66.67% |  1.77
     B & C ->  E    |  50%  |  100%   |  1.33
     E & B ->  C    |  50%  |  66.67% |  1.77
     C & E ->  B    |  50%  |  100%   |  1.33

199

answered Oct 16 '22 16:10

Thilina Samiddhi

See Top 10 algorithms in data mining (free access) or The Top Ten Algorithms in Data Mining. The latter gives a detailed description of the algorithm, together with details on how to get optimized implementations.

answered Oct 16 '22 16:10

lmsasu

Well, I would assume you've read the wikipedia entry but you said "a basic example would make it a lot easier for me to understand". Wikipedia has just that so I'll assume you haven't read it and suggest that you do.

Read the wikipedia article.

answered Oct 16 '22 16:10

Spencer Ruport

Related questions
                            
                                K- Means algorithm
                            
                                Algorithm for Finding Good, Reliable Players
                            
                                How do I generate a tournament schedule in Ruby?
                            
                                How to get unrotated display object width/height of a rotated display object?
                            
                                Is there any way to divide rgb color palette?
                            
                                find inorder successor in BST without using any extra space
                            
                                How to best create a Timer Function in R
                            
                                Which is the appropriate data structure?
                            
                                Rotate array by arbitrary step size without creating second array
                            
                                Iterate through an array of arbitrary dimension
                            
                                Search Algorithm to find the k lowest values in a list
                            
                                Count word frequency of huge text file [duplicate]
                            
                                How to find connected components in Matlab?
                            
                                Flood Fill in Python
                            
                                Reverse integer bitwise without using loop
                            
                                Check if string contains the substring's sequence of characters in order, but not necessarily right next to each other
                            
                                Haskell - C stack overflow
                            
                                Primitive Calculator - Dynamic Approach
                            
                                How do you iterate over a tree?
                            
                                How can I improve this square root method?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Apriori Algorithm

Tags:

algorithm

apriori

Alix Axel

People also ask

3 Answers

Apriori Algorithm

Solved Example

Thilina Samiddhi

lmsasu

Spencer Ruport

Recent Activity

Donate For Us