Association Rules and Sequential Patterns
Prof.dr.ing. Florin Radulescu
Universitatea Politehnica din Bucureşti
Objectives
Association rule learning was introduced in the article “Mining Association Rules Between Sets of Items in Large Databases” by Agrawal, Imielinski and Swami (see references), presented at the 1993 SIGMOD Conference (ACM SIGMOD stands for ACM Special Interest Group on Management of Data).
One of the best known applications is finding relationships (rules) between products as recorded by POS systems in supermarkets.
For example, the statement that 85% of the baskets that contain bread also contain mineral water is a rule with bread as antecedent and mineral water as consequent.
Objectives
The original article (Agrawal et al.) lists some
examples of the expected results:
Find all rules that have “Diet Coke” as
consequent,
Find all rules that have “bagels” in the
antecedent,
Find all rules that have “sausage” in the
antecedent and “mustard” in the consequent,
Find all the rules relating items located on
shelves A and B in the store,
Find the “best” k rules (considering rule support)
that have “bagels” in the consequent
Frequent itemsets and rules
Items and transactions
Let I = {i1, i2, …, in} be a set of items. For example, items may
be all products sold in a supermarket or all words contained in
some documents.
A transaction t is a set of items, with t ⊆ I. Examples of
transactions are market baskets containing products or
documents containing words.
A transaction dataset (or database) T is a set of transactions, T = {t1, t2, …, tm}. Each transaction may contain a different number of items and the dataset may be stored in a DBMS-managed database or in a text file.
An itemset S is a subset of I. If v = |S| is the number of items in S (the cardinality of S), then S is called a v-itemset.
Items and transactions
The support of an itemset X, sup(X), is the number (or proportion) of transactions in T containing X. The support may be given as an absolute value (number of transactions), as a proportion, or as a percentage.
In many cases we want to find the itemsets with a support greater than or equal to a given threshold s. Such an itemset is called a frequent itemset and, in the market basket example, it contains items that can be found together in many baskets (where the measure of ‘many’ is s).
These frequent itemsets are the source of all ‘powerful’ association rules.
Example 1
Let us consider a dataset containing market basket
transactions:
I = {laptop, mouse, tablet, hard-drive, monitor,
keyboard, DVD-drive, CD-drive, flash-memory, . . .}
T = {t1, t2, …, tm}
t1 = {laptop, mouse, tablet}
t2 = {hard-drive, monitor, laptop, keyboard, DVD-drive}
. . .
tm = {keyboard, mouse, tablet}
Example 2
If items are words and transactions are documents,
where each document is considered a bag of words,
then we can have:
T = {Doc1, Doc2, …, Doc6}
Doc1 = {rule, tree, classification}
Doc2 = {relation, tuple, join, algebra, recommendation}
Doc3 = {variable, loop, procedure, rule}
Doc4 = {clustering, rule, tree, recommendation}
Doc5 = {join, relation, selection, projection,
classification}
Doc6 = {rule, tree, recommendation}
Example 2
In that case:
sup({rule, tree}) = 3 or 50% or 0.5
sup({relation, join}) = 2 or 33.33% or 1/3
If the threshold is s = 50% (or 0.5) then {rule, tree} is
frequent and {relation, join} is not.
Association rules
If m is the number of transactions in T then:
sup(X → Y) = sup(X ∪ Y) - as absolute value or
sup(X → Y) = sup(X ∪ Y) / m - as proportion
conf(X → Y) = sup(X ∪ Y) / sup(X)
where the support of an itemset is given as absolute value
(number of transactions).
[Diagram: within the set T of m transactions, the transactions containing X ∪ Y are a subset of the transactions containing X.]
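As an illustration, a minimal Python sketch of these two measures, applied to the documents of Example 2 (function and variable names are illustrative):

# Sketch: support and confidence computed from a list of transactions,
# where each transaction is a set of items.
def support(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def rule_measures(X, Y, transactions):
    """Return (support, confidence) of the rule X -> Y, as proportions."""
    m = len(transactions)
    sup_xy = support(X | Y, transactions)
    sup_x = support(X, transactions)
    return sup_xy / m, (sup_xy / sup_x if sup_x else 0.0)

# The six documents of Example 2:
docs = [
    {"rule", "tree", "classification"},
    {"relation", "tuple", "join", "algebra", "recommendation"},
    {"variable", "loop", "procedure", "rule"},
    {"clustering", "rule", "tree", "recommendation"},
    {"join", "relation", "selection", "projection", "classification"},
    {"rule", "tree", "recommendation"},
]
print(rule_measures({"rule"}, {"tree"}, docs))   # (0.5, 0.75)
print(rule_measures({"tree"}, {"rule"}, docs))   # (0.5, 1.0)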
Example 3
Consider the set of six transactions in Example 2:
Doc1 = {rule, tree, classification}
Doc2 = {relation, tuple, join, algebra,
recommendation}
Doc3 = {variable, loop, procedure, rule}
Doc4 = {clustering, rule, tree, recommendation}
Doc5 = {join, relation, selection, projection,
classification}
Doc6 = {rule, tree, recommendation}
Example 3
With a minimum support of 50% we find that {rule, tree} is a frequent itemset. The two rules derived from this itemset have the same support but different confidences:
rule → tree
with sup = 50% and conf = 3 / 4 = 75%, and
tree → rule
with sup = 50% and conf = 3 / 3 = 100%.
If the minimum confidence required is 80% then only the second rule is kept, the first being considered not powerful enough.
Finding association rules
Every rule X → Y with support of at least minsup comes from a frequent itemset X ∪ Y, so the process of finding all the rules given the minimum support and the minimum confidence has three steps:
Step 1. Find all frequent itemsets containing at least
two items, considering the given minimum support
minsup.
Step 2. For each frequent itemset U found in step 1
list all splits (X, Y) with X∩Y=∅ and X∪Y=U. Each
split generates a rule X → Y.
Step 3. Compute the confidence of each rule. Keep
only the rules with confidence at least minconf.
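A possible Python sketch of Steps 2 and 3, assuming the frequent itemsets and their absolute supports are already available from Step 1 (names are illustrative):

# Sketch: generating rules from frequent itemsets.
# `frequent` maps every frequent itemset (frozenset), including the
# 1-itemsets, to its absolute support; minconf is a proportion.
from itertools import combinations

def rules_from_itemsets(frequent, minconf):
    rules = []
    for U, sup_u in frequent.items():
        if len(U) < 2:
            continue
        # every non-empty proper subset X of U gives a split (X, U \ X)
        for r in range(1, len(U)):
            for X in map(frozenset, combinations(U, r)):
                Y = U - X
                conf = sup_u / frequent[X]      # sup(X ∪ Y) / sup(X)
                if conf >= minconf:
                    rules.append((X, Y, conf))
    return rules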
Goals for mining transactions
Goal 1: Find frequent itemsets. Frequent
itemsets can be used not only to find rules but also
for marketing purposes.
As an example, in a supermarket, frequent itemsets help marketers place items in an effort to control the way customers walk through the store:
Items that are sold together may be placed in distant corners of the store, so that customers must go from one product to the other, possibly putting other products in the basket on the way.
Goal 2
Goal 2: Find association rules. The classical Diapers → Beer example on the next slides shows that such rules can be unexpected and valuable.
Diapers → Beer
In [Whitehorn 06] this example is described as follows:
“Some time ago, Wal-Mart decided to combine the
data from its loyalty card system with that from its
point of sale systems.
The former provided Wal-Mart with demographic data
about its customers, the latter told it where, when and
what those customers bought.
Once combined, the data was mined extensively and
many correlations appeared.
Some of these were obvious; people who buy gin are
also likely to buy tonic. They often also buy lemons.
However, one correlation stood out like a sore thumb
because it was so unexpected.
Diapers → Beer
On Friday afternoons, young American males who
buy diapers (nappies) also have a predisposition
to buy beer.
No one had predicted that result, so no one would
ever have even asked the question in the first place.
Hence, this is an excellent example of the difference
between data mining and querying.”
Goal 3
In [Ullman 03-09] a third goal for mining transactions is also listed:
Goal 3: Find causalities. In the case of the rule Diapers → Beer, a natural question is whether the left part of the rule (buying diapers) causes the right part (buying beer).
Causal rules can be used in marketing: a low price for diapers will attract diaper buyers, and an increase in the beer price will then increase the overall sales figures.
Algorithms
There are many algorithms for finding frequent
itemsets and consequently the association rules
in a dataset.
All these algorithms are designed for huge volumes of data, meaning that the dataset is too large to be loaded and processed in main memory.
For that reason, minimizing the number of times the data are read from disk becomes a key feature of each algorithm.
Apriori algorithm
Apriori principle
The Apriori principle states that any subset of a
frequent itemset is also a frequent itemset.
Example 4: If {1, 2, 3, 4} is a frequent itemset then all
its four subsets with 3 values are also frequent: {1,
2, 3}, {1, 2, 4}, {1, 3, 4} and {2, 3, 4}.
Consequently, each frequent v-itemset is the union of v frequent (v-1)-itemsets.
That means we can determine the frequent itemsets of size v by examining only the set of all frequent itemsets of size (v-1).
Apriori principle
It is a level-wise approach where each step
requires a full scan of the dataset (residing on
disk).
A diagram is presented in the next slide where Ci
is the set of candidates for frequent i-itemsets
and Li is the actual set of frequent i-itemsets.
C1 is the set of all items found in transactions (a subset of I) and may be obtained either as the union of all transactions in T or by taking C1 = I (in which case some items may have zero support).
Using Apriori Principle
[Diagram: C1 → L1 → C2 → L2 → C3 → L3 → … → Ck → Lk. Each candidate set Ci is counted against the dataset (one full scan) to obtain Li, and Li is then used to generate Ci+1.]
Apriori Algorithm
The algorithm is described in [Agrawal, Srikant 94] and
uses the level-wise approach described before:
A first scan of the dataset produces L1 (the set of frequent items). For each transaction t in T and for each item a in t, the count of a is incremented (a.count++). At the end of the scan, L1 contains all items with a count of at least minsup (given as an absolute value).
For k=2, 3, … the process continues by generating the
set of candidates Ck and then counting the support of
each candidate by a full scan of the dataset.
Process ends when Lk is empty.
Apriori Algorithm
Algorithm Apriori (T)
L1 = scan (T);
for (k = 2; Lk-1 ≠ ∅; k++) do
Ck ← apriori-gen(Lk-1);
for each transaction t ∈ T do
for each candidate c ∈ Ck do
if c is contained in t then
c.count++;
end
end
Lk ← {c ∈ Ck | c.count ≥ minsup}
end
return L = ∪ Lk;
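A direct Python rendering of this pseudocode (a sketch, not the original implementation); apriori_gen is the candidate-generation step described on the next slides, where a possible version is also sketched:

# Sketch: the level-wise Apriori loop; transactions is a list of item sets
# and minsup is an absolute count.
def apriori(transactions, minsup):
    # first scan: count single items and keep the frequent ones
    counts = {}
    for t in transactions:
        for a in t:
            counts[a] = counts.get(a, 0) + 1
    L = [{frozenset([a]) for a, c in counts.items() if c >= minsup}]
    while L[-1]:
        Ck = apriori_gen(L[-1])                 # join + prune
        counts = {c: 0 for c in Ck}
        for t in transactions:                  # one full scan per level
            for c in Ck:
                if c <= t:
                    counts[c] += 1
        L.append({c for c, n in counts.items() if n >= minsup})
    return set().union(*L)                      # L = union of all Lk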
Candidate generation
Candidate generation is also described in the original algorithm as having two steps: the join step and the prune step. The former builds a larger set of candidates and the latter removes those that cannot be frequent.
In the join step each candidate is obtained from
two different frequent (k-1)-itemsets containing
(k-2) identical items:
Ck = { {i1, …, ik-2, ik-1, i'k-1} | ∃ p = {i1, …, ik-2, ik-1} ∈ Lk-1, ∃ q = {i1, …, ik-2, i'k-1} ∈ Lk-1, ik-1 < i'k-1 }
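A possible Python sketch of apriori-gen (the join step above followed by the prune step); on Example 6 below it generates {3, 4, 5} in the join and removes it in the prune:

# Sketch: candidate generation from the frequent (k-1)-itemsets.
# Lk_1 is a set of frozensets of comparable items (e.g. numbers or strings).
from itertools import combinations

def apriori_gen(Lk_1):
    prev = sorted(sorted(s) for s in Lk_1)       # itemsets as sorted lists
    candidates = set()
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            p, q = prev[i], prev[j]
            if p[:-1] == q[:-1] and p[-1] < q[-1]:          # join step
                c = frozenset(p) | {q[-1]}
                # prune step: every (k-1)-subset of c must be frequent
                if all(frozenset(s) in Lk_1
                       for s in combinations(c, len(c) - 1)):
                    candidates.add(c)
    return candidates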
Candidate generation
In the prune step a candidate is removed if at least one of its (k-1)-item subsets is not in Lk-1: by the Apriori principle such a candidate cannot be frequent.
Example 5
Consider again the set of six transactions in
Example 2:
Doc1 = {rule, tree, classification}
Doc2 = {relation, tuple, join, algebra,
recommendation}
Doc3 = {variable, loop, procedure, rule}
Doc4 = {clustering, rule, tree, recommendation}
Doc5 = {join, relation, selection, projection,
classification}
Doc6 = {rule, tree, recommendation}
and a minimum support of 50% (minsup=3).
Step 1
The first scan counts each item: rule appears in 4 documents, tree in 3, recommendation in 3, and every other item in at most 2 documents. With minsup = 3, L1 = {rule, tree, recommendation}.
Step 2
Considering the order rule < tree < recommendation, the join step gives
C2 = { {rule, tree}, {rule, recommendation}, {tree, recommendation} }.
The prune step does not modify C2.
The second scan of the transaction dataset gives the following pair supports:
Step 2
{rule, tree} 3
{rule, recommendation} 2
{tree, recommendation} 2
With minsup = 3, only {rule, tree} is frequent, so L2 = { {rule, tree} }.
Example 6
Consider the transaction dataset {(1, 2, 3, 5), (2, 3, 4), (3, 4, 5)} and a minsup of s = 50%. As an absolute value this is 3/2 = 1.5 transactions; because a support count must be an integer, s is rounded up to 2.
C1 = {1, 2, 3, 4, 5}
L1 = {2, 3, 4, 5}
C2 = {(2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5) }
L2 = {(2, 3), (3, 4), (3, 5)}
After the join step C3 = {(3, 4, 5)} - obtained by
joining (3, 4) and (3, 5).
In the prune step (3, 4, 5) is removed because
its subset (4, 5) is not in L2
Example 6
After the prune step C3 = ∅, so L3 = ∅, and the
process stops. L = L1 ∪ L2 = {(2), (3), (4), (5), (2,
3), (3, 4), (3, 5)} or as maximal itemsets L = L2.
The rules generated from itemsets are:
2 → 3, 3 → 2, 3 → 4, 4 → 3, 3 → 5, 5 → 3.
The support of these rules is at least 50%. Considering a minconf equal to 80%, only 3 rules have a confidence greater than or equal to minconf: 2 → 3, 4 → 3 and 5 → 3 (with conf = 2/2 = 100%).
The rules having 3 in the antecedent have a confidence of only 67% (because sup(3) = 3 and sup(2, 3) = sup(3, 4) = sup(3, 5) = 2).
FP-Growth
Build the FP-tree
Pass 1: scan the dataset and count the support of each item; items below minsup are discarded and the remaining (frequent) items are sorted in decreasing order of support.
Pass 2: each transaction is processed keeping only its frequent items, listed in decreasing order of support, and is inserted as a branch of the tree starting from the root (labeled null).
Build the FP-tree
Pass 2 - cont.:
Each node has a counter.
If two transactions have the same prefix, the two branches overlap on the nodes of the common prefix and the counters of those nodes are incremented.
Also, all nodes holding the same item are linked together (the dotted, ‘orthogonal’ paths in the figures).
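A minimal sketch of this insertion step (illustrative class and function names); running it on the filtered transactions of Example 7 below reproduces the node counts shown on the following slides:

# Sketch: inserting transactions into an FP-tree. Each transaction is
# assumed to be already reduced to its frequent items, sorted by
# decreasing support, as described above.
class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}                       # item -> FPNode

def insert_transaction(root, items, node_links):
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = FPNode(item, node)
            node.children[item] = child
            node_links.setdefault(item, []).append(child)   # dotted paths
        child.count += 1                         # shared prefixes add counts
        node = child

root, node_links = FPNode(None, None), {}
for t in [["a", "b", "c"], ["a", "b", "c", "d"], ["a", "b"],
          ["a", "b", "d"], ["c", "d"]]:
    insert_transaction(root, t, node_links)
# root.children["a"].count == 4 and root.children["a"].children["b"].count == 4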
Example 7
TID Items
1 a, b, c
2 a, b, c, d
3 a, b, f
4 a, b, d
5 c, d, e
Example 7
Item counts after the first pass:
a 4, b 4, c 3, d 3, e 1, f 1
With minsup = 2, items e and f are not frequent and are removed from the transactions.
Example 7
After removing the infrequent items, the transactions become:
TID Items
1 a, b, c
2 a, b, c, d
3 a, b
4 a, b, d
5 c, d
Example 7
[FP-tree for Example 7: root (null) → a:4 → b:4; b:4 branches to c:2 (followed by d:1) and to d:1; a separate branch root → c:1 → d:1. The header table lists a:4, b:4, c:3, d:3, each item linked to its nodes in the tree.]
Extract frequent itemsets
• After building the FP-tree the algorithm starts to build
partial trees (called conditional FP-trees) ending with a
given item (a suffix).
• The item is not present in the tree but all frequent
itemsets generated from that conditional tree will contain
that item.
• In building the conditional FP-tree, non-frequent items
are skipped (but the branch remains if there are still
nodes on it).
• In the previous example trees ending with d, c, b and a
may be built.
• For each tree, the dotted paths (node links) give the starting points and the algorithm walks up toward the root, propagating the counters (summing them where two branches reach the same node).
d conditional FP-tree
[d-conditional FP-tree: root → a:2 → b:2 → c:1, plus a separate branch root → c:1. Header table: a 2, b 2, c 2.]
c conditional FP-tree
[c-conditional FP-tree: root → a:2 → b:2. Header table: a 2, b 2.]
b conditional FP-tree
[b-conditional FP-tree: root → a:4. Header table: a 4.]
Results
If there is more than one item with support greater than or equal to minsup in a conditional FP-tree, then the algorithm is run again on that conditional tree in order to find itemsets with more than two items.
For example, if the minsup=2 then from the c conditional
FP-tree the algorithm will produce {a, c} and {b, c}. Then
the same procedure may be run against this tree for
suffix bc, obtaining {a, b, c}. Also from the d conditional
FP-tree first {c, d}, {b, d} and {a, d} are obtained, and
then, for the suffix bd, {a, b, d} is obtained.
Suffix cd leads only to infrequent items in its conditional tree, and suffix ad produces {a, d}, which was already obtained.
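For quick experiments, ready-made implementations exist; a usage sketch with the mlxtend library (an assumption: mlxtend and pandas are installed, and the API is the one documented for its frequent_patterns module), applied to the transactions of Example 7:

# Sketch: FP-Growth via mlxtend on the Example 7 transactions.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

transactions = [["a", "b", "c"], ["a", "b", "c", "d"], ["a", "b", "f"],
                ["a", "b", "d"], ["c", "d", "e"]]
te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions),
                  columns=te.columns_)
# min_support is a proportion: minsup = 2 of 5 transactions = 0.4
print(fpgrowth(df, min_support=0.4, use_colnames=True))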
Data formats
Table format
In this case a dataset is stored in a two-column table:
Transactions(Transaction-ID, Item) or
T(TID, Item)
where all the lines of a transaction have the same
TID and the primary key contains both columns
(so T does not contain duplicate rows).
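A small sketch of turning this two-column representation into the in-memory transaction sets used by the algorithms above (the rows shown are illustrative):

# Sketch: grouping (TID, Item) rows into transactions.
# `rows` could come from any database cursor; here it is a plain list.
rows = [(1, "laptop"), (1, "mouse"), (1, "tablet"),
        (2, "hard-drive"), (2, "monitor"), (2, "laptop")]

transactions = {}
for tid, item in rows:
    transactions.setdefault(tid, set()).add(item)
# transactions[1] == {"laptop", "mouse", "tablet"}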
Data formats
Text file format
In this case the dataset is a text file containing one transaction per line. Each line may start with a transaction ID (TID) as its first element, or the TID may be missing, the line number acting as a virtual TID.
Example 8:
10 12 34 67 78 45 89 23 67 90 → line 1
789 12 45 678 34 56 32 → line 2
........
Also in this case, any software package must either have a native text-file input option or must contain a conversion module from text to the needed format.
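A possible reader for this format (an illustrative helper, assuming whitespace-separated items and, optionally, a leading TID as in Example 8):

# Sketch: reading one transaction per line from a text file.
def read_transactions(path, has_tid=True):
    transactions = []
    with open(path) as f:
        for line in f:
            items = line.split()
            if not items:
                continue                 # skip empty lines
            if has_tid:
                items = items[1:]        # drop the leading TID
            transactions.append(set(items))
    return transactions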
Data formats
Custom format
Many data mining packages use a custom format for the input
data.
An example is the ARFF format used by Weka, presented below. Weka (Waikato Environment for Knowledge Analysis) is a popular open-source suite of machine learning software developed at the University of Waikato, New Zealand.
ARFF stands for Attribute-Relation File Format. An .arff file is an ASCII file containing a table (also called a relation). The file has two parts:
A Header part containing the relation name, the list of
attributes and their types.
A Data part containing the row values of the relation,
comma separated.
ARFF example
% 1. Title: Iris Plants Database
%
% 2. Sources:
% (a) Creator: R.A. Fisher
% (b) Donor: Michael Marshall (MARSHALL%[email protected])
% (c) Date: July, 1988
%
@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
Class association rules (CARs)
As for classical association rules, the model for CARs
considers a set of items I = {i1, i2, …, in} and a set of
transactions T = {t1, t2, …, tm}. The difference is that
each transaction is labeled with a class label c, where
c ∈ C, with C containing all class labels and C ∩ I = ∅.
A class association rule is a construction with the
following syntax:
X→y
where X ⊆ I and y ∈ C.
The definitions of support and confidence for a class association rule are the same as for association rules.
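A minimal sketch of these measures for a CAR (illustrative names); the printed values match Example 10 on the next slides:

# Sketch: support and confidence of a CAR X -> y over labeled transactions,
# each represented as a pair (set of items, class label).
def car_measures(X, y, labeled_transactions):
    m = len(labeled_transactions)
    sup_x = sum(1 for items, _ in labeled_transactions if X <= items)
    sup_xy = sum(1 for items, c in labeled_transactions if X <= items and c == y)
    return sup_xy / m, (sup_xy / sup_x if sup_x else 0.0)

docs = [({"rule", "tree", "classification"}, "datamining"),
        ({"relation", "tuple", "join", "algebra", "recommendation"}, "database"),
        ({"variable", "loop", "procedure", "rule"}, "programming"),
        ({"clustering", "rule", "tree", "recommendation"}, "datamining"),
        ({"join", "relation", "selection", "projection", "classification"}, "database"),
        ({"rule", "tree", "recommendation"}, "datamining")]
print(car_measures({"rule"}, "datamining", docs))          # (0.5, 0.75)
print(car_measures({"recommendation"}, "database", docs))  # (0.166..., 0.333...)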
Example 10
Consider the set of six transactions in Example 2, now
labeled with class labels from C = {database,
datamining, programming}:
Doc1 {rule, tree, classification} datamining
Doc2 {relation, tuple, join, algebra, recommendation} database
Doc3 {variable, loop, procedure, rule} programming
Doc4 {clustering, rule, tree, recommendation} datamining
Doc5 {join, relation, selection, projection, classification} database
Doc6 {rule, tree, recommendation} datamining
Example 10
Then the CARs
rule → datamining
recommendation → database
have:
sup(rule → datamining) = 3/6 = 50%,
conf(rule → datamining) = 3/4 = 75%.
sup(recommendation → database) = 1/6 ≅ 17%,
conf(recommendation → database) = 1/3 ≅ 33%
For a minsup=50% and a minconf=50% the first rule
stands and the second is rejected.
Mining CARs
Algorithm for mining CARs using a modified Apriori
algorithm (see [Liu 11] ):
In the first pass over the data, the algorithm computes F1, the set of CARs with a single item on the left side that satisfy the given minsup and minconf.
At step k, Ck is built from Fk-1 and then, by passing through the data and counting the support and the confidence of each member of Ck, Fk is determined.
Candidate generation is almost the same as for association rules, with the only difference that in the join step only CARs with the same class on the right side are joined.
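A sketch of the class-restricted join (an illustrative representation: each ruleitem is a pair of an itemset and a class label); the prune step can then be applied exactly as in apriori-gen:

# Sketch: joining two frequent (k-1)-ruleitems only when they share their
# first k-2 items and predict the same class.
def car_join(Fk_1):
    ruleitems = sorted((sorted(items), y) for items, y in Fk_1)
    candidates = set()
    for i in range(len(ruleitems)):
        for j in range(i + 1, len(ruleitems)):
            (p, yp), (q, yq) = ruleitems[i], ruleitems[j]
            if yp == yq and p[:-1] == q[:-1] and p[-1] < q[-1]:
                candidates.add((frozenset(p) | {q[-1]}, yp))
    return candidates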
Sequential patterns model
Itemset: a set of n distinct items
I = {i1, i2, …, in }
Event: a non-empty collection of items; we can
assume that items are in a given (e.g.
lexicographic) order: (i1,i2 … ik)
Sequence: an ordered list of events: <e1 e2 … em>
Length of a sequence: the number of items in
the sequence
Example: <AM, CDE, AE> has length 7
Sequential patterns model
Size of a sequence: the number of itemsets in the
sequence
Example: <AM, CDE, AE> has size 3
k-sequence: a sequence with k items, i.e. with length k
Example: <B, AC> is a 3-sequence
Subsequence and supersequence: <e1 e2 …eu>
is a subsequence of or included in <f1 f2 …fv> (and
the last is a supersequence of the first sequence
or contains that sequence) if there are some
integers 1 ≤ j1 < j2 < … < ju-1 < ju ≤ v such that e1
⊆ fj1 and e2 ⊆ fj2 and … and eu ⊆ fju.
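A small sketch of this inclusion test, representing a sequence as a list of events and an event as a set of items; the two calls reproduce the first statements of Example 11 below:

# Sketch: is sequence e a subsequence of sequence f?
def is_subsequence(e, f):
    j = 0
    for event in e:
        # find the next event of f that contains the current event of e
        while j < len(f) and not (event <= f[j]):
            j += 1
        if j == len(f):
            return False
        j += 1
    return True

print(is_subsequence([{"A"}, {"B", "C"}],
                     [{"A", "B"}, {"E"}, {"A", "B", "C", "D"}]))   # True
print(is_subsequence([{"A", "B"}, {"C"}], [{"A", "B", "C"}]))      # False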
Sequential patterns model
Sequence database X: a set of sequences
Frequent sequence (or sequential pattern): a sequence included in at least s members of the sequence database X, where s is the user-specified minimum support.
The number of sequences from X containing a given sequence is called the support of that sequence.
So, a frequent sequence is a sequence with a support of at least s, where s is the minsup specified by the user.
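Using the previous sketch, the support of a sequence can be counted directly; the database below is the one of Example 11 on the next slides:

# Sketch: support of a sequence = number of database sequences containing it.
def sequence_support(seq, database):
    return sum(1 for s in database if is_subsequence(seq, s))

db = [[{"A"}, {"B"}, {"C"}],
      [{"A", "B"}, {"C"}, {"A", "D"}],
      [{"A", "B", "C"}, {"B", "C", "E"}],
      [{"B"}, {"E"}]]
print(sequence_support([{"B"}, {"C"}], db))    # 3: contained in sequences 1, 2, 3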
Example 11
<A, BC> is a subsequence of <AB, E, ABCD>
<AB, C> is not a subsequence of <ABC>
Consider a minsup=50% and the following sequence
database:
Sequence ID Sequence
1 <A, B, C>
2 <AB, C, AD>
3 <ABC, BCE>
5 <B, E>
Algorithms
Apriori
GSP (Generalized Sequential Pattern)
FreeSpan (Frequent pattern-projected
Sequential pattern mining)
PrefixSpan (Prefix-projected Sequential
pattern mining)
SPADE (Sequential PAttern Discovery
using Equivalence classes)
GSP Algorithm
GSP (Generalized Sequential Pattern, [Srikant, Agrawal 96]) follows the same level-wise, generate-and-test approach as Apriori: the frequent 1-sequences are found first, then candidate k-sequences are generated from the frequent (k-1)-sequences (a join followed by a prune based on the Apriori principle) and their supports are counted with a pass over the sequence database; the process stops when no new frequent sequences are found.
Summary
This third course presented:
What are frequent itemsets and rules and their
relationship
Apriori and FP-growth algorithms for discovering
frequent itemsets.
Data formats for discovering frequent itemsets
What are class association rules and how they can be mined
An introduction to sequential patterns and the GSP
algorithm
Next week: Supervised learning – part 1.
References
[Agrawal, Imielinski, Swami 93] R. Agrawal, T. Imielinski, A. Swami: "Mining Association Rules Between Sets of Items in Large Databases", SIGMOD Conference 1993: 207-216 (http://rakesh.agrawal-family.com/papers/sigmod93assoc.pdf)
[Agrawal, Srikant 94] Rakesh Agrawal and Ramakrishnan Srikant. Fast
algorithms for mining association rules in large databases. Proceedings
of the 20th International Conference on Very Large Data Bases, VLDB,
pages 487-499, Santiago, Chile, September 1994
(http://rakesh.agrawal-family.com/papers/vldb94apriori.pdf)
[Srikant, Agrawal 96] R. Srikant, R. Agrawal: "Mining Sequential
Patterns: Generalizations and Performance Improvements", to appear
in Proc. of the Fifth Int'l Conference on Extending Database Technology
(EDBT), Avignon, France, March 1996, (http://rakesh.agrawal-
family.com/papers/edbt96seq.pdf)
[Liu 11] Bing Liu, 2011. Web Data Mining, Exploring Hyperlinks,
Contents, and Usage Data, Second Edition, Springer, chapter 2.
[Ullman 03-09] Jeffrey Ullman, Data Mining Lecture Notes, 2003-2009,
web page: http://infolab.stanford.edu/~ullman/mining/mining.html
References
[Wikipedia] Wikipedia, the free encyclopedia, en.wikipedia.org
[Whitehorn 06] Mark Whitehorn, The parable of the beer and diapers,
web page: http://www.theregister.co.uk/2006/08/15/beer_diapers/
[Silverstein et al. 00] Silverstein, C., Brin, S., Motwani, R., Ullman, J. D.
2000. Scalable techniques for mining causal structures. Data Mining
Knowl. Discov. 4, 2–3, 163–192., www.vldb.org/conf/1998/p594.pdf
[Verhein 08] Florian Verhein, Frequent Pattern Growth (FP-Growth)
Algorithm, An Introduction, 2008,
http://www.florian.verhein.com/teaching/2008-01-09/fp-growth-
presentation_v1%20(handout).pdf
[Pietracaprina, Zandolin 03] Andrea Pietracaprina and Dario Zandolin:
Mining Frequent Itemsets using Patricia Tries,
[Zhao, Bhowmick 03] Qiankun Zhao, Sourav S. Bhowmick, Sequential
Pattern Mining: A Survey, Technical Report, CAIS, Nanyang
Technological University, Singapore, No. 2003118 , 2003,
(http://cs.nju.edu.cn/zhouzh/zhouzh.files/course/dm/reading/reading04/
zhao_techrep03.pdf)