Association Rules and Sequential Patterns
Prof.dr.ing. Florin Radulescu
Universitatea Politehnica din Bucureşti
Objectives
Association rule learning was introduced in the article “Mining Association Rules Between Sets of Items in Large Databases” by Agrawal, Imielinski and Swami (see references), presented at the 1993 SIGMOD Conference (ACM SIGMOD stands for ACM Special Interest Group on Management of Data).
One of the best known applications is finding relationships (rules) between products as recorded by POS systems in supermarkets.
For example, the statement that 85% of the baskets that contain bread also contain mineral water is a rule with bread as antecedent and mineral water as consequent.
Objectives
The original article (Agrawal et al.) lists some
examples of the expected results:
Find all rules that have “Diet Coke” as
consequent,
Find all rules that have “bagels” in the
antecedent,
Find all rules that have “sausage” in the
antecedent and “mustard” in the consequent,
Find all the rules relating items located on
shelves A and B in the store,
Find the “best” k rules (considering rule support)
that have “bagels” in the consequent
Frequent itemsets and rules
Items and transactions
Let I = {i1, i2, …, in} be a set of items. For example, items may
be all products sold in a supermarket or all words contained in
some documents.
A transaction t is a set of items, with t ⊆ I. Examples of
transactions are market baskets containing products or
documents containing words.
A transaction dataset (or database) T is a set of transactions, T = {t1, t2, …, tm}. Each transaction may contain a different number of items and the dataset may be stored in a DBMS-managed database or in a text file.
An itemset S is a subset of I. If v = |S| is the number of items in S (the cardinality of S), then S is called a v-itemset.
Items and transactions
The support of an itemset X, sup(X), is the number (or proportion) of transactions in T containing X. The support may be given as an absolute value (number of transactions), as a proportion, or as a percentage.
In many cases we want to find the itemsets with a support greater than or equal to a given threshold s. Such an itemset is called a frequent itemset and, in the market basket example, it contains items that can be found together in many baskets (where the measure of ‘many’ is s).
These frequent itemsets are the source of all ‘powerful’ association rules.
Example 1
Let us consider a dataset containing market basket
transactions:
I = {laptop, mouse, tablet, hard-drive, monitor,
keyboard, DVD-drive, CD-drive, flash-memory, . . .}
T = {t1, t2, …, tm}
t1 = {laptop, mouse, tablet}
t2 = {hard-drive, monitor, laptop, keyboard, DVD-drive}
. . .
tm = {keyboard, mouse, tablet}
Example 2
If items are words and transactions are documents,
where each document is considered a bag of words,
then we can have:
T = {Doc1, Doc2, …, Doc6}
Doc1 = {rule, tree, classification}
Doc2 = {relation, tuple, join, algebra, recommendation}
Doc3 = {variable, loop, procedure, rule}
Doc4 = {clustering, rule, tree, recommendation}
Doc5 = {join, relation, selection, projection,
classification}
Doc6 = {rule, tree, recommendation}
Example 2
In that case:
sup({rule, tree}) = 3 or 50% or 0.5
sup({relation, join}) = 2 or 33.33% or 1/3
If the threshold is s = 50% (or 0.5) then {rule, tree} is
frequent and {relation, join} is not.
Association rules
If m is the number of transactions in T then:
sup(X → Y) = sup(X ∪ Y) - as absolute value or
sup(X → Y) = sup(X ∪ Y) / m - as proportion
conf(X → Y) = sup(X ∪ Y) / sup(X)
where the support of an itemset is given as absolute value
(number of transactions).
[Diagram: within the set T of m transactions, the transactions containing X ∪ Y are a subset of the transactions containing X.]
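As an illustration, a minimal Python sketch of these two measures, applied to the documents of Example 2 (function and variable names are illustrative):

# Sketch: support and confidence computed from a list of transactions,
# where each transaction is a set of items.
def support(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def rule_measures(X, Y, transactions):
    """Return (support, confidence) of the rule X -> Y, as proportions."""
    m = len(transactions)
    sup_xy = support(X | Y, transactions)
    sup_x = support(X, transactions)
    return sup_xy / m, (sup_xy / sup_x if sup_x else 0.0)

# The six documents of Example 2:
docs = [
    {"rule", "tree", "classification"},
    {"relation", "tuple", "join", "algebra", "recommendation"},
    {"variable", "loop", "procedure", "rule"},
    {"clustering", "rule", "tree", "recommendation"},
    {"join", "relation", "selection", "projection", "classification"},
    {"rule", "tree", "recommendation"},
]
print(rule_measures({"rule"}, {"tree"}, docs))   # (0.5, 0.75)
print(rule_measures({"tree"}, {"rule"}, docs))   # (0.5, 1.0)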
Example 3
Consider the set of six transactions in Example 2:
Doc1 = {rule, tree, classification}
Doc2 = {relation, tuple, join, algebra,
recommendation}
Doc3 = {variable, loop, procedure, rule}
Doc4 = {clustering, rule, tree, recommendation}
Doc5 = {join, relation, selection, projection,
classification}
Doc6 = {rule, tree, recommendation}
Example 3
With a minimum support of 50% we find that {rule, tree} is a frequent itemset. The two rules derived from this itemset have the same support but different confidences:
rule → tree
with sup = 50% and conf = 3 / 4 = 75%, and
tree → rule
with sup = 50% and conf = 3 / 3 = 100%.
If the minimum confidence required is 80% then only the second rule is kept, the first being considered not powerful enough.
Finding association rules
Every rule X → Y with support of at least minsup comes from a frequent itemset X ∪ Y, so the process of finding all the rules given the minimum support and the minimum confidence has three steps:
Step 1. Find all frequent itemsets containing at least
two items, considering the given minimum support
minsup.
Step 2. For each frequent itemset U found in step 1
list all splits (X, Y) with X∩Y=∅ and X∪Y=U. Each
split generates a rule X → Y.
Step 3. Compute the confidence of each rule. Keep
only the rules with confidence at least minconf.
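A possible Python sketch of Steps 2 and 3, assuming the frequent itemsets and their absolute supports are already available from Step 1 (names are illustrative):

# Sketch: generating rules from frequent itemsets.
# `frequent` maps every frequent itemset (frozenset), including the
# 1-itemsets, to its absolute support; minconf is a proportion.
from itertools import combinations

def rules_from_itemsets(frequent, minconf):
    rules = []
    for U, sup_u in frequent.items():
        if len(U) < 2:
            continue
        # every non-empty proper subset X of U gives a split (X, U \ X)
        for r in range(1, len(U)):
            for X in map(frozenset, combinations(U, r)):
                Y = U - X
                conf = sup_u / frequent[X]      # sup(X ∪ Y) / sup(X)
                if conf >= minconf:
                    rules.append((X, Y, conf))
    return rules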
Goals for mining transactions
Goal 1: Find frequent itemsets. Frequent
itemsets can be used not only to find rules but also
for marketing purposes.
As an example, in a supermarket, frequent itemsets help marketers place items in an effort to control the way customers walk through the store:
Items that are sold together may be placed in distant corners of the store, so that customers must go from one product to the other, possibly putting other products in the basket on the way.
Goal 2
Goal 2: Find association rules. The classical Diapers → Beer example on the next slides shows that such rules can be unexpected and valuable.
Diapers → Beer
In [Whitehorn 06] this example is described as follows:
“Some time ago, Wal-Mart decided to combine the
data from its loyalty card system with that from its
point of sale systems.
The former provided Wal-Mart with demographic data
about its customers, the latter told it where, when and
what those customers bought.
Once combined, the data was mined extensively and
many correlations appeared.
Some of these were obvious; people who buy gin are
also likely to buy tonic. They often also buy lemons.
However, one correlation stood out like a sore thumb
because it was so unexpected.
Diapers → Beer
On Friday afternoons, young American males who
buy diapers (nappies) also have a predisposition
to buy beer.
No one had predicted that result, so no one would
ever have even asked the question in the first place.
Hence, this is an excellent example of the difference
between data mining and querying.”
Goal 3
In [Ullman 03-09] a third goal for mining transactions is also listed:
Goal 3: Find causalities. In the case of the rule Diapers → Beer, a natural question is whether the left part of the rule (buying diapers) causes the right part (buying beer).
Causal rules can be used in marketing: a low price for diapers will attract diaper buyers, and an increase in the beer price will then increase the overall sales figures.
Algorithms
There are many algorithms for finding frequent
itemsets and consequently the association rules
in a dataset.
All these algorithms are designed for huge volumes of data, meaning that the dataset is too large to be loaded and processed in main memory.
For that reason, minimizing the number of times the data are read from disk becomes a key feature of each algorithm.
Apriori algorithm
Apriori principle
The Apriori principle states that any subset of a
frequent itemset is also a frequent itemset.
Example 4: If {1, 2, 3, 4} is a frequent itemset then all
its four subsets with 3 values are also frequent: {1,
2, 3}, {1, 2, 4}, {1, 3, 4} and {2, 3, 4}.
Consequently, each frequent v-itemset is the union of v frequent (v-1)-itemsets.
That means we can determine the frequent itemsets of size v by examining only the set of all frequent itemsets of size (v-1).
Apriori principle
It is a level-wise approach where each step
requires a full scan of the dataset (residing on
disk).
A diagram is presented in the next slide where Ci
is the set of candidates for frequent i-itemsets
and Li is the actual set of frequent i-itemsets.
C1 is the set of all items found in transactions (a subset of I) and may be obtained either as the union of all transactions in T or by taking C1 = I (in which case some items may have zero support).
Using Apriori Principle
[Diagram: C1 → L1 → C2 → L2 → C3 → L3 → … → Ck → Lk. Each candidate set Ci is counted against the dataset (one full scan) to obtain Li, and Li is then used to generate Ci+1.]
Apriori Algorithm
The algorithm is described in [Agrawal, Srikant 94] and
uses the level-wise approach described before:
A first scan of the dataset produces L1 (the set of frequent items). For each transaction t in T and for each item a in t, the count of a is incremented (a.count++). At the end of the scan, L1 contains all items with a count of at least minsup (given as an absolute value).
For k=2, 3, … the process continues by generating the
set of candidates Ck and then counting the support of
each candidate by a full scan of the dataset.
Process ends when Lk is empty.
Apriori Algorithm
Algorithm Apriori (T)
L1 = scan (T);
for (k = 2; Lk-1 ≠ ∅; k++) do
Ck ← apriori-gen(Lk-1);
for each transaction t ∈ T do
for each candidate c ∈ Ck do
if c is contained in t then
c.count++;
end
end
Lk ← {c ∈ Ck | c.count ≥ minsup}
end
return L = ∪ Lk;
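A direct Python rendering of this pseudocode (a sketch, not the original implementation); apriori_gen is the candidate-generation step described on the next slides, where a possible version is also sketched:

# Sketch: the level-wise Apriori loop; transactions is a list of item sets
# and minsup is an absolute count.
def apriori(transactions, minsup):
    # first scan: count single items and keep the frequent ones
    counts = {}
    for t in transactions:
        for a in t:
            counts[a] = counts.get(a, 0) + 1
    L = [{frozenset([a]) for a, c in counts.items() if c >= minsup}]
    while L[-1]:
        Ck = apriori_gen(L[-1])                 # join + prune
        counts = {c: 0 for c in Ck}
        for t in transactions:                  # one full scan per level
            for c in Ck:
                if c <= t:
                    counts[c] += 1
        L.append({c for c, n in counts.items() if n >= minsup})
    return set().union(*L)                      # L = union of all Lk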
Candidate generation
Candidate generation is also described in the original algorithm as having two steps: the join step and the prune step. The former builds a larger set of candidates and the latter removes those that cannot be frequent.
In the join step each candidate is obtained from
two different frequent (k-1)-itemsets containing
(k-2) identical items:
Ck = { {i1, …, ik-2, ik-1, i'k-1} | ∃ p = {i1, …, ik-2, ik-1} ∈ Lk-1, ∃ q = {i1, …, ik-2, i'k-1} ∈ Lk-1, ik-1 < i'k-1 }
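A possible Python sketch of apriori-gen (the join step above followed by the prune step); on Example 6 below it generates {3, 4, 5} in the join and removes it in the prune:

# Sketch: candidate generation from the frequent (k-1)-itemsets.
# Lk_1 is a set of frozensets of comparable items (e.g. numbers or strings).
from itertools import combinations

def apriori_gen(Lk_1):
    prev = sorted(sorted(s) for s in Lk_1)       # itemsets as sorted lists
    candidates = set()
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            p, q = prev[i], prev[j]
            if p[:-1] == q[:-1] and p[-1] < q[-1]:          # join step
                c = frozenset(p) | {q[-1]}
                # prune step: every (k-1)-subset of c must be frequent
                if all(frozenset(s) in Lk_1
                       for s in combinations(c, len(c) - 1)):
                    candidates.add(c)
    return candidates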
Candidate generation
In the prune step a candidate is removed if at least one of its (k-1)-item subsets is not in Lk-1: by the Apriori principle such a candidate cannot be frequent.
Example 5
Consider again the set of six transactions in
Example 2:
Doc1 = {rule, tree, classification}
Doc2 = {relation, tuple, join, algebra,
recommendation}
Doc3 = {variable, loop, procedure, rule}
Doc4 = {clustering, rule, tree, recommendation}
Doc5 = {join, relation, selection, projection,
classification}
Doc6 = {rule, tree, recommendation}
and a minimum support of 50% (minsup=3).
Step 1
The first scan counts each item: rule appears in 4 documents, tree in 3, recommendation in 3, and every other item in at most 2 documents. With minsup = 3, L1 = {rule, tree, recommendation}.
Step 2
Considering the order rule < tree < recommendation, the join step gives
C2 = { {rule, tree}, {rule, recommendation}, {tree, recommendation} }.
The prune step does not modify C2.
The second scan of the transaction dataset gives the following pair supports:
Step 2
{rule, tree} 3
{rule, recommendation} 2
{tree, recommendation} 2
With minsup = 3, only {rule, tree} is frequent, so L2 = { {rule, tree} }.
Example 6
Consider the transaction dataset {(1, 2, 3, 5), (2, 3, 4), (3, 4, 5)} and a minsup of s = 50%. As an absolute value this is 3/2 = 1.5 transactions; because a support count must be an integer, s is rounded up to 2.
C1 = {1, 2, 3, 4, 5}
L1 = {2, 3, 4, 5}
C2 = {(2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5) }
L2 = {(2, 3), (3, 4), (3, 5)}
After the join step C3 = {(3, 4, 5)} - obtained by
joining (3, 4) and (3, 5).
In the prune step (3, 4, 5) is removed because
its subset (4, 5) is not in L2
Example 6
After the prune step C3 = ∅, so L3 = ∅, and the
process stops. L = L1 ∪ L2 = {(2), (3), (4), (5), (2,
3), (3, 4), (3, 5)} or as maximal itemsets L = L2.
The rules generated from itemsets are:
2 → 3, 3 → 2, 3 → 4, 4 → 3, 3 → 5, 5 → 3.
The support of these rules is at least 50%. Considering a minconf equal to 80%, only 3 rules have a confidence greater than or equal to minconf: 2 → 3, 4 → 3 and 5 → 3 (with conf = 2/2 = 100%).
The rules having 3 in the antecedent have a confidence of only 67% (because sup(3) = 3 and sup(2, 3) = sup(3, 4) = sup(3, 5) = 2).
FP-Growth
Build the FP-tree
Pass 1: scan the dataset and count the support of each item; items below minsup are discarded and the remaining (frequent) items are sorted in decreasing order of support.
Pass 2: each transaction is processed keeping only its frequent items, listed in decreasing order of support, and is inserted as a branch of the tree starting from the root (labeled null).
Build the FP-tree
Pass 2 - cont.:
Each node has a counter.
If two transactions have the same prefix, the two branches overlap on the nodes of the common prefix and the counters of those nodes are incremented.
Also, all nodes holding the same item are linked together (the dotted, ‘orthogonal’ paths in the figures).
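A minimal sketch of this insertion step (illustrative class and function names); running it on the filtered transactions of Example 7 below reproduces the node counts shown on the following slides:

# Sketch: inserting transactions into an FP-tree. Each transaction is
# assumed to be already reduced to its frequent items, sorted by
# decreasing support, as described above.
class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}                       # item -> FPNode

def insert_transaction(root, items, node_links):
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = FPNode(item, node)
            node.children[item] = child
            node_links.setdefault(item, []).append(child)   # dotted paths
        child.count += 1                         # shared prefixes add counts
        node = child

root, node_links = FPNode(None, None), {}
for t in [["a", "b", "c"], ["a", "b", "c", "d"], ["a", "b"],
          ["a", "b", "d"], ["c", "d"]]:
    insert_transaction(root, t, node_links)
# root.children["a"].count == 4 and root.children["a"].children["b"].count == 4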
Example 7
TID Items
1 a, b, c
2 a, b, c, d
3 a, b, f
4 a, b, d
5 c, d, e
Example 7
Item counts after the first pass:
a 4, b 4, c 3, d 3, e 1, f 1
With minsup = 2, items e and f are not frequent and are removed from the transactions.
Example 7
After removing the infrequent items, the transactions become:
TID Items
1 a, b, c
2 a, b, c, d
3 a, b
4 a, b, d
5 c, d
Example 7
[FP-tree for Example 7: root (null) → a:4 → b:4; b:4 branches to c:2 (followed by d:1) and to d:1; a separate branch root → c:1 → d:1. The header table lists a:4, b:4, c:3, d:3, each item linked to its nodes in the tree.]
Extract frequent itemsets
• After building the FP-tree the algorithm starts to build
partial trees (called conditional FP-trees) ending with a
given item (a suffix).
• The item is not present in the tree but all frequent
itemsets generated from that conditional tree will contain
that item.
• In building the conditional FP-tree, non-frequent items
are skipped (but the branch remains if there are still
nodes on it).
• In the previous example trees ending with d, c, b and a
may be built.
• For each tree, the dotted paths (node links) give the starting points and the algorithm walks up toward the root, propagating the counters (summing them where two branches reach the same node).
d conditional FP-tree
[d-conditional FP-tree: root → a:2 → b:2 → c:1, plus a separate branch root → c:1. Header table: a 2, b 2, c 2.]
c conditional FP-tree
[c-conditional FP-tree: root → a:2 → b:2. Header table: a 2, b 2.]
b conditional FP-tree
[b-conditional FP-tree: root → a:4. Header table: a 4.]
Results
If there is more than one item with support greater than or equal to minsup in a conditional FP-tree, then the algorithm is run again on that conditional tree in order to find itemsets with more than two items.
For example, if the minsup=2 then from the c conditional
FP-tree the algorithm will produce {a, c} and {b, c}. Then
the same procedure may be run against this tree for
suffix bc, obtaining {a, b, c}. Also from the d conditional
FP-tree first {c, d}, {b, d} and {a, d} are obtained, and
then, for the suffix bd, {a, b, d} is obtained.
Suffix cd leads only to infrequent items in its conditional tree, and suffix ad produces {a, d}, which was already obtained.
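For quick experiments, ready-made implementations exist; a usage sketch with the mlxtend library (an assumption: mlxtend and pandas are installed, and the API is the one documented for its frequent_patterns module), applied to the transactions of Example 7:

# Sketch: FP-Growth via mlxtend on the Example 7 transactions.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

transactions = [["a", "b", "c"], ["a", "b", "c", "d"], ["a", "b", "f"],
                ["a", "b", "d"], ["c", "d", "e"]]
te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions),
                  columns=te.columns_)
# min_support is a proportion: minsup = 2 of 5 transactions = 0.4
print(fpgrowth(df, min_support=0.4, use_colnames=True))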
Data formats
Table format
In this case a dataset is stored in a two-column table:
Transactions(Transaction-ID, Item) or
T(TID, Item)
where all the lines of a transaction have the same
TID and the primary key contains both columns
(so T does not contain duplicate rows).
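A small sketch of turning this two-column representation into the in-memory transaction sets used by the algorithms above (the rows shown are illustrative):

# Sketch: grouping (TID, Item) rows into transactions.
# `rows` could come from any database cursor; here it is a plain list.
rows = [(1, "laptop"), (1, "mouse"), (1, "tablet"),
        (2, "hard-drive"), (2, "monitor"), (2, "laptop")]

transactions = {}
for tid, item in rows:
    transactions.setdefault(tid, set()).add(item)
# transactions[1] == {"laptop", "mouse", "tablet"}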
Data formats
Text file format
In this case the dataset is a text file containing one transaction per line. Each line may start with a transaction ID (TID) as its first element, or the TID may be missing, the line number acting as a virtual TID.
Example 8:
10 12 34 67 78 45 89 23 67 90 → line 1
789 12 45 678 34 56 32 → line 2
........
Also in this case, any software package must either have a native text-file input option or must contain a conversion module from text to the needed format.
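A possible reader for this format (an illustrative helper, assuming whitespace-separated items and, optionally, a leading TID as in Example 8):

# Sketch: reading one transaction per line from a text file.
def read_transactions(path, has_tid=True):
    transactions = []
    with open(path) as f:
        for line in f:
            items = line.split()
            if not items:
                continue                 # skip empty lines
            if has_tid:
                items = items[1:]        # drop the leading TID
            transactions.append(set(items))
    return transactions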
Data formats
Custom format
Many data mining packages use a custom format for the input
data.
An example is the ARFF format used by Weka, presented below. Weka (Waikato Environment for Knowledge Analysis) is a popular open-source suite of machine learning software developed at the University of Waikato, New Zealand.
ARFF stands for Attribute-Relation File Format. An .arff file is an ASCII file containing a table (also called a relation). The file has two parts:
A Header part containing the relation name, the list of
attributes and their types.
A Data part containing the row values of the relation,
comma separated.
ARFF example
% 1. Title: Iris Plants Database
%
% 2. Sources:
% (a) Creator: R.A. Fisher
% (b) Donor: Michael Marshall (MARSHALL%[email protected])
% (c) Date: July, 1988
%
@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
Class association rules (CARs)
As for classical association rules, the model for CARs
considers a set of items I = {i1, i2, …, in} and a set of
transactions T = {t1, t2, …, tm}. The difference is that
each transaction is labeled with a class label c, where
c ∈ C, with C containing all class labels and C ∩ I = ∅.
A class association rule is a construction with the
following syntax:
X→y
where X ⊆ I and y ∈ C.
The definitions of support and confidence for a class association rule are the same as for association rules.
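A minimal sketch of these measures for a CAR (illustrative names); the printed values match Example 10 on the next slides:

# Sketch: support and confidence of a CAR X -> y over labeled transactions,
# each represented as a pair (set of items, class label).
def car_measures(X, y, labeled_transactions):
    m = len(labeled_transactions)
    sup_x = sum(1 for items, _ in labeled_transactions if X <= items)
    sup_xy = sum(1 for items, c in labeled_transactions if X <= items and c == y)
    return sup_xy / m, (sup_xy / sup_x if sup_x else 0.0)

docs = [({"rule", "tree", "classification"}, "datamining"),
        ({"relation", "tuple", "join", "algebra", "recommendation"}, "database"),
        ({"variable", "loop", "procedure", "rule"}, "programming"),
        ({"clustering", "rule", "tree", "recommendation"}, "datamining"),
        ({"join", "relation", "selection", "projection", "classification"}, "database"),
        ({"rule", "tree", "recommendation"}, "datamining")]
print(car_measures({"rule"}, "datamining", docs))          # (0.5, 0.75)
print(car_measures({"recommendation"}, "database", docs))  # (0.166..., 0.333...)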
Example 10
Consider the set of six transactions in Example 2, now
labeled with class labels from C = {database,
datamining, programming}:
Doc1 {rule, tree, classification} datamining
Doc2 {relation, tuple, join, algebra, recommendation} database
Doc3 {variable, loop, procedure, rule} programming
Doc4 {clustering, rule, tree, recommendation} datamining
Doc5 {join, relation, selection, projection, classification} database
Doc6 {rule, tree, recommendation} datamining
Example 10
Then the CARs
rule → datamining
recommendation → database
have:
sup(rule → datamining) = 3/6 = 50%,
conf(rule → datamining) = 3/4 = 75%.
sup(recommendation → database) = 1/6 ≅ 17%,
conf(recommendation → database) = 1/3 ≅ 33%
For a minsup=50% and a minconf=50% the first rule
stands and the second is rejected.
Mining CARs
Algorithm for mining CARs using a modified Apriori
algorithm (see [Liu 11] ):
In the first pass over the data, the algorithm computes F1, the set of CARs with a single item on the left side that satisfy the given minsup and minconf.
At step k, Ck is built from Fk-1 and then, by passing through the data and counting the support and the confidence of each member of Ck, Fk is determined.
Candidate generation is almost the same as for association rules, with the only difference that in the join step only CARs with the same class on the right side are joined.
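A sketch of the class-restricted join (an illustrative representation: each ruleitem is a pair of an itemset and a class label); the prune step can then be applied exactly as in apriori-gen:

# Sketch: joining two frequent (k-1)-ruleitems only when they share their
# first k-2 items and predict the same class.
def car_join(Fk_1):
    ruleitems = sorted((sorted(items), y) for items, y in Fk_1)
    candidates = set()
    for i in range(len(ruleitems)):
        for j in range(i + 1, len(ruleitems)):
            (p, yp), (q, yq) = ruleitems[i], ruleitems[j]
            if yp == yq and p[:-1] == q[:-1] and p[-1] < q[-1]:
                candidates.add((frozenset(p) | {q[-1]}, yp))
    return candidates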
Sequential patterns model
Itemset: a set of n distinct items
I = {i1, i2, …, in }
Event: a non-empty collection of items; we can
assume that items are in a given (e.g.
lexicographic) order: (i1,i2 … ik)
Sequence: an ordered list of events: <e1 e2 … em>
Length of a sequence: the number of items in
the sequence
Example: <AM, CDE, AE> has length 7
Sequential patterns model
Size of a sequence: the number of itemsets in the
sequence
Example: <AM, CDE, AE> has size 3
k-sequence: a sequence with k items, i.e. with length k
Example: <B, AC> is a 3-sequence
Subsequence and supersequence: <e1 e2 …eu>
is a subsequence of or included in <f1 f2 …fv> (and
the last is a supersequence of the first sequence
or contains that sequence) if there are some
integers 1 ≤ j1 < j2 < … < ju-1 < ju ≤ v such that e1
⊆ fj1 and e2 ⊆ fj2 and … and eu ⊆ fju.
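A small sketch of this inclusion test, representing a sequence as a list of events and an event as a set of items; the two calls reproduce the first statements of Example 11 below:

# Sketch: is sequence e a subsequence of sequence f?
def is_subsequence(e, f):
    j = 0
    for event in e:
        # find the next event of f that contains the current event of e
        while j < len(f) and not (event <= f[j]):
            j += 1
        if j == len(f):
            return False
        j += 1
    return True

print(is_subsequence([{"A"}, {"B", "C"}],
                     [{"A", "B"}, {"E"}, {"A", "B", "C", "D"}]))   # True
print(is_subsequence([{"A", "B"}, {"C"}], [{"A", "B", "C"}]))      # False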
Sequential patterns model
Sequence database X: a set of sequences
Frequent sequence (or sequential pattern): a sequence included in at least s members of the sequence database X, where s is the user-specified minimum support.
The number of sequences from X containing a given sequence is called the support of that sequence.
So, a frequent sequence is a sequence with a support of at least s, where s is the minsup specified by the user.
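Using the previous sketch, the support of a sequence can be counted directly; the database below is the one of Example 11 on the next slides:

# Sketch: support of a sequence = number of database sequences containing it.
def sequence_support(seq, database):
    return sum(1 for s in database if is_subsequence(seq, s))

db = [[{"A"}, {"B"}, {"C"}],
      [{"A", "B"}, {"C"}, {"A", "D"}],
      [{"A", "B", "C"}, {"B", "C", "E"}],
      [{"B"}, {"E"}]]
print(sequence_support([{"B"}, {"C"}], db))    # 3: contained in sequences 1, 2, 3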
Example 11
<A, BC> is a subsequence of <AB, E, ABCD>
<AB, C> is not a subsequence of <ABC>
Consider a minsup=50% and the following sequence
database:
Sequence ID Sequence
1 <A, B, C>
2 <AB, C, AD>
3 <ABC, BCE>
5 <B, E>
Algorithms
Apriori
GSP (Generalized Sequential Pattern)
FreeSpan (Frequent pattern-projected
Sequential pattern mining)
PrefixSpan (Prefix-projected Sequential
pattern mining)
SPADE (Sequential PAttern Discovery
using Equivalence classes)
GSP Algorithm
GSP (Generalized Sequential Pattern, [Srikant, Agrawal 96]) follows the same level-wise, generate-and-test approach as Apriori: the frequent 1-sequences are found first, then candidate k-sequences are generated from the frequent (k-1)-sequences (a join followed by a prune based on the Apriori principle) and their supports are counted with a pass over the sequence database; the process stops when no new frequent sequences are found.
Summary
This third course presented:
What are frequent itemsets and rules and their
relationship
Apriori and FP-growth algorithms for discovering
frequent itemsets.
Data formats for discovering frequent itemsets
What are class association rules and how they can be mined
An introduction to sequential patterns and the GSP
algorithm
Next week: Supervised learning – part 1.
References
[Agrawal, Imielinski, Swami 93] R. Agrawal, T. Imielinski, A. Swami: "Mining Association Rules Between Sets of Items in Large Databases", SIGMOD Conference 1993: 207-216 (http://rakesh.agrawal-family.com/papers/sigmod93assoc.pdf)
[Agrawal, Srikant 94] Rakesh Agrawal and Ramakrishnan Srikant. Fast
algorithms for mining association rules in large databases. Proceedings
of the 20th International Conference on Very Large Data Bases, VLDB,
pages 487-499, Santiago, Chile, September 1994
(http://rakesh.agrawal-family.com/papers/vldb94apriori.pdf)
[Srikant, Agrawal 96] R. Srikant, R. Agrawal: "Mining Sequential
Patterns: Generalizations and Performance Improvements", to appear
in Proc. of the Fifth Int'l Conference on Extending Database Technology
(EDBT), Avignon, France, March 1996, (http://rakesh.agrawal-
family.com/papers/edbt96seq.pdf)
[Liu 11] Bing Liu, 2011. Web Data Mining, Exploring Hyperlinks,
Contents, and Usage Data, Second Edition, Springer, chapter 2.
[Ullman 03-09] Jeffrey Ullman, Data Mining Lecture Notes, 2003-2009,
web page: http://infolab.stanford.edu/~ullman/mining/mining.html
References
[Wikipedia] Wikipedia, the free encyclopedia, en.wikipedia.org
[Whitehorn 06] Mark Whitehorn, The parable of the beer and diapers,
web page: http://www.theregister.co.uk/2006/08/15/beer_diapers/
[Silverstein et al. 00] Silverstein, C., Brin, S., Motwani, R., Ullman, J. D.
2000. Scalable techniques for mining causal structures. Data Mining
Knowl. Discov. 4, 2–3, 163–192., www.vldb.org/conf/1998/p594.pdf
[Verhein 08] Florian Verhein, Frequent Pattern Growth (FP-Growth)
Algorithm, An Introduction, 2008,
http://www.florian.verhein.com/teaching/2008-01-09/fp-growth-
presentation_v1%20(handout).pdf
[Pietracaprina, Zandolin 03] Andrea Pietracaprina and Dario Zandolin:
Mining Frequent Itemsets using Patricia Tries,
[Zhao, Bhowmick 03] Qiankun Zhao, Sourav S. Bhowmick, Sequential
Pattern Mining: A Survey, Technical Report, CAIS, Nanyang
Technological University, Singapore, No. 2003118 , 2003,
(http://cs.nju.edu.cn/zhouzh/zhouzh.files/course/dm/reading/reading04/
zhao_techrep03.pdf)