
Knowledge Discovery in Databases: An Attribute-Oriented Approach

Jiawei Han†, Yandong Cai, and Nick Cercone†
School of Computing Science
Simon Fraser University
Burnaby, British Columbia, Canada V5A 1S6
{ han, cai, nick }@cs.sfu.ca

Abstract

Knowledge discovery in databases, or data mining, is an important issue in the development of data- and knowledge-base systems. An attribute-oriented induction method has been developed for knowledge discovery in databases. The method integrates a machine learning paradigm, especially learning-from-examples techniques, with set-oriented database operations and extracts generalized data from the actual data in databases. An attribute-oriented concept tree ascension technique is applied in generalization, which substantially reduces the computational complexity of database learning processes. Different kinds of knowledge rules, including characteristic rules, discrimination rules, quantitative rules, and data evolution regularities, can be discovered efficiently using the attribute-oriented approach. In addition to learning in relational databases, the approach can be applied to knowledge discovery in nested-relational and deductive databases. Learning can also be performed on databases containing noisy data and exceptional cases by using database statistics. Furthermore, the rules discovered can be used to query database knowledge, answer cooperative queries, and facilitate semantic query optimization. Based upon these principles, a prototype database learning system, DBLEARN, has been constructed for experimentation.

1. Introduction

Knowledge discovery is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data [7]. The growth in the size and number of existing databases far exceeds human abilities to analyze such data, thus creating both a need and an opportunity for extracting knowledge from databases. Recently, data mining has been ranked as one of the most promising research topics for the 1990s by both database and machine learning researchers [7, 20].

In our previous studies [1, 10], an attribute-oriented induction method was developed for knowledge discovery in relational databases. The method integrates a machine learning paradigm, especially learning-from-examples techniques, with database operations and extracts generalized data from the actual data in databases. A key to our approach is attribute-oriented concept tree ascension for generalization, which applies well-developed set-oriented database operations and substantially reduces the computational complexity of database learning processes.

In this paper, the attribute-oriented approach is developed further for learning different kinds of knowledge rules, including characteristic rules, discrimination rules, quantitative rules, and data evolution regularities. Moreover, in addition to learning from relational databases, the approach is extended to knowledge discovery in other kinds of databases, such as nested-relational and deductive databases. Learning can also be performed on databases containing noisy data and exceptional cases by using database statistics. Furthermore, the rules discovered can be used for querying database knowledge, cooperative query answering, and semantic query optimization.

The paper is organized as follows. Primitives for knowledge discovery in databases are introduced in Section 2. The principles of attribute-oriented induction are presented in Section 3. The discovery of different kinds of knowledge rules in relational systems is considered in Section 4. Extension of the method to extended-relational databases is discussed in Section 5. A comparison of our approach with other learning algorithms is contained in Section 6. The application of discovered rules to enhance system performance is discussed in Section 7, and we summarize our investigation in Section 8.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

† This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under research grants OPG0037230/OPG0043090 and by a research grant from the Centre for Systems Science of Simon Fraser University.

Proceedings of the 18th VLDB Conference, Vancouver, British Columbia, Canada, 1992.
2. Primitives for Knowledge Discovery in Databases

Three primitives should be provided for the specification of a learning task: task-relevant data, background knowledge, and the expected representations of the learning results. For illustrative purposes, we first examine relational databases; the results are generalized to other kinds of databases in later discussions.

2.1. Data relevant to the discovery process

A database usually stores a large amount of data, of which only a portion may be relevant to a specific learning task. For example, to characterize the features of graduate students in science, only the data relevant to science graduates are appropriate in the learning process. Relevant data may extend over several relations. A query can be used to collect the task-relevant data from the database.

Task-relevant data can be viewed as examples for the learning processes. Undoubtedly, learning-from-examples [9, 14] should be an important strategy for knowledge discovery in databases. Most learning-from-examples algorithms partition the set of examples into positive and negative sets and perform generalization using the positive data and specialization using the negative ones [14]. Unfortunately, a relational database does not explicitly store negative data, and thus no explicitly specified negative examples can be used for specialization. Therefore, a database induction process relies mainly on generalization, which should be performed cautiously to avoid over-generalization.

Many kinds of rules, such as characteristic rules, discrimination rules, and data evolution regularities, can be discovered by induction processes. A characteristic rule is an assertion which characterizes a concept satisfied by all or a majority of the examples in the class undergoing learning (called the target class). For example, the symptoms of a specific disease can be summarized by a characteristic rule. A discrimination rule is an assertion which discriminates a concept of the class being learned (the target class) from other classes (called contrasting classes). For example, to distinguish one disease from others, a discrimination rule should summarize the symptoms that discriminate this disease from others. Furthermore, a data evolution regularity represents the characteristics of the changed data if it is a characteristic rule, or the features which discriminate the current data instances from the previous ones if it is a discrimination rule. If a quantitative measurement is associated with a learned rule, the rule is called a quantitative rule.

In learning a characteristic rule, relevant data are collected into one class, the target class, for generalization. In learning a discrimination rule, it is necessary to collect data into two classes, the target class and the contrasting class(es). The data in the contrasting class(es) imply that such data cannot be used to distinguish the target class from the contrasting ones; that is, they are used to exclude the properties shared by both classes.

2.2. Background knowledge

Concept hierarchies represent the necessary background knowledge which controls the generalization process. Different levels of concepts are often organized into a taxonomy of concepts. The concept taxonomy can be partially ordered according to a general-to-specific ordering. The most general concept is the null description (described by a reserved word "ANY"), and the most specific concepts correspond to the specific values of attributes in the database [11]. Using a concept hierarchy, the rules learned can be represented in terms of generalized concepts and stated in a simple and explicit form, which is desirable to most users.

    { biology, chemistry, computing, ..., physics } ⊂ science
    { literature, music, ..., painting } ⊂ art
    { science, art } ⊂ ANY (major)
    { freshman, sophomore, junior, senior } ⊂ undergraduate
    { M.S., M.A., Ph.D. } ⊂ graduate
    { undergraduate, graduate } ⊂ ANY (status)
    { Burnaby, ..., Vancouver, Victoria } ⊂ British Columbia
    { Calgary, ..., Edmonton, Lethbridge } ⊂ Alberta
    { Hamilton, Toronto, ..., Waterloo } ⊂ Ontario
    { Bombay, ..., New Delhi } ⊂ India
    { Beijing, Nanjing, ..., Shanghai } ⊂ China
    { China, India, Germany, ..., Switzerland } ⊂ foreign
    { Alberta, British Columbia, ..., Ontario } ⊂ Canada
    { foreign, Canada } ⊂ ANY (place)
    { 0.0 - 1.99 } ⊂ poor
    { 2.0 - 2.99 } ⊂ average
    { 3.0 - 3.49 } ⊂ good
    { 3.5 - 4.0 } ⊂ excellent
    { poor, average, good, excellent } ⊂ ANY (grade)

Figure 1. A concept hierarchy table of the database.

Example 1. The concept hierarchy table of a typical university database is shown in Fig. 1, where A ⊂ B indicates that B is a generalization of A. A concept tree represents a taxonomy of concepts of the values in an attribute domain. A concept tree for status is shown in Fig. 2.
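To make the use of such hierarchies concrete, the following sketch (in Python; the names PARENT and generalize are our own illustration, not part of DBLEARN) shows one way a fragment of the Fig. 1 table could be encoded as a child-to-parent map, with one-level concept tree ascension as a single lookup:

    # A sketch of a fragment of the Fig. 1 hierarchy as a child-to-parent
    # map; one-level concept tree ascension is then a single lookup.
    PARENT = {
        "M.S.": "graduate", "M.A.": "graduate", "Ph.D.": "graduate",
        "freshman": "undergraduate", "sophomore": "undergraduate",
        "junior": "undergraduate", "senior": "undergraduate",
        "undergraduate": "ANY", "graduate": "ANY",
        "Vancouver": "British Columbia", "Burnaby": "British Columbia",
        "British Columbia": "Canada", "Ontario": "Canada",
    }

    def generalize(value):
        """Climb the concept tree one level; values without a higher
        level concept (e.g., a student name) are left unchanged."""
        return PARENT.get(value, value)

    assert generalize("M.S.") == "graduate"
    assert generalize("Vancouver") == "British Columbia"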
    ANY (status)
      ├─ undergraduate: freshman, sophomore, junior, senior
      └─ graduate: M.A., M.S., Ph.D.

Figure 2. A concept tree for status.

Concept hierarchies can be provided by knowledge engineers or domain experts. This is reasonable even for large databases, since a concept tree registers only the distinct discrete attribute values or the ranges of numerical values for an attribute, which are, in general, not very large and can be input by domain experts. Moreover, many concept hierarchies are actually stored in the database implicitly. For example, the information that "Vancouver is a city of British Columbia, which, in turn, is a province of Canada" is usually stored in the database if there are "city", "province" and "country" attributes. Such hierarchical relationships can be made explicit at the schema level by indicating "city ⊂ province ⊂ country". Then the taxonomy of all the cities stored in the database can be retrieved and used in the learning process.

Some concept hierarchies can be discovered automatically or semi-automatically. Numerical attributes can be organized as discrete hierarchical concepts, and the hierarchies can be constructed automatically based on database statistics. For example, for an attribute "GPA", an examination of the attribute value distribution in the database may disclose that GPA falls between 0 and 4 and that most GPAs of graduates are clustered between 3 and 4. One may classify 0 to 1.99 into one class and 2 to 2.99 into another, but give finer classifications to the values between 3 and 4. Even for attributes with discrete values, statistical techniques can be used under certain circumstances [6]. For example, if the birthplaces of most students are clustered in Canada and scattered over many different countries, the highest-level concepts of the attribute can be categorized into "Canada" and "foreign". Similarly, an available concept hierarchy can be modified based on database statistics. Moreover, the concept hierarchy of an attribute can also be discovered or refined based on its relationship with other attributes [6].

Different concept hierarchies can be constructed on the same attribute based on different viewpoints or preferences. For example, the birthplace could be organized according to administrative regions (provinces, countries, etc.), geographic regions (east-coast, west-coast, etc.), or the size of the city (metropolis, small city, town, countryside, etc.). Usually, a commonly referenced concept hierarchy is associated with an attribute as the default one, and other hierarchies can be chosen explicitly by users in the learning process.

2.3. Representation of learning results

From a logical point of view, each tuple in a relation is a logic formula in conjunctive normal form, and a data relation is characterized by a large set of disjunctions of such conjunctive forms. Thus, both the data for learning and the rules discovered can be represented in either relational form or first-order predicate calculus.

A relation which represents intermediate (or final) learning results is called an intermediate (or a final) generalized relation. In a generalized relation, some or all of its attribute values are generalized data, that is, nonleaf nodes in the concept hierarchies. Some learning-from-examples algorithms require the final learned rule to be in conjunctive normal form [14]. This requirement is usually unreasonable for large databases, since the generalized data often contain different cases. However, a rule containing a large number of disjuncts indicates that it is in a complex form and that further generalization should be performed. Therefore, the final generalized relation should be represented by either one tuple (a conjunctive rule) or a small number (usually 2 to 8) of tuples corresponding to a disjunctive rule with a small number of disjuncts. A system may allow a user to specify the preferred generalization threshold, a maximum number of disjuncts of the resulting formula. For example, if the threshold value is set to three, the final generalized rule will consist of at most three disjuncts.

The complexity of the rule can be controlled by the generalization threshold. A moderately large threshold may lead to a relatively complex rule with many disjuncts, and the results may not be fully generalized. A small threshold value leads to a simple rule with few disjuncts; however, a small threshold may also result in an overly generalized rule in which some valuable information is lost. A better method is to adjust the threshold values interactively within a reasonable range and to let domain experts and/or users select the best generalized rules.

Exceptional data often occur in a large relation, and it is important to consider exceptional cases when learning in databases. Statistical information helps learning-from-examples handle exceptions and/or noisy data [3, 15]. A special attribute, vote, can be added to each generalized relation to register the number of tuples in the original relation which are generalized to the current tuple in the generalized relation. The attribute vote carries database statistics and supports the pruning of scattered data and the generalization of the concepts which take a majority of the votes. The final generalized rule will be the rule which represents the characteristics of a majority of the facts in the database (called an approximate rule) or which indicates the quantitative measurement of each conjunct or disjunct in the rule (called a quantitative rule).
2.4. A database learning language

Given a number of examples, generalization can be performed in many different directions [5]. Unconstrained learning may result in a very large set of learned rules. Moreover, different rules can be extracted from the same set of data using different background knowledge (concept hierarchies). In order to constrain the generalization process and extract interesting rules from databases, learning should be directed by specific learning requests and background knowledge.

A database learning request should consist of (i) a database query which extracts the relevant set of data, (ii) the kind of rules to be learned, (iii) the specification of the target class and, possibly, the contrasting classes, depending on the rules to be learned, (iv) the preferred concept hierarchies, and (v) the preferred form in which to express the learning results. Notice that (iv) and (v) are optional, since default concept hierarchies and a default generalization threshold can be used if no preference is specified explicitly.

A database learning system, DBLEARN, has been proposed in our study [1]. The language of DBLEARN can be viewed as an extension of the relational language SQL for knowledge discovery in databases. Because of limited space, we present one short illustrative example of a learning request specified to DBLEARN.

Example 2. Our objective is to learn a discrimination rule which distinguishes Ph.D. students from M.S. students in science based upon the level of the science courses in which they assist. The learning involves both the Student and Course relations. The request is specified to DBLEARN as follows.

    in relation     Student S, Course C
    learn discrimination rule for S.Status = "Ph.D."
    in contrast to  S.Status = "M.S."
    where           S.Major = "science" and C.Dept = "science"
                    and C.TA = S.Name
    in relevance to C.Level

Notice that a database query is embedded in the learning request, and "science" is a piece of generalized data which can be found in the concept hierarchy table. The preferred concept hierarchy can be specified by "using hierarchy hierarchy_name", and the preferred generalization threshold can be specified by "using threshold: threshold_value". Since neither of them is specified explicitly in this learning request, default hierarchies and thresholds are used.

3. Principles for Attribute-Oriented Induction in Relational Databases

3.1. Basic strategies for attribute-oriented induction

A set of seven basic strategies is defined for performing attribute-oriented induction in relational databases; the strategies are illustrated using Example 3.

Example 3. Let Table 1 be the relation Student in a university database.

    Name     | Status    | Major        | Birth_Place | GPA
    ---------+-----------+--------------+-------------+----
    Anderson | M.A.      | history      | Vancouver   | 3.5
    Bach     | junior    | math         | Calgary     | 3.7
    Carlton  | junior    | liberal arts | Edmonton    | 2.6
    Fraser   | M.S.      | physics      | Ottawa      | 3.9
    Gupta    | Ph.D.     | math         | Bombay      | 3.3
    Hart     | sophomore | chemistry    | Richmond    | 2.7
    Jackson  | senior    | computing    | Victoria    | 3.5
    Liu      | Ph.D.     | biology      | Shanghai    | 3.4
    ...      | ...       | ...          | ...         | ...
    Meyer    | sophomore | music        | Burnaby     | 3.0
    Monk     | Ph.D.     | computing    | Victoria    | 3.8
    Wang     | M.S.      | statistics   | Nanjing     | 3.2
    Wise     | freshman  | literature   | Toronto     | 3.9

Table 1. A relation Student in a university database.

Suppose that the learning task is to learn characteristic rules for graduate students relevant to the attributes Name, Major, Birth_Place and GPA, using the default concept hierarchy presented in Fig. 1 and the default threshold value of 3. The learning task is represented in DBLEARN as follows.

    in relation     Student
    learn characteristic rule for Status = "graduate"
    in relevance to Name, Major, Birth_Place, GPA

    Name     | Major      | Birth_Place | GPA | vote
    ---------+------------+-------------+-----+-----
    Anderson | history    | Vancouver   | 3.5 |  1
    Fraser   | physics    | Ottawa      | 3.9 |  1
    Gupta    | math       | Bombay      | 3.3 |  1
    Liu      | biology    | Shanghai    | 3.4 |  1
    ...      | ...        | ...         | ... | ...
    Monk     | computing  | Victoria    | 3.8 |  1
    Wang     | statistics | Nanjing     | 3.2 |  1

Table 2. The initial data relation for induction.

For this learning request, preprocessing is performed by first selecting graduate students. Since "graduate" is a nonleaf node in the concept hierarchy on Status, the hierarchy table should be consulted to extract the set of the corresponding primitive data stored in the relation, which is {M.S., M.A., Ph.D.}. Then the data about graduates can be retrieved and projected on the relevant attributes Name, Major, Birth_Place, and GPA, which results in an initial data relation on which induction can be performed. Table 2 reflects the result of this preprocessing; a special attribute, vote, is attached to each tuple with its initial value set to 1. Such a preprocessed data relation is called an initial relation. Attribute-oriented induction is performed on the initial relation.
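This preprocessing can be pictured with a small sketch (our own illustration, not the DBLEARN implementation; the table and column names follow Example 3): the nonleaf concept "graduate" is expanded into its primitive values through the hierarchy table, and the resulting set drives an ordinary relational selection and projection:

    # Sketch of the preprocessing step for Example 3: expand a nonleaf
    # concept into its primitive (leaf) values, then build the query
    # that collects and projects the task-relevant data.
    PARENT = {"M.S.": "graduate", "M.A.": "graduate", "Ph.D.": "graduate"}

    def primitives(concept, parent):
        """Leaf values beneath `concept` in the concept tree."""
        children = [c for c, p in parent.items() if p == concept]
        if not children:
            return [concept]
        return [leaf for c in children for leaf in primitives(c, parent)]

    values = primitives("graduate", PARENT)   # ['M.S.', 'M.A.', 'Ph.D.']
    query = ("SELECT Name, Major, Birth_Place, GPA, 1 AS vote "
             "FROM Student WHERE Status IN (%s)"
             % ", ".join("'%s'" % v for v in values))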
Strategy 1. (Generalization on the smallest decomposable components) Generalization should be performed on the smallest decomposable components (or attributes) of a data relation.

Rationale. Generalization is a process of learning from positive examples. Generalization on the smallest decomposable components, rather than on composite attributes, ensures that the smallest possible chance is considered in the generalization, which enforces the least commitment principle (commitment to minimally generalized concepts) and avoids over-generalization.

We examine the task-relevant attributes in sequence. There is no higher level concept specified on the first attribute, Name. Thus, the attribute should be removed in generalization, which implies that the general properties of a graduate student cannot be characterized by the attribute Name. This notion is based on Strategy 2.

Strategy 2. (Attribute removal) If there is a large set of distinct values for an attribute but there is no higher level concept provided for the attribute, the attribute should be removed in the generalization process.

Rationale. This strategy corresponds to the generalization rule, dropping conditions, in learning-from-examples [14]. Since an attribute-value pair represents a conjunct in the logical form of a tuple, removal of a conjunct eliminates a constraint and thus generalizes the rule. If there is a large set of distinct values in an attribute but there is no higher level concept provided for it, the values cannot be generalized using higher level concepts, and thus the attribute should be removed.

The three remaining attributes Major, Birth_Place and GPA can be generalized by substituting subordinate concepts by their corresponding superordinate concepts. For example, "physics" can be substituted by "science", and "Vancouver" by "B.C.". Such substitutions are performed attribute by attribute, based on Strategy 3.

Strategy 3. (Concept tree ascension) If there exists a higher level concept in the concept tree for an attribute value of a tuple, the substitution of the value by its higher level concept generalizes the tuple. Minimal generalization should be enforced by ascending the tree one level at a time.

Rationale. This strategy corresponds to the generalization rule, climbing generalization trees, in learning-from-examples [14]. The substitution of an attribute value by its higher level concept makes the tuple cover more cases than the original one and thus generalizes the tuple. Ascending the concept tree one level at a time ensures that the generalization follows the least commitment principle and thus reduces the chance of over-generalization.

As a result of concept tree ascension, different tuples may generalize to an identical tuple, where two tuples are identical if they have the same corresponding attribute values without considering the special attribute vote. To incorporate quantitative information in the learning process, vote should be accumulated in the generalized relation when merging identical tuples.

Strategy 4. (Vote propagation) The value of the vote of a tuple should be carried to its generalized tuple, and the votes should be accumulated when merging identical tuples in generalization.

Rationale. Based on the definition of vote, the vote of each generalized tuple must register the number of tuples in the initial data relation generalized to the current one. Therefore, to keep the votes registered correctly, the vote of each tuple should be carried in the generalization process, and such votes should be accumulated when merging identical tuples.

By removing one attribute and generalizing the three remaining ones, the relation depicted in Table 2 is generalized to a new relation as illustrated in Table 3.

    Major   | Birth_Place | GPA       | vote
    --------+-------------+-----------+-----
    art     | B.C.        | excellent |  35
    science | Ontario     | excellent |  10
    science | B.C.        | excellent |  30
    science | India       | good      |  10
    science | China       | good      |  15

Table 3. A generalized relation.
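Strategies 3 and 4 together amount to a simple transformation of the relation. The following sketch (our own illustration; ascend_and_merge is not part of DBLEARN) substitutes one attribute's values by their parent concepts and merges identical tuples while summing their votes, shown here on two Table 3 tuples whose Birth_Place values are ascended one more level:

    from collections import Counter

    # Sketch of one attribute-oriented generalization pass (Strategies 3
    # and 4): ascend one attribute's values a single level in the concept
    # tree, then merge identical tuples, accumulating their votes.
    def ascend_and_merge(rows, i, parent):
        merged = Counter()
        for (*attrs, vote) in rows:
            attrs[i] = parent.get(attrs[i], attrs[i])
            merged[tuple(attrs)] += vote
        return [(*attrs, vote) for attrs, vote in merged.items()]

    PARENT = {"B.C.": "Canada", "Ontario": "Canada",
              "India": "foreign", "China": "foreign"}
    rows = [("science", "Ontario", "excellent", 10),
            ("science", "B.C.", "excellent", 30)]
    print(ascend_and_merge(rows, 1, PARENT))
    # -> [('science', 'Canada', 'excellent', 40)]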
To judge whether an attribute needs to be generalized further, we have Strategy 5.

Strategy 5. (Threshold control on each attribute) If the number of distinct values of an attribute in the target class is larger than the generalization threshold value, further generalization on this attribute should be performed.

Rationale. The generalization threshold controls and represents the maximum number of tuples of the target class in the final generalized relation. If one attribute contains more distinct values than the threshold, the number of distinct tuples in the generalized relation must be greater than the threshold value. Thus the values in the attribute should be generalized further.

After attribute-oriented ascension of the concept trees and the merging of identical tuples, the total number of tuples in a generalized relation may remain greater than the specified threshold. In this case, further generalization is required; Strategy 6 has been devised for this generalization.

Strategy 6. (Threshold control on generalized relations) If the number of tuples of a generalized relation in the target class is larger than the generalization threshold value, further generalization on the relation should be performed.

Rationale. Based upon the definition of the generalization threshold, further generalization should be performed if the number of tuples in a generalized relation is larger than the threshold value. By further generalization on selected attribute(s) and merging of identical tuples, the size of the generalized relation will be reduced. Generalization should continue until the number of remaining tuples is no greater than the threshold value.

At this stage, there are usually alternative choices when selecting a candidate attribute for further generalization. Criteria such as the preference for a larger reduction ratio in the number of tuples or in the number of distinct attribute values, the simplicity of the final learned rules, etc., can be used for the selection. Interesting rules can often be discovered by following different paths leading to several generalized relations for examination, comparison and selection, which corresponds to the way in which different people may learn differently from the same set of examples. The generalized relations can be examined by users or experts interactively to filter out trivial rules and preserve interesting ones [23].
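A minimal rendering of this threshold control (Strategies 5 and 6) appears below. It is a sketch under one of the simple selection criteria mentioned above, namely "generalize the attribute with the most distinct values"; the function name and structure are our own, not DBLEARN's:

    from collections import Counter

    # Sketch of threshold control: while the generalized relation has
    # more tuples than the threshold, ascend the attribute with the most
    # distinct values one level and re-merge identical tuples.
    def enforce_threshold(rows, parent, threshold):
        while len(rows) > threshold:
            n = len(rows[0]) - 1                   # last field is vote
            distinct = [len({r[i] for r in rows}) for i in range(n)]
            i = max(range(n), key=distinct.__getitem__)
            if distinct[i] <= 1:                   # fully generalized
                break
            merged = Counter()
            for (*attrs, vote) in rows:
                attrs[i] = parent.get(attrs[i], attrs[i])
                merged[tuple(attrs)] += vote
            new = [(*a, v) for a, v in merged.items()]
            if len(new) == len(rows) and \
               len({r[i] for r in new}) == distinct[i]:
                break                              # ascension made no progress
            rows = new
        return rows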
Table 3 represents a generalized relation consisting of five tuples. Further generalization is needed to reduce the number of tuples. Since the attribute "Birth_Place" contains four distinct values, generalization should be performed on it by ascending one level in the concept tree, which results in the relation shown in Table 4.

    Major   | Birth_Place | GPA       | vote
    --------+-------------+-----------+-----
    art     | Canada      | excellent |  35
    science | Canada      | excellent |  40
    science | foreign     | good      |  25

Table 4. Further generalization of the relation.

The final generalized relation consists of only a small number of tuples, and this generalized relation can be transformed into a simple logical formula. Based upon the principles of logic and databases [8, 22], we have evolved Strategy 7.

Strategy 7. (Rule transformation) A tuple in a final generalized relation is transformed to conjunctive normal form, and multiple tuples are transformed to disjunctive normal form.

Notice that simplification can be performed on Table 4 by unioning the first two tuples, if a set representation of an attribute is allowed. The relation so obtained is shown in Table 5.

    Major            | Birth_Place | GPA       | vote
    -----------------+-------------+-----------+-----
    { art, science } | Canada      | excellent |  75
    science          | foreign     | good      |  25

Table 5. Simplification of the generalized relation.

Suppose art and science cover all of the Major areas. Then {art, science} can be generalized to ANY and removed from the representation. Therefore, the final generalized relation is equivalent to rule (1); that is, a graduate is either (with 75% probability) a Canadian with an excellent GPA or (with 25% probability) a foreign student majoring in science with a good GPA. Notice that since a characteristic rule characterizes all of the data in the target class, its then-part represents the necessary condition of the class.

(1) ∀(x) graduate(x) → (Birth_Place(x) ∈ Canada /\ GPA(x) ∈ excellent) [75%] \/ (Major(x) ∈ science /\ Birth_Place(x) ∈ foreign /\ GPA(x) ∈ good) [25%].

Rule (1) is a quantitative rule. It can also be expressed in a qualitative form by dropping the quantitative measurement. Moreover, the learning result may also be expressed as an approximate rule by dropping the conditions or conclusions with negligible probabilities.
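Strategy 7, together with the vote statistics, can be rendered as the small formatting step below (a sketch; to_rule and the arrow notation are our own, following the form of rule (1); the percentages here are per tuple of Table 4, before the union into Table 5):

    # Sketch of Strategy 7 with quantitative measurements: each tuple of
    # the final generalized relation becomes one disjunct, tagged with
    # its share of the total vote, as in rule (1).
    def to_rule(rows, attrs, target="graduate(x)"):
        total = sum(vote for *_, vote in rows)
        disjuncts = []
        for (*vals, vote) in rows:
            conj = " /\\ ".join("%s(x) ∈ %s" % (a, v)
                                for a, v in zip(attrs, vals))
            disjuncts.append("(%s) [%d%%]" % (conj, round(100 * vote / total)))
        return "%s -> %s" % (target, " \\/ ".join(disjuncts))

    rows = [("art", "Canada", "excellent", 35),
            ("science", "Canada", "excellent", 40),
            ("science", "foreign", "good", 25)]
    print(to_rule(rows, ["Major", "Birth_Place", "GPA"]))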
3.2. Basic attribute-oriented induction algorithm

The basic idea of attribute-oriented induction is summarized in the following algorithm.

Algorithm 1. Basic attribute-oriented induction in relational databases.

Input: (i) A relational database, (ii) the learning task, (iii) the (optional) preferred concept hierarchies, and (iv) the (optional) preferred form in which to express the learning results (e.g., the generalization threshold).

Output: A characteristic rule learned from the database.

Method: Basic attribute-oriented induction consists of the following four steps:

Step 1. Collect the task-relevant data,

Step 2. Perform basic attribute-oriented induction,
Step 3. Simplify the generalized relation, and

Step 4. Transform the final relation into a logical rule.

Notice that Step 2 is performed as follows.

    begin {basic attribute-oriented induction}
        for each attribute Ai (1 ≤ i ≤ n, where n = # of attributes)
                in the generalized relation GR do
            while #_of_distinct_values_in_Ai > threshold do {
                if there is no higher level concept in the
                    concept hierarchy table for Ai
                then remove Ai
                else substitute the values of Ai by their
                    corresponding minimal generalized concepts;
                merge identical tuples
            }
        while #_of_tuples_in_GR > threshold do {
            selectively generalize attributes;
            merge identical tuples
        }
    end.
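For concreteness, the same step can be written out as the following sketch (ours, not the DBLEARN implementation), reusing the ascend_and_merge and enforce_threshold helpers sketched in Section 3.1:

    # Sketch of Step 2, mirroring the pseudocode above: attributes with
    # no higher level concepts are removed (Strategy 2); the others are
    # ascended and merged until each attribute, and then the relation as
    # a whole, satisfies the threshold (Strategies 3-6).
    def basic_induction(rows, attr_names, parent, threshold):
        for name in list(attr_names):
            i = attr_names.index(name)
            while len({r[i] for r in rows}) > threshold:
                if not any(r[i] in parent for r in rows):
                    attr_names.pop(i)              # attribute removal
                    rows = [r[:i] + r[i + 1:] for r in rows]
                    break
                rows = ascend_and_merge(rows, i, parent)
        return enforce_threshold(rows, parent, threshold)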
Theorem 1. Algorithm 1 correctly learns characteristic rules from relational databases.

Proof. In Step 1, the relevant set of data in the database is collected for induction. The then-part in the first while-loop of Step 2 incorporates Strategy 2 (attribute removal), and the else-part utilizes Strategy 3 (concept tree ascension). The condition for the first while-loop is based on Strategy 5 and that for the second one on Strategy 6 (threshold control). Strategy 1 is used in Step 2 to ensure that generalization is performed on the smallest decomposable components. Each generalization statement in both while-loops applies the least-commitment principle based on those strategies. Finally, Steps 3 and 4 apply logic transformations based on the correspondence between relational tuples and logical formulas. Thus the obtained rule should be the desired result, which summarizes the characteristics of the target class.

The basic attribute-oriented induction algorithm extracts a characteristic rule from an initial relation. Since the generalized rule covers all of the positive examples in the database, it forms a necessary condition of the learning concept; that is, the rule is in the form

    learning_class(x) → condition(x),

where condition(x) is a formula containing x. However, since data in other classes are not taken into consideration in the learning process, there could be data in other classes which also meet the specified condition. Thus, condition(x) is necessary but may not be sufficient for x to be in the learning class.

4. Learning Other Knowledge Rules by Attribute-Oriented Induction

The attribute-oriented induction method can also be applied to learning other knowledge rules, such as discrimination rules, data evolution regularities, etc.

4.1. Learning discrimination rules

Since a discrimination rule distinguishes the concepts of the target class from those of the contrasting classes, the generalized condition in the target class that overlaps the condition in the contrasting classes should be detected and removed from the description of the discrimination rule. Thus, a discrimination rule can be extracted by generalizing the data in both the target class and the contrasting class synchronously and excluding the properties that overlap in both classes from the final generalized rule.

To implement this notion, the basic attribute-oriented algorithm can be modified correspondingly for the discovery of discrimination rules. We illustrate the process with Example 4.

Example 4. Suppose a discrimination rule is to be extracted to distinguish graduate students from undergraduates in the relation Student (Table 1). Clearly, both the target class graduate and the contrasting class undergraduate are relevant to the learning process, and the data should be partitioned into two portions: graduate in contrast to undergraduate. Generalization can be performed synchronously in both classes by attribute removal and concept tree ascension. Suppose the relation is generalized to Table 6.

    Class      | Major   | Birth_Place | GPA       | vote | mark
    -----------+---------+-------------+-----------+------+-----
    graduate   | art     | B.C.        | excellent |  35  |
               | science | Ontario     | excellent |  10  |
               | science | B.C.        | excellent |  30  |  *
               | science | India       | good      |  10  |
               | science | China       | good      |  15  |
    -----------+---------+-------------+-----------+------+-----
    undergrad. | science | Alberta     | excellent |  15  |
               | art     | Alberta     | average   |  20  |
               | science | B.C.        | average   |  60  |
               | science | B.C.        | excellent |  35  |  *
               | art     | B.C.        | average   |  50  |
               | art     | Ontario     | excellent |  20  |

Table 6. A generalized relation.

As shown in Table 6, different classes may share tuples. The tuples shared by different classes are called overlapping tuples. Obviously, the third tuple of the class graduate and the fourth tuple of the class undergrad. are overlapping tuples, which indicates that a B.C.-born student majoring in science with an excellent GPA may or may not be a graduate student. In order to get an effective discrimination rule, care must be taken to handle the overlapping tuples. We utilize Strategy 8.
Strategy 8. (Handling overlapping tuples) If there are overlapping tuples in both the target and contrasting classes, these tuples should be marked and excluded from the final discrimination rule.

Rationale. Since the overlapping tuples represent the same assertions in both the target class and the contrasting class, the concept described by the overlapping tuples cannot be used to distinguish the target class from the contrasting class. By detecting and marking the overlapping tuples, we have the choice of including in the rule only those assertions which have a discriminating property, which ensures the correctness of the learned discrimination rule.

After marking the third tuple in the class graduate and the fourth tuple in the class undergrad., the target class contains four unmarked tuples, as shown in Table 6, which implies that the resulting rule would contain four disjuncts. Suppose the threshold value is 3; then further generalization is performed on the attribute "Birth_Place", which results in the relation shown in Table 7.

    Class      | Major   | Birth_Place | GPA       | vote | mark
    -----------+---------+-------------+-----------+------+-----
    graduate   | art     | Canada      | excellent |  35  |  *
               | science | Canada      | excellent |  40  |  *
               | science | foreign     | good      |  25  |
    -----------+---------+-------------+-----------+------+-----
    undergrad. | science | Canada      | excellent |  50  |  *
               | art     | Canada      | average   |  70  |
               | science | Canada      | average   |  60  |
               | art     | Canada      | excellent |  20  |  *

Table 7. A generalized relation.

Notice that overlapping marks should be inherited by the generalized tuples, because their generalized concepts still overlap with those in the contrasting class. Moreover, since generalization may produce new overlapping tuples, an overlapping check should be performed at each ascension of the concept tree. The generalization process repeats until the number of unmarked tuples in the target class is below the specified threshold value.

Since Table 7 has one unmarked tuple and two marked tuples in the target class, the qualitative discrimination rule should contain only the unmarked tuple, as shown in rule (2a).

(2a) ∀(x) graduate(x) ← Major(x) ∈ science /\ Birth_Place(x) ∈ foreign /\ GPA(x) ∈ good.

Rule (2a) is a qualitative rule which excludes the overlapping disjuncts. In many cases, however, it is informative to derive a quantitative rule from the final generalized relation, which associates with each disjunct a quantitative measurement (called the d-weight) to indicate its discriminating ability.

Definition. Let q be a generalized concept (tuple) and Cj be the target class. The d-weight for q (referring to the target class) is the ratio of the number of original tuples in the target class covered by q to the total number of tuples in both the target class and the contrasting classes covered by q. Formally, the d-weight of the concept q in class Cj is defined as

    d_weight = votes(q ∈ Cj) / Σ_{i=1..K} votes(q ∈ Ci),

where K stands for the total number of the target and contrasting classes, and Cj is in {C1, ..., CK}.

The range of the d-weight is the interval [0, 1]. A high d-weight indicates that the concept is primarily derived from the target class Cj, and a low d-weight implies that the concept is primarily derived from the contrasting class(es).

The d-weight for the first tuple in the target class is 35/(35 + 20) = 63.64%, and the d-weights for the second and third tuples are 44.44% and 100%, respectively. Notice that the d-weight for any unmarked tuple is 100%. The quantitative discrimination rule for graduates can be expressed as (2b).

(2b) ∀(x) graduate(x) ← (Major(x) ∈ science /\ Birth_Place(x) ∈ foreign /\ GPA(x) ∈ good) [100%] \/ (Major(x) ∈ art /\ Birth_Place(x) ∈ Canada /\ GPA(x) ∈ excellent) [63.64%] \/ (Major(x) ∈ science /\ Birth_Place(x) ∈ Canada /\ GPA(x) ∈ excellent) [44.44%].

Rule (2b) implies that if a foreign student majors in science with a good GPA, (s)he is certainly a graduate; if a Canadian student majors in art with an excellent GPA, the probability that (s)he is a graduate is 63.64%; and if a Canadian student majors in science with an excellent GPA, the probability that (s)he is a graduate is 44.44%.

A qualitative discrimination rule provides a sufficient condition, but not a necessary one, for an object (or a tuple) to be in the target class. Although the tuples which meet the condition are in the target class, those in the target class may not necessarily satisfy the condition, since the rule may not cover all of the positive examples of the target class in the database. Thus, the rule should be presented in the form

    learning_class(x) ← condition(x).

A quantitative discrimination rule presents the quantitative measurement of the properties in the target class versus those in the contrasting classes. A 100% d-weight indicates that the generalized tuple is in the target class only; otherwise, the d-weight shows the possibility that a generalized tuple is in the target class.
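Computed directly on the vote columns, the d-weight is a one-line ratio. The following sketch (our own names) reproduces the 63.64% figure for the first tuple of Table 7:

    # Sketch of the d-weight: for a generalized tuple q and target class
    # Cj, d_weight = votes(q in Cj) / sum over all classes of votes(q in Ci).
    def d_weight(q, target, classes):
        """classes maps a class name to a {tuple: vote} dictionary."""
        total = sum(cls.get(q, 0) for cls in classes.values())
        return classes[target].get(q, 0) / total if total else 0.0

    classes = {                     # the vote columns of Table 7
        "graduate":  {("art", "Canada", "excellent"): 35,
                      ("science", "Canada", "excellent"): 40,
                      ("science", "foreign", "good"): 25},
        "undergrad": {("science", "Canada", "excellent"): 50,
                      ("art", "Canada", "average"): 70,
                      ("science", "Canada", "average"): 60,
                      ("art", "Canada", "excellent"): 20},
    }
    q = ("art", "Canada", "excellent")
    print(round(100 * d_weight(q, "graduate", classes), 2))   # 63.64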
4.2. Learning data evolution regularities

Data evolution regularities reflect the trend of changes in a database over time. The discovery of regularities in an evolving database is important for many applications. To simplify our discussion, we assume that the database schema remains stable during data evolution. A database instance, DB_t, is the database state, i.e., all of the data in the database, at time t. At least two different database instances, DB_t1 and DB_t2 with t1 ≠ t2, are required for such knowledge discovery. Our discussion can be generalized to multiple (> 2) database instances.

Data evolution regularities can be classified into characteristic rules and discrimination rules. The former summarize the characteristics of the changed data, while the latter distinguish the general characteristics of the relevant data in the current database from those in a previous database instance. We show that attribute-oriented induction can be used in the generalization process.

Example 5. Let the learning request be to find the characteristics of those graduate students whose GPA has increased by at least 0.5 in the last six months.

The knowledge discovery process can be partitioned into two phases: (1) collecting the task-relevant data, and (2) performing attribute-oriented induction on the relevant data. The first phase is performed by finding all of the graduate students whose GPA has increased by at least 0.5 in the last six months, based upon the current database instance and the instance six months ago. Since graduate is a nonprimitive concept, data retrieval should be performed by consulting the concept hierarchy table as well. The second phase is carried out in the same manner as the previously studied attribute-oriented induction for learning characteristic rules.

Suppose that another learning request is to distinguish the characteristics of the undergraduate students enrolled in July 1990 from those enrolled in July 1992. The knowledge discovery process can still be partitioned into two phases: (1) collecting the task-relevant data and grouping them into two classes, the target class and the contrasting class, and (2) performing attribute-oriented induction synchronously on the two classes. The first phase is performed by finding all of the undergraduate students enrolled in July 1990 and those enrolled in July 1992 and grouping them into the two classes respectively. The second phase is the same as the previously studied attribute-oriented induction for learning discrimination rules.

Such a process can also be used to study the general characteristics of newly inserted or newly deleted data sets in a database. In general, data evolution regularities can be extracted by collecting the task-relevant data (usually, the evolving portion) in different database instances and performing attribute-oriented induction on the corresponding task-relevant data sets.

5. Towards Knowledge Discovery in Extended-Relational Databases

The relational data model has been extended in many ways to meet the requirements of new database applications [20]. Nested-relational and deductive databases are two influential extended-relational systems. Interestingly, attribute-oriented induction can easily be extended to knowledge discovery in these systems.

The nested relational model allows nonatomic, relation-valued domains; thus, a hierarchy of nested relations is formed [18]. Attribute-oriented induction can be performed on the atomic domains in the same way as in relational systems. The induction can be performed on the nonatomic domains in two different ways: (1) unnesting the nested relations, which transforms nested relations into flat relations to which the previously described method can be applied; or (2) performing induction directly on the nested relations by treating the nonatomic domains as set-valued domains.

Example 6. Let the student relation in the university database contain one more attribute, hobby, which registers a set of hobbies for each student. We study the learning request: discover the relationship between GPA and hobby for graduate students in computing science.

Although generalization can be performed by first flattening the nested relation Student into a nonnested relation and then performing attribute-oriented induction, it is more efficient and elegant to perform induction directly on the attribute hobby by treating its domain as a set-valued domain. For example, without flattening the relation, the set {badminton, violin} can be generalized to a new set {sports, music}. A benefit of direct induction on a set-valued attribute is that, unlike unnesting, it does not flatten an original tuple into several tuples. Vote accumulation can be handled directly when merging identical tuples in the concept tree ascension, since one generalized tuple corresponds to one original tuple in the generalized relation.
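A direct rendering of this set-valued generalization (a sketch only; the hierarchy entries are illustrative):

    # Sketch of direct induction on a set-valued attribute (hobby):
    # each element ascends its own concept tree, and duplicates in the
    # generalized set collapse automatically.
    PARENT = {"badminton": "sports", "violin": "music"}

    def generalize_set(values):
        return {PARENT.get(v, v) for v in values}

    print(generalize_set({"badminton", "violin"}))   # {'sports', 'music'}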
Deductive database systems can be viewed as another extension of relational systems, introducing deduction power to relational databases. By incorporating deduction rules and integrity constraints in the generalization process, attribute-oriented induction can be performed on deductive databases as well. There are two cases to be considered for knowledge extraction in deductive databases: (1) the discovery of knowledge rules on relations defined by deduction rules, and (2) the discovery of new knowledge rules from existing rules.
In the first case, deduction can be performed by executing a deductive query which retrieves the task-relevant data from the database; induction can then be performed on the deduced data set.

Example 7. Let a deductive database be constructed from the relational database university, with award candidate defined by the set of deduction rules {(3a), (3b)}.

(3a) award_candidate(X) ← status(X) = graduate, gpa(X) ≥ 3.75.

(3b) award_candidate(X) ← status(X) = undergraduate, gpa(X) ≥ 3.5.

Suppose that the learning request is to find the general characteristics of award candidates in the computing science department. In a manner similar to the previously described process, the knowledge discovery process can be partitioned into two phases: (1) collecting the task-relevant data, and (2) performing attribute-oriented induction. The first phase is performed by finding all of the award candidates in the computing science department. This corresponds to a deductive query which can be processed using deductive database technology [8, 22]. The second phase is to extract knowledge from those candidates, which is the same as attribute-oriented induction for learning characteristic rules.
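As an illustration of the deduction phase (a sketch only; the tuples and helper names are ours, and a real system would evaluate the rules as a deductive query):

    # Sketch of evaluating rules (3a) and (3b) to materialize the
    # award-candidate set before induction. "graduate" is expanded to
    # its primitive statuses via the concept hierarchy, as in Section 3.
    GRADUATE = {"M.S.", "M.A.", "Ph.D."}

    def award_candidate(status, gpa):
        if status in GRADUATE:          # rule (3a)
            return gpa >= 3.75
        return gpa >= 3.5               # rule (3b)

    students = [("Fraser", "M.S.", 3.9), ("Wise", "freshman", 3.9),
                ("Wang", "M.S.", 3.2)]
    candidates = [s for s in students if award_candidate(s[1], s[2])]
    print([name for name, _, _ in candidates])   # ['Fraser', 'Wise']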
When deduction rules involve recursion, the deduction process itself can be rather complex [22]. However, the knowledge discovery process is still carried out by first performing deduction and then induction. Notice that deduction may involve consulting the concept hierarchies if a deduction rule is defined using higher-level concepts.

As to the second case, discovering new rules from existing ones, an existing rule could be an induced rule or a partially induced one. If such a rule represents a generalization of all of the relevant extensional data, further induction can be performed directly on the rule itself. If such a rule, together with some extensional data, determines all of the relevant information in the deductive database, knowledge discovery should be performed on such hybrid data by first performing induction on the extensional data to generalize it to the same level of concepts as the existing rule; the provided rules are then treated as part of the intermediate generalized relation and are merged with the generalized relation. The merged relation can then be generalized further by attribute-oriented induction.

6. A Comparison with Other Learning Algorithms

Attribute-oriented induction provides a simple and efficient way to learn different kinds of knowledge rules in relational and extended-relational databases. Since knowledge discovery in databases is a newly emerging field, only a few database-oriented knowledge discovery systems have been reported [7, 12], most of which are based on previously developed learning algorithms. The major difference of our approach from the others is attribute-oriented vs. tuple-oriented induction. It is essential to compare these two approaches.

Both tuple-oriented and attribute-oriented induction take attribute removal and concept tree ascension as their major generalization techniques. However, the former technique performs generalization tuple by tuple, while the latter performs it attribute by attribute. The two approaches involve significantly different search spaces. Among the many learning-from-examples algorithms, we use the candidate elimination algorithm [16] as an example to demonstrate the difference.

In the candidate elimination algorithm, the set of all of the concepts which are consistent with the training examples is called the version space of the training examples. The learning process is a search in the version space to induce a generalized concept which is satisfied by all of the positive examples and none of the negative examples.

    (a) The entire version space (9 nodes):
            graduate /\ science
        M.S. /\ science   Ph.D. /\ science   graduate /\ math   graduate /\ physics
        M.S. /\ math   Ph.D. /\ math   M.S. /\ physics   Ph.D. /\ physics

    (b) The factored version spaces (6 nodes):
        graduate             science
        M.S.   Ph.D.         math   physics

Figure 3. Entire vs. factored version spaces.

Since generalization in an attribute-oriented approach is performed on individual attributes, the concept hierarchy of each attribute can be treated as a factored version space. Factoring the version space may significantly improve the computational efficiency. Suppose there are p nodes in each concept tree and there are k concept trees (attributes) in the relation; the total size of the k factored version spaces is p × k, whereas the size of the unfactored version space for the same concept trees is p^k [21]. This can be verified from Fig. 3. Suppose the concept hierarchy is specified as {math, physics} ⊂ science and {M.S., Ph.D.} ⊂ graduate. The corresponding entire version space and factored version spaces are shown in Fig. 3 (a) and Fig. 3 (b), respectively.
The entire version space contains 3^2 = 9 nodes, but the factored version spaces contain only 3 × 2 = 6 nodes.

Similar arguments hold for other tuple-oriented learning algorithms. Although different algorithms may adopt different search strategies, the tuple-oriented approach examines the training examples one at a time to induce generalized concepts. In order to discover the most specific concept that is satisfied by all of the training examples, the algorithm must search every node in the search space which represents the possible concepts derived from generalization on a training example. Since different attributes of a tuple may be generalized to different levels, the number of nodes to be searched for a training example may involve a huge number of possible combinations.

On the other hand, an attribute-oriented algorithm performs generalization on each attribute uniformly for all the tuples in the data relation at the early generalization stages; it essentially considers only the factored version space. An algorithm which explores the different possible combinations for a large number of tuples during such a generalization process will not be productive, since such combinations will be merged in further generalizations. The different possible combinations should be explored only when the relation has been generalized to a relatively small intermediate generalized relation, which corresponds to Strategy 6 in the basic attribute-oriented induction. Notice that only one rule-formation technique is provided in our basic algorithm; however, the relatively small intermediate generalized relation obtained by attribute-oriented induction can be treated as a springboard on which different knowledge extraction techniques can be explored to form new rules.

Another obvious advantage of our approach over many other learning algorithms is the integration of the learning process with database operations. Relational systems store a large amount of information in a structured and organized manner and are implemented with well-developed storage and accessing techniques [20]. In contrast to most existing learning algorithms, which do not take full advantage of these database facilities [5, 12, 15], our approach primarily adopts relational operations, such as selection, join, projection (extracting relevant data and removing attributes), tuple substitution (ascending concept trees), and intersection (discovering common tuples among classes). Since relational operations are set-oriented and have been implemented efficiently in many existing systems, our approach is not only efficient but is also easily exported to many relational systems.

Our approach has absorbed many advanced features of recently developed learning algorithms [3, 13]. As shown in our study, attribute-oriented induction can learn disjunctive rules and handle exceptional cases elegantly by incorporating statistical techniques in the learning process. Moreover, when a new tuple is inserted into a database relation, rather than restarting the learning process from the beginning, it is preferable to amend and fortify what was learned from the previous data. Our algorithms can easily be extended to facilitate such incremental learning [15]: let the generalized relation be stored in the database; when a new tuple is inserted into the database, the concepts of the new tuple are first generalized to the level of the concepts in the generalized relation, and the generalized tuple can then be naturally merged into the generalized relation.
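A sketch of this incremental step (our own illustration) follows: the stored generalized relation is kept as a vote table, and a newly inserted tuple is lifted to the stored concept level before being merged:

    from collections import Counter

    # Sketch of incremental learning: lift the new tuple's values to the
    # concept level already used in the stored generalized relation,
    # then merge it, accumulating the vote.
    PARENT = {"Ph.D.": "graduate", "Victoria": "B.C.", "B.C.": "Canada"}

    def lift(value, level):
        """Ascend until the value matches a concept used in the relation."""
        while value not in level and value in PARENT:
            value = PARENT[value]
        return value

    generalized = Counter({("graduate", "Canada"): 75})
    levels = [{"graduate"}, {"Canada", "foreign"}]
    new_tuple = ("Ph.D.", "Victoria")
    key = tuple(lift(v, lv) for v, lv in zip(new_tuple, levels))
    generalized[key] += 1
    print(generalized)   # Counter({('graduate', 'Canada'): 76})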
In our previous discussion, we assumed that every concept hierarchy is organized as a balanced tree and that the primitive concepts in the database reside at the leaves of the tree; hence generalization can be performed synchronously on each attribute, generalizing the attribute values at the same lower level to the ones at the same higher level. With a minor modification to the basic algorithm, induction can also be performed successfully on unbalanced concept trees and on data residing at different levels of the concept trees. In such cases, rather than simply performing generalization on every branch of the tree, we check whether a generalized concept covers other concepts of the same attribute. If the generalized concept covers a concept several levels down the concept tree, the covered concept is replaced by the generalized concept; that is, the tree is ascended several levels at once. By doing so, concepts at different levels can be handled correctly and efficiently.

As another variation, our method can also handle concepts organized as lattices. If concepts are organized as a lattice, a single concept may be generalized to more than one concept. These generalized concepts are put into intermediate generalized relations, upon which further generalizations are performed as discussed. As a consequence, the size of an intermediate generalized relation may increase at some stage in the generalization process because of the effect of the lattice. However, since the generalization is controlled by a generalization threshold, the intermediate generalized relation will eventually shrink in subsequent generalizations.

Furthermore, data sampling and parallelism can be explored in knowledge discovery. Attribute-oriented induction can be performed by sampling a subset of data from a huge set of relevant data, or by first performing induction in parallel on several partitions of the relevant data set and then merging the generalized results.

7. Application of Discovered Rules

Knowledge discovery in databases initiates a new frontier for querying database knowledge, cooperative
query answering, and semantic query optimization. Much can be explored using meta-data (such as concept hierarchies) and discovered knowledge. We present a few ideas below.

Database knowledge represents the semantic information associated with databases, which includes deduction rules, integrity constraints, concept hierarchies about the data, and general data characteristics [17]. Cooperative query answering consists of analyzing the intent of a query and providing generalized, neighborhood, or associated information relevant to the query [4]. Semantic query optimization applies database semantics, integrity constraints, and knowledge rules to optimize queries for efficient processing [2].

Previous studies on querying database knowledge and intelligent query answering [17, 19] focused on rules and integrity constraints in relational or deductive databases. With the availability of knowledge discovery tools, it is straightforward to query general data characteristics and utilize induced rules and concept hierarchies. Such queries can be answered by retrieving the discovered rules (if such pieces of information are available) or by triggering a discovery process. Moreover, some queries can be rewritten based on an analysis of the concept hierarchies and/or answered cooperatively using generalized rules.
Example 8. To describe the characteristics of the graduate students in cs (computing science) who were born in Canada with excellent academic performance, the query can be formulated using a syntax similar to the knowledge queries in [17] as follows.

    describe Student
    where Status = "graduate" and Major = "cs" and
          Birth_Place = "Canada" and GPA = "excellent"

Notice that "graduate", "Canada" and "excellent" are not stored as primitive data in the Student relation. However, using the concept hierarchy table, the query can be reformulated by substituting "{M.S., M.A., Ph.D.}" for "graduate", and so on; a possible rewritten form is sketched below. The rewritten query can then be answered by directly retrieving the discovered rule, if it is stored, or by performing induction on the relevant data set.
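One possible form of the rewritten query is given below; the set-membership notation is our own guess at a concrete syntax (this paper does not define one), and "Canada" and "excellent" would be expanded through their concept hierarchies in the same manner (not shown).

    describe Student
    where Status in {"M.S.", "M.A.", "Ph.D."} and Major = "cs" and
          Birth_Place = "Canada" and GPA = "excellent"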
Notice that "graduate", "Canada" and "excellent" Based upon the attribute-oriented induction tech-
are not stored as primitive data in the Student relation. nique, a prototyped experimental database learning sys-
However, using the concept hierarchy table, the query tem, DBLEARN, has been constructed. The system is
can be reformulated by substituting for "graduate" by implemented in C with the assistance of UNIX software
"{M.S., M.A. Ph.D}", etc. Then the rewritten query can packages LEX and YACC (for compiling the DBLEARN
be answered by directly retrieving the discovered rule, if language interface) and operates in conjunction with
it is stored, or performing induction on the relevant data either the Oracle or SyBase DBMS software. A database
set. learning language for DBLEARN is specified in an
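For instance, assuming an intermediate generalized relation Student_Gen(Major, Birth_Place, vote) has been stored, a further generalization step can operate on it directly; as before, this generic-SQL sketch and its table names are our illustration rather than syntax from this paper.

    -- Further generalization on a stored intermediate relation:
    -- ascend Birth_Place one level and re-accumulate the votes.
    select G.Major, H.general as Birth_Place, sum(G.vote) as vote
    from Student_Gen G, Hierarchy H
    where G.Birth_Place = H.primitive
    group by G.Major, H.general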
Moreover, semantic query optimization can be performed on queries using database semantics, concept hierarchies and the discovered knowledge rules.

Example 9. Let the knowledge rules discovered in our previous examples be stored in the database, and consider the query: find all the foreign students (born outside of Canada) majoring in science with a GPA between 3.2 and 3.4.

    select Name
    from Student
    where Major = "science" and Birth_Place != "Canada"
          and GPA >= 3.2 and GPA <= 3.4

According to rule (2a), all of the foreign students majoring in science with a good GPA must be graduate students. Since the condition of rule (2a) covers what is asked in the query, the search should be performed on graduate students only; that is, the condition Status = "graduate" can be appended to the query, as shown below. Therefore, the query can be optimized if the data is grouped or partitioned according to the status of the students.
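With the condition appended, the optimized query would read as follows; this sketch differs from the original query only in the added Status condition.

    select Name
    from Student
    where Status = "graduate" and Major = "science"
          and Birth_Place != "Canada" and GPA >= 3.2 and GPA <= 3.4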
8. Conclusions

We investigated an attribute-oriented approach for knowledge discovery in databases. Our approach applies an attribute-oriented concept tree ascension technique in generalization, which integrates the machine learning methodology with set-oriented database operations and extracts generalized data from the actual data in databases. Our method substantially reduces the computational complexity of database learning processes. Different kinds of knowledge rules, including characteristic rules, discrimination rules, quantitative rules, and data evolution regularities, can be discovered efficiently using the attribute-oriented approach. In addition to learning in relational databases, the approach can be applied to knowledge discovery in nested relational and deductive databases. The discovered rules can be used in intelligent query answering and semantic query optimization.

Based upon the attribute-oriented induction technique, a prototyped experimental database learning system, DBLEARN, has been constructed. The system is implemented in C with the assistance of the UNIX software packages LEX and YACC (for compiling the DBLEARN language interface) and operates in conjunction with either the Oracle or Sybase DBMS. A database learning language for DBLEARN is specified in an extended BNF grammar. The system takes a learning request as input and applies the knowledge discovery algorithms developed in this paper to the data stored in the database, with the assistance of the conceptual hierarchy information stored in the conceptual hierarchy base. Knowledge rules extracted from the database are the output of the learning system. Our preliminary experimentation on medium-sized data sets demonstrates the effectiveness and efficiency of our methods in knowledge discovery in relational databases.
Attribute-oriented induction represents a promising direction to follow in the development of efficient learning algorithms for knowledge discovery in databases. There are many issues that should be studied further. The automatic discovery of concept hierarchies in databases, the construction of interactive learning systems, the development of efficient induction algorithms for object-oriented databases, and the integration of attribute-oriented induction with other learning paradigms are interesting topics for future research.
References

1. Y. Cai, N. Cercone and J. Han, Attribute-Oriented Induction in Relational Databases, in G. Piatetsky-Shapiro and W. J. Frawley (eds.), Knowledge Discovery in Databases, AAAI/MIT Press, 1991, 213-228.

2. U. S. Chakravarthy, J. Grant and J. Minker, Logic-Based Approach to Semantic Query Optimization, ACM Trans. Database Syst., 15(2), 1990, 162-207.

3. K. C. C. Chan and A. K. C. Wong, A Statistical Technique for Extracting Classificatory Knowledge from Databases, in G. Piatetsky-Shapiro and W. J. Frawley (eds.), Knowledge Discovery in Databases, AAAI/MIT Press, 1991, 107-124.

4. F. Cuppens and R. Demolombe, Cooperative Answering: A Methodology to Provide Intelligent Access to Databases, Proc. 2nd Int. Conf. Expert Database Systems, Fairfax, VA, April 1988, 621-643.

5. T. G. Dietterich and R. S. Michalski, A Comparative Review of Selected Methods for Learning from Examples, in R. S. Michalski et al. (eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 1, Morgan Kaufmann, 1983, 41-82.

6. D. Fisher, Improving Inference Through Conceptual Clustering, Proc. 1987 AAAI Conf., Seattle, Washington, July 1987, 461-465.

7. W. J. Frawley, G. Piatetsky-Shapiro and C. J. Matheus, Knowledge Discovery in Databases: An Overview, in G. Piatetsky-Shapiro and W. J. Frawley (eds.), Knowledge Discovery in Databases, AAAI/MIT Press, 1991, 1-27.

8. H. Gallaire, J. Minker and J. Nicolas, Logic and Databases: A Deductive Approach, ACM Comput. Surv., 16(2), 1984, 153-185.

9. M. Genesereth and N. Nilsson, Logical Foundations of Artificial Intelligence, Morgan Kaufmann, 1987.

10. J. Han, Y. Cai and N. Cercone, Data-Driven Discovery of Quantitative Rules in Relational Databases, IEEE Trans. Knowledge and Data Engineering, 4(3), 1992.

11. D. Haussler, Bias, Version Spaces and Valiant's Learning Framework, Proc. 4th Int. Workshop on Machine Learning, Irvine, CA, 1987, 324-336.

12. K. A. Kaufman, R. S. Michalski and L. Kerschberg, Mining for Knowledge in Databases: Goals and General Description of the INLEN System, in G. Piatetsky-Shapiro and W. J. Frawley (eds.), Knowledge Discovery in Databases, AAAI/MIT Press, 1991, 449-462.

13. M. V. Manago and Y. Kodratoff, Noise and Knowledge Acquisition, Proc. 10th Int. Joint Conf. Artificial Intelligence, Milan, Italy, 1987, 348-354.

14. R. S. Michalski, A Theory and Methodology of Inductive Learning, in R. S. Michalski et al. (eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 1, Morgan Kaufmann, 1983, 83-134.

15. R. S. Michalski, J. G. Carbonell and T. M. Mitchell, Machine Learning: An Artificial Intelligence Approach, Vol. 2, Morgan Kaufmann, 1986.

16. T. M. Mitchell, Generalization as Search, Artificial Intelligence, 18, 1982, 203-226.

17. A. Motro and Q. Yuan, Querying Database Knowledge, Proc. 1990 ACM-SIGMOD Conf. Management of Data, Atlantic City, NJ, June 1990, 173-183.

18. M. A. Roth, H. F. Korth and A. Silberschatz, Extended Algebra and Calculus for Nested Relational Databases, ACM Trans. Database Syst., 13(4), 1988, 389-417.

19. C. D. Shum and R. Muntz, Implicit Representation for Extensional Answers, Proc. 2nd Int. Conf. Expert Database Systems, Vienna, VA, April 1988, 497-522.

20. A. Silberschatz, M. Stonebraker and J. D. Ullman, Database Systems: Achievements and Opportunities, Comm. ACM, 34(10), 1991, 94-109.

21. D. Subramanian and J. Feigenbaum, Factorization in Experiment Generation, Proc. 1986 AAAI Conf., Philadelphia, PA, August 1986, 518-522.

22. J. D. Ullman, Principles of Database and Knowledge-Base Systems, Vols. 1 & 2, Computer Science Press, 1989.

23. J. Zytkow and J. Baker, Interactive Mining of Regularities in Databases, in G. Piatetsky-Shapiro and W. J. Frawley (eds.), Knowledge Discovery in Databases, AAAI/MIT Press, 1991, 31-54.