GRAPH-BASED KNOWLEDGE REPRESENTATION FOR GIS DATA
Manuel Pech Palacio1, David Sol1, Jesús González2
{sp205175, sol}@mail.udlap.mx,
[email protected]
1 Universidad de las Américas-Puebla
2 Instituto Nacional de Astrofísica Óptica y Electrónica
Puebla, México
Abstract
This paper presents a proposal to create a graph
representation for GIS, using both spatial and non-spatial
data and also including spatial relations between spatial
objects. Because graphs are a powerful and flexible
knowledge representation we will be able to combine
spatial and non-spatial data at the same time and this is
one of the strengths of the proposal. We hope to apply
this knowledge representation to the data mining process
with GIS data including three types of spatial relations:
topological, orientation and distance.
1. Introduction
In recent years, the human capacity to generate and
collect data has grown rapidly.
The explosive growth in data and databases has created a
need for techniques and tools that can transform the data
into useful information and knowledge. Initially,
the goal of these techniques and tools was to discover
knowledge in relational data. Nowadays,
with the growth of applications that deal with
georeferenced data, the management and analysis of
spatial data have increased considerably.
Spatial data has many characteristics that distinguish it
from relational data. For example, it carries topological,
distance, and direction information organized by
multidimensional spatial index structures. Other
differences are the query language used to access
spatial data and the complexity of the spatial data
types.
Different approaches have been developed for
knowledge discovery from spatial data. We briefly
present some of them next:
Generalization [22][14]. Data and objects often
contain detailed information at primitive concept levels. It
is often desirable to summarize a large set of data and
present it at a high concept level. It assumes the existence
of background knowledge in the form of concept
hierarchies. In the case of a spatial database, there can be
two kinds of concept hierarchies, thematic and spatial. Lu
et al. [22] extended attribute-oriented induction to spatial
databases and presented two algorithms, spatial data
dominant and non-spatial data dominant generalizations.
Clustering [16][23][28][26] can be defined as the
process of grouping physical or abstract objects into
classes of similar objects. Spatial data clustering identifies
clusters, or densely populated regions, according to some
measurement in a large, multidimensional data set.
In many situations it is desirable to explore spatial
associations [19][11] to discover rules which associate
one or more spatial objects with other spatial objects.
There are various kinds of spatial predicates that could
constitute a spatial association rule. Examples include
topological relations like intersects, overlap, disjoint;
spatial orientations like left_of, west_of; and distance
information such as close_to, or far_away.
Approximation and aggregation [17]. Clustering
approaches try to answer questions like where the clusters
in the spatial database can be located. Another problem is
to find out why the clusters are there. We can rephrase the
question to ask about the characteristics of the clusters in
terms of the objects that are close to them. We need to
analyze the objects in the cluster and the objects close to
them.
Finally, there are three other methods to discover
knowledge in datasets:
• Mining an image database [12][11] can be
viewed as another approach to spatial data
mining.
• Classification learning [20] is the task of
assigning an object to a class from a given set
of classes based on the attribute values of the
object.
• Spatial trend detection [9] can describe a
regular change of one or more non-spatial
attributes of an object that changes its
position in time.
The remainder of this paper is organized as follows:
Sections 2 and 3 present basic topics about spatial and
non-spatial data mining. Section 4 describes three types
of spatial relations between spatial objects. The Subdue
system is described in section 5. In section 6 we present
our graph-based knowledge representation for GIS data.
Section 7 presents our conclusions.
Proceedings of the Fourth Mexican International Conference on Computer Science (ENC’03)
0-7695-1915-6/03 $17.00 © 2003 IEEE
2. Spatial data mining
Spatial data describes information about the space
occupied by objects. Spatial data is continuously obtained
by diverse types of applications such as GISs, medical
applications and computerized cartography.
Consequently, manual data analysis is often a hard task,
due to the large volume of data as well as its complexity.
To deal with this problem, different
methods have been proposed and applied to discover
knowledge in spatial data. These methods have been
implemented using techniques from different fields like
machine learning, database technology and statistics.
Spatial data mining is defined as the discovery of
implicit and previously unknown knowledge in spatial
databases [13]. Representative characteristics, structures
or clusters, and spatial associations are examples of
knowledge discovered from spatial data.
Geographic data in general consists of thematic and
spatial data [1]. Thematic data is alphanumeric data
related to spatial objects. Spatial data, on the other hand,
is described using two different properties: geometry and
topology. According to [1], spatial location and size are
considered geometric properties, whereas adjacency
(object A is located to the right of object B) and inclusion
(object A is included in object B) are considered
topological properties. The methods to discover
knowledge can be focused either on thematic or spatial
properties of spatial objects in a spatial database, or both.
3. Non-spatial data mining
Data mining can be seen as the search for hidden
patterns that may exist in databases [23]. The explosive
growth in the generation of data and its collection in
databases has created a need for techniques and tools
that can extract useful information and knowledge from
this data.
Some of the data mining techniques apply to structural
data and others to non-structural data. Structural data is
defined as data that describes the relations among the
objects described in the data. We can see the data objects
as variables in the attribute-value representation, but now
we also have relations among those variables.
Knowledge discovery in databases refers to the task of
finding interesting knowledge, regularities, or high-level
information from datasets, which can then be analyzed
from different angles. People working in many different
fields including database systems, knowledge-base
systems, artificial intelligence, machine learning and
statistics have shown great interest in data mining.
4. Spatial relations
In [8], Martin Ester et al. introduce three types of
spatial relations: topological, distance and direction
relations. They are binary relations, since they are
determined between pairs of objects.
The authors define topological relations as those which
are invariant under topological transformations: if both
objects are rotated, translated or scaled simultaneously, the
relations are preserved. They present a definition of
topological relations derived from the 9-intersection
model [5][6][7]. The topological relations between two
objects are: disjoint, meets, overlaps, equals, covers,
covered-by, contains, and inside, as we show in figure 1.
Each element in the figure describes a different
topological spatial relation.
Figure 1 Topological relations (panels: A disjoint B, A meet B, A contains B, A inside B, A equals B, A covers B, A coveredBy B, A overlaps B)
The second type of relation refers to distance relations.
These relations compare the distance between two objects
with a given constant using arithmetic operators like <, >,
and =. The distance between two objects is defined as the
minimum distance between any pair of their points (see
figure 2).
Figure 2 Distance relations (panels: A close to B, A far from B)
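The distance relation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each object is approximated by a finite list of sample points, the coordinates of A and B are made up, and the names min_distance and distance_relation are hypothetical helpers rather than part of any cited system.

```python
import math
import operator
from itertools import product

# Hypothetical objects, each approximated by a finite set of sample points.
A = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
B = [(4.0, 0.0), (5.0, 0.0), (5.0, 1.0), (4.0, 1.0)]

# "=" is evaluated with a floating-point tolerance instead of exact equality.
COMPARATORS = {"<": operator.lt, ">": operator.gt, "=": math.isclose}


def min_distance(a, b):
    """Minimum Euclidean distance between any point of a and any point of b."""
    return min(math.dist(p, q) for p, q in product(a, b))


def distance_relation(a, b, op, c):
    """Evaluate the relation 'distance(a, b) op c' for op in <, >, =."""
    return COMPARATORS[op](min_distance(a, b), c)


# A threshold of 5.0 stands in for an application-defined notion of "close".
print(distance_relation(A, B, "<", 5.0))
```

With a real GIS the point lists would be replaced by the geometry type of the underlying system; the comparison against a constant stays the same.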
The authors define a direction relation A R B of two
spatial objects using one representative point of the source
object A and all points of the destination object B. Several
variants of direction relations are possible,
depending on the number of points that are considered in
the source and the destination objects. The representative
point of a source object may be the center of the object or
a point on its boundary. The representative point is used
as the origin of a virtual coordinate system whose
quadrants define the directions. Examples are shown in
figure 3. For instance, object D is to the south of object C
and to the east of object A.
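A minimal Python sketch of this idea follows. It assumes the centroid of the source object as its representative point and declares a direction only when every point of the destination object lies on the corresponding side of that point. This simplification, the coordinates, and the function names centroid and direction_relations are assumptions made for illustration and are not the exact definitions given in [8].

```python
def centroid(points):
    """Representative point of an object: the mean of its sample points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def direction_relations(source, destination):
    """Directions d for which 'destination d source' holds.

    The centroid of the source is the origin of a virtual coordinate
    system; a direction holds only if all destination points satisfy it.
    """
    ox, oy = centroid(source)
    relations = set()
    if all(y >= oy for _, y in destination):
        relations.add("north")
    if all(y <= oy for _, y in destination):
        relations.add("south")
    if all(x >= ox for x, _ in destination):
        relations.add("east")
    if all(x <= ox for x, _ in destination):
        relations.add("west")
    return relations


# Hypothetical unit squares placed so that D is east of A and south of C.
A = [(0, 0), (1, 0), (1, 1), (0, 1)]
C = [(3, 3), (4, 3), (4, 4), (3, 4)]
D = [(3, 0), (4, 0), (4, 1), (3, 1)]

print(direction_relations(A, D))  # relation of D with respect to A
print(direction_relations(C, D))  # relation of D with respect to C
```

A destination that is both north and east of the source corresponds to the northeast quadrant.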
Figure 3 Direction relations (labels: B north A, C northeast A, D east A, D south C, A west D)
Voronoi diagram
The Voronoi diagram (figure 4) is considered one of
the fundamental data structures in computational
geometry. Given some number of points (sites) in the plane,
their Voronoi diagram divides the plane according to the
nearest-neighbor rule: each site is associated with the
region of the plane that is closer to it than to any other site.
A definition of a Voronoi diagram [2] can be stated as
follows. Let S denote a set of n points (sites) in the plane. For
two distinct sites p, q ∈ S, the dominance of p over q is
defined as the subset of the plane being at least as close
to p as to q:
$$\mathrm{dom}(p, q) = \{\, x \in \mathbb{R}^2 \mid \delta(x, p) \le \delta(x, q) \,\}$$
The Euclidean distance function is denoted by δ.
dom(p, q) is a closed half plane bounded by the
perpendicular bisector of p and q. The bisector
separates all the points of the plane closer to
p from those that are closer to q, and is also known as the
separator of p and q. The region of a site p ∈ S is the
portion of the plane lying in all of the dominances of p
over the remaining sites in S.
The regions, being intersections of half planes, are convex
polygons. The
boundary of a region consists of at most n–1 edges
(maximal straight-line segments) and vertices (endpoints
of the edges). Each point on an edge is equidistant from
exactly two sites, and each vertex is equidistant from at
least three, as we can see in figure 4. This polygonal
partition of the plane is called the Voronoi diagram.
Figure 4 Voronoi diagram
5. The Subdue system
The Subdue system [15][27] (developed at the
University of Texas at Arlington) is a general data mining
tool that can be applied to any domain that can be
represented as a graph. It discovers substructures that
compress the original database and finds interesting
structural concepts in the data. A substructure is a
connected subgraph within the graph. By replacing
previously discovered substructures in the data, multiple
passes of Subdue produce a hierarchical description of the
structural regularities in the data. Subdue can
use a constrained inexact graph match that
considers similar, but not identical, instances of a
substructure as a match. Subdue uses the minimum
description length (MDL) principle to guide the search
towards more appropriate substructures.
The Subdue system uses a graph-based representation.
Objects in the data (concepts) become vertices or small
subgraphs in the graph, and relationships between
objects become directed or undirected edges in the graph.
This graph representation serves as input to the Subdue
system. Figure 5 shows an example of an input database
and its graph representation. The example is presented in
terms of the house domain, where a house is defined as a
triangle on a square. T represents a triangle, S a square, C
a circle and R a rectangle. The objects in the figure (e.g.,
T1, S1, R1) become labeled vertices in the graph, and the
relationships (e.g., on(S1, R1), shape(C1, circle)) become
labeled edges. The graph representation of the
substructure discovered by Subdue from this data is
shown in figure 6, where Subdue found four instances of
a triangle on a square.
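To make the graph input described for the house domain more concrete, the following Python sketch builds a small labeled graph and looks for the "triangle on square" pattern. The scene is a hypothetical one loosely modeled on the house domain (only on(S1, R1) and shape(C1, circle) are taken from the text), the helper names are invented for the example, and the matching is a naive label check. It is not Subdue's search procedure and it does not use the MDL principle.

```python
from collections import namedtuple

Edge = namedtuple("Edge", "src dst label")


def build_graph(shapes, on_pairs):
    """shapes: {object_id: shape_name}; on_pairs: [(upper_id, lower_id)]."""
    vertices = {}  # vertex id -> vertex label
    edges = []
    for obj, shape in shapes.items():
        vertices[obj] = "object"            # the object itself
        shape_vertex = f"{obj}_shape"
        vertices[shape_vertex] = shape      # vertex holding the shape value
        edges.append(Edge(obj, shape_vertex, "shape"))
    for upper, lower in on_pairs:
        edges.append(Edge(upper, lower, "on"))
    return vertices, edges


def shape_of(obj, vertices, edges):
    """Label of the vertex reached from obj through its 'shape' edge."""
    for e in edges:
        if e.src == obj and e.label == "shape":
            return vertices[e.dst]
    return None


def find_instances(vertices, edges, upper_shape, lower_shape):
    """All 'on' edges whose endpoints have the requested shapes."""
    return [(e.src, e.dst) for e in edges
            if e.label == "on"
            and shape_of(e.src, vertices, edges) == upper_shape
            and shape_of(e.dst, vertices, edges) == lower_shape]


# Hypothetical scene: four houses, a circle and a rectangle.
shapes = {"T1": "triangle", "S1": "square", "T2": "triangle", "S2": "square",
          "T3": "triangle", "S3": "square", "T4": "triangle", "S4": "square",
          "C1": "circle", "R1": "rectangle"}
on_pairs = [("T1", "S1"), ("T2", "S2"), ("T3", "S3"), ("T4", "S4"),
            ("S1", "R1"), ("C1", "R1")]

vertices, edges = build_graph(shapes, on_pairs)
instances = find_instances(vertices, edges, "triangle", "square")
print(len(instances), "instances of triangle on square")
```

In this toy scene the naive matcher reports four instances, which mirrors the number of instances the text reports for Subdue on the house domain.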
Figure 5 Graph representation of the house domain (panels: Input Database, Input Graph; vertices labeled object, triangle, square, circle and rectangle; edges labeled shape and on)
Figure 6 Substructure and instances discovered from
the house domain by Subdue (substructure: object shape triangle on object shape square; instances 1 to 4 over T1-T4 and S1-S4)
An instance of a substructure in an input graph
consists of a set of vertices and edges from the input
graph that match the graphical definition of the
substructure. A neighboring edge of a substructure
instance is an edge in the input graph that is not contained
in the instance, but is connected to at least one vertex in
the instance. An external connection of an instance of a
substructure is a neighboring edge of the instance that is
connected to at least one vertex not contained in the
instance.
After a substructure is discovered, each instance of the
substructure in the input graph is replaced by a single
vertex representing the entire substructure as we show in
figure 7, where the substructure discovered by Subdue
(object shape triangle on object shape square) was
labeled as S1. Subdue continues the search for the best
substructure until all possible substructures have been
considered or the amount of computation exceeds a given
limit.
Figure 7 Graph representation of the house domain
after the substructure replacement
6. Graph-based knowledge representation
In our previous work [24][25] we applied the Subdue
system to non-spatial data using a spatial dominant
approach. Now we propose a knowledge representation to
model GIS data using graphs.
Our idea is to create a graph-based model to represent
spatial and non-spatial data and to use the model to
generate a dataset composed of both types of data. We
can then apply a data mining technique (e.g., the Subdue
system) to spatial and non-spatial data at the same time,
obtaining richer results (patterns found through data
mining) that consider both kinds of data about objects
and the spatial relations among them.
In order to enrich the spatial data mining process it is
advisable to take into account all of the elements which
are used in a geographic representation (i.e., spatial
objects, descriptive attributes and the relationships
between them). These relationships enrich the spatial
analysis process. For example, we
could find the most important characteristics of the
geometric objects located within some distance of a
particular point, or identify the representative pattern of
houses located along the boundaries of a highway which
crosses some region of the state of Puebla. Another
example of application is the risk
zones near the Popocatépetl volcano; in this case it
would be important to know the characteristics of the
evacuation routes which would be used in situations of
volcanic activity: what are the soil characteristics of
the evacuation routes, and could they withstand the
atmospheric conditions and the passage of vehicles in an
emergency situation?
As we have seen, an important characteristic of spatial
data is that the attributes of the neighbors of a specific
object may have an influence on the object itself. Three
types of spatial relations will be taken into account for the
data mining tasks in the model: topological, orientation,
and distance relations.
We initially use the 4-intersection model to describe
the topological relations. In our future work we plan to
use the 9-intersection model for the topological relations.
In these models, the topological relations between two
objects A and B are defined in terms of the intersections
of object A’s interior (A°), boundary (∂A) and
exterior (A¯) with object B’s interior (B°),
boundary (∂B) and exterior (B¯); the 4-intersection
model uses only the interiors and boundaries, while the
9-intersection model also includes the exteriors.
The exterior of an object is represented by its
complement.
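As an illustration of how the nine intersections can be evaluated, the following Python sketch works on a coarse grid. It is a sketch under assumptions: regions are sets of integer cells, the interior of a region is taken to be the cells whose four neighbors also belong to the region, the boundary is the rest of the region, and the exterior is the complement within a finite universe. The function names and the grid sizes are hypothetical.

```python
def block(x0, y0, x1, y1):
    """Axis-aligned block of grid cells, corners included."""
    return {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}


def parts(region, universe):
    """Split a region into (interior, boundary, exterior) cell sets."""
    interior = {(x, y) for (x, y) in region
                if {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)} <= region}
    boundary = region - interior
    exterior = universe - region   # the complement plays the role of the exterior
    return interior, boundary, exterior


def nine_intersection(a, b, universe):
    """3x3 matrix; entry [i][j] is True when the intersection is non-empty.

    Rows are interior, boundary, exterior of a; columns are the same
    three parts of b.
    """
    return [[bool(pa & pb) for pb in parts(b, universe)]
            for pa in parts(a, universe)]


universe = block(-1, -1, 9, 9)
A = block(0, 0, 2, 2)
B = block(5, 0, 7, 2)   # disjoint from A

for row in nine_intersection(A, B, universe):
    print(row)
```

For two disjoint regions only the entries that involve an exterior come out non-empty, whatever the distance between the regions.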
The nine intersections are arranged as a matrix:
                      Object B
                      Int       Bou       Ext
Object A    Int       A°∩B°     A°∩∂B     A°∩B¯
            Bou       ∂A∩B°     ∂A∩∂B     ∂A∩B¯
            Ext       A¯∩B°     A¯∩∂B     A¯∩B¯
However, this model fails to distinguish certain
disjoint relations and also to identify the topological
relations between two entities with holes [3].
In figure 8 we present three different disjoint relations
between object A and object B; however, they have the
same 9-intersection matrix due to the infinite
complements of the objects. Their
intersection matrices show that the complements play no
role in distinguishing the disjoint relations.
Figure 8 Different disjoint relations with the same 9-intersection matrix [3] (cases i, ii and iii; rows Int, Bou, Ext of A against Int, Bou, Ext of B: [Ø Ø ¬Ø; Ø Ø ¬Ø; ¬Ø ¬Ø ¬Ø] in every case)
Chen et al. [3] proposed the Voronoi-based 9-intersection
model (V9I) as a modified version of the
point-set-based 9-intersection model to overcome these
limitations. The modification is made by replacing the
exterior of a spatial object (complement) with its Voronoi
region. The Voronoi region of an entity can be interpreted
as its region of influence, and is defined as
the area containing all locations closer to that entity than
to any other. The interaction model based on Voronoi
diagrams can be described as an extension of the 4- and
9-intersection topological models.
Using the V9I model it is possible to distinguish disjoint
relations, since each object has a limited number of
neighbors instead of having relations with all other
objects. In figure 9 we
present an example of the resulting matrices using the
9-intersection and the Voronoi 9-intersection models for two
disjoint objects. In the second case, exterior A – exterior B
is the only intersection that is not empty.
Figure 9 Distinguishing disjoint relations with V9I [3] (9-intersection: [Ø Ø ¬Ø; Ø Ø ¬Ø; ¬Ø ¬Ø ¬Ø]; Voronoi 9-intersection: [Ø Ø Ø; Ø Ø Ø; Ø Ø ¬Ø])
As we have mentioned, we initially use the 4-intersection
model to describe the topological relations,
and we plan in a second phase to use the 9-intersection
model. Once we have defined the basic graph-based
representation we will extend it to use the Voronoi-based
9-intersection.
In the graph-based model the spatial data will be
represented by vertices and edges. Vertices will be used
to represent the spatial objects and their attributes (data
describing the objects). The number of vertices of the
graph is determined by:
$$n + \sum_{i=1}^{n} \mathit{numAttributesPerObject}(n_i)$$
where n is the number of spatial objects included in
the dataset.
Edges will represent the spatial relations between two
particular objects (binary relations). The capability of
the model to represent the relations between these objects
will have a great impact on the results of the data mining
process. The world is described by objects and the
relations between them; we can regard the
relations as the elements describing how the
objects interact with each other. Since each attribute is
also attached to its object by an edge, the number of edges
of the graph is determined by:
$$\sum_{i=1}^{n} \mathit{numAttributesPerObject}(n_i) + \mathit{numRelationsAmongObjects}$$
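The vertex and edge counts can be checked with a short Python function. The function name graph_size and the relation count used in the example are assumptions for illustration; the ten objects with one attribute each correspond to the example discussed later in this section.

```python
def graph_size(attributes_per_object, num_relations):
    """Return (vertex_count, edge_count) for the proposed graph model.

    attributes_per_object: one attribute count per spatial object.
    num_relations: number of binary spatial relations among the objects.
    """
    n = len(attributes_per_object)
    total_attributes = sum(attributes_per_object)
    vertices = n + total_attributes            # objects plus attribute values
    edges = total_attributes + num_relations   # attribute edges plus relation edges
    return vertices, edges


# Ten objects with one attribute each; the relation count is made up.
print(graph_size([1] * 10, num_relations=10))  # (20, 20)
```

Both counts grow linearly with the number of attributes, so adding descriptive attributes to an object enlarges the graph without adding relation edges.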
The proposed model is shown in figure 10 where we
have two spatial objects which are connected through
topological, distance, and direction relations. This
knowledge representation has the potential to create
graphs using both spatial and non-spatial data and also the
spatial relations between the spatial objects.
Figure 10 Proposed schema (two spatial-object vertices joined by topological, distance and direction relation edges; each object has attribute edges leading to value vertices)
Combining spatial and non-spatial data in a dataset is
one of the strengths of the model. Some mining
approaches [22] apply data mining techniques first to the
non-spatial data and next to the spatial data, or in the
inverse order. We propose to apply data mining techniques
over datasets including both spatial and non-spatial data
as a whole.
Figure 11 shows an example of a database composed
of ten objects and their spatial relations. There are seven
houses (objects A, B, C, D, E, F, and G), a lake (object H),
a river (object J), and a road (object I). As we can see,
there are five houses near the lake; two of them touch
the boundary of the lake. The river touches the boundary
of the lake and the boundary of the road. Additionally,
there are two houses near the road, but not near the
lake like the other ones.
Looking at this figure, we see that most of the houses
are near the lake, and we may generalize this
distribution of the houses as a pattern (i.e., most houses
are located near a place where there is water). Non-spatial
analysis may answer questions about the
characteristics of the houses near the lake (e.g., whether
the houses are built using some special material, what their
safety restrictions are, or the type of soil on which they
were built). On the other hand, spatial analysis may answer
questions like where the clusters of houses are, and what
the distribution of the objects in the area of analysis is.
Figure 11 Spatial database representing objects of the
real world
By using the proposed graph-based model we generate
a graph like the one shown in figure 12. The vertices in
the graph represent either the spatial objects (i.e. house,
lake, road, and river) or the attributes describing the
objects (i.e. object’s name). Following the schema there
are twenty vertices in the graph; ten vertices represent the
spatial objects and the other ten represent their attributes.
Figure 12 Representing spatial data, non-spatial data
and spatial relations in graph format (vertices labeled object and A to J; edges labeled name, near, meet and touch)
The edges in the graph represent either the spatial
relations between the objects or the name of an attribute
of the object (i.e., name). For example, the spatial relation
between the object lake and the object road is represented
by the edge labeled “touch”, telling us that there is a
spatial relation between the objects and, more specifically,
that their boundaries are touching. The number of edges in
the graph will be the number of spatial relations between the
objects plus the number of attributes describing each
object.
Once we have created the graph, it will be used as
input for a graph-based data mining system (e.g., the
Subdue system). As we mentioned in section 5, the
Subdue system can work with datasets from any domain
that can be represented as a graph (graph-based data
representation), but it has not been tested with data from a
geographic database. Consequently, we propose to
analyze this system in order to determine its capability
to mine spatial and non-spatial data together; if required,
we can extend Subdue with the capabilities needed
to deal with geographic data.
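The construction of such an input graph can be sketched as follows. The object names and kinds come from the example above, but the assignment of relations to particular houses is hypothetical, and the line-oriented output ("v id label" for vertices, "e source target label" for edges) is an assumed format loosely modeled on plain vertex and edge lists; the Subdue documentation should be consulted for the exact input syntax.

```python
def build_gis_graph(objects, relations):
    """objects: iterable of object names; relations: [(name_a, name_b, label)].

    Each spatial object becomes a vertex labeled 'object' with a 'name'
    edge to a vertex holding its name; each relation becomes a labeled edge.
    """
    vertices = []   # vertex labels; the vertex id is the list index + 1
    edges = []      # (source_id, target_id, label)
    ids = {}
    for name in objects:
        vertices.append("object")
        ids[name] = len(vertices)
        vertices.append(name)
        edges.append((ids[name], len(vertices), "name"))
    for a, b, label in relations:
        edges.append((ids[a], ids[b], label))
    return vertices, edges


def to_text(vertices, edges):
    """Serialize the graph in the assumed 'v' / 'e' line format."""
    lines = [f"v {i} {label}" for i, label in enumerate(vertices, start=1)]
    lines += [f"e {s} {t} {label}" for s, t, label in edges]
    return "\n".join(lines)


# Houses A to G, lake H, road I, river J; the relation assignment is made up.
objects = list("ABCDEFGHIJ")
relations = [
    ("A", "H", "near"), ("B", "H", "near"), ("C", "H", "near"),
    ("D", "H", "meet"), ("E", "H", "meet"),
    ("F", "I", "near"), ("G", "I", "near"),
    ("J", "H", "meet"), ("J", "I", "meet"),
    ("H", "I", "touch"),
]

vertices, edges = build_gis_graph(objects, relations)
print(to_text(vertices, edges))
```

With ten objects, one attribute each and ten relations, the sketch yields twenty vertices and twenty edges, in line with the counts given by the model.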
7. Conclusions
In this paper we propose a graph-based data
representation for spatial and non-spatial data including
spatial relations between objects (topological, distance,
and direction relations). The model will enrich the spatial
data mining process because it allows the creation of
datasets composed of the three basic elements used in a
geographic representation: spatial objects, their
descriptive attributes, and the relations between them.
The 4-intersection model is used to describe the
topological relations. Our idea in a second phase is to
integrate the 9-intersection model, and extend it using the
Voronoi-based 9-intersection model.
We have already tried the Subdue system with non-spatial
data (using a spatial dominant approach [24][25]) and
we are now working to improve the results by applying
Subdue to datasets composed of spatial and non-spatial data.
As we mentioned, our proposal aims to provide
the capability to analyze geometric attributes of real
world elements in order to find behaviors and regularities.
Our methodology will include mechanisms for processing
geometric data in order to represent it in the graph
model, where data with traditional attributes and
geometric attributes are combined to describe the regular
behavior of elements of the real world.
9. Acknowledgement
Project 38257-H. Habitar y vivir. Análisis del espacio
habitacional de la ciudad de Puebla 1690-1890.
Universidad de las Américas Puebla’s excellence
scholarship.
8. References
[1] Adhikary, Junas. Knowledge Discovery in Spatial
Databases - Progress and Challenges. School of
Computing Science, Simon Fraser University. 1996.
[2] Aurenhammer, Franz. Voronoi diagrams – a survey of
a fundamental geometric data structure, ACM Computing
Surveys. 1991.
[3] Chen, Jun, Zhilin Li, Chengming Li, C. M. Gold.
Describing Topological Relations with Voronoi-based
9-Intersection Model. 1999.
[4] Chen, Ming-Syan, Jiawei Han, Philip S. Yu. Data
Mining: An overview from Database Perspective. 1996.
[5] Egenhofer Max J. A model for detailed binary
topological relationships. National Center for Geographic
Information and Analysis and Department of Surveying
Engineering. Department of Computer Science,
University of Maine. 1993.
[6] Egenhofer, Max J., J. R. Herring. Categorizing binary
topological relationships between regions, lines, and
points in geographic databases. Technical Report,
Department of Surveying Engineering, University of
Maine, Orono. 1991.
[7] Egenhofer, Max J., Robert D. Franzosa. On the
equivalence of topological relations. Research Article, Int.
J. Geographical Information Systems. 1995.
[8] Ester, Martin, Alexander Frommelt, Hans-Peter
Kriegel, Jörg Sander. Spatial Data Mining: Database
Primitives, Algorithms and Efficient DBMS Support.
Submitted to Special Issue on: “Integration of Data
Mining with Database Technology”, Data Mining and
Knowledge Discovery, an International Journal, Kluwer
Academic Publishers. 1999.
[9] Ester, Martin, Alexander Frommelt, Hans-Peter
Kriegel, Jörg Sander. Algorithms for Characterization and
Trend Detection in Spatial Databases. Proceedings of the
4th International Conference on Knowledge Discovery
and Data Mining (KDD 98), New York City, NY. 1998.
[10] Ester, Martin, Hans-Peter Kriegel, Jörg Sander.
Spatial Data Mining: A Database Approach. Proceedings
of the Fifth Int. Symposium on Large Spatial Databases
(SSD 97), Berlin, Germany, Lecture Notes in Computer
Science, Springer. 1997.
[11] Fayyad, Usama, G. Piatetsky-Shapiro, P. Smyth, R.
Uthurusamy, Eds. Advances in Knowledge Discovery
and Data Mining. AAAI/MIT Press, Menlo Park, CA.
1996.
[12] Fayyad, Usama, P. Smyth. Image Database
Exploration: Progress and Challenges. In Proceedings of
1993 Knowledge Discovery in Databases Workshop.
AAAI Press, Menlo Park, CA. 1993.
[13] Frawley, W. J., G. Piatetsky-Shapiro, C. J. Matheus.
Knowledge Discovery in Databases: An Overview. In
Piatetsky-Shapiro G., W. J. Frawley. Knowledge
Discovery in Databases, AAAI/MIT Press, Menlo Park.
1991.
[14] Han, Jiawei, Yandong Cai, Nick Cercone.
Knowledge Discovery in Databases: An attribute-oriented
approach. Proceedings of the 18th International
Conference on Very Large Databases (VLDB 92), British
Columbia, Canada. 1992.
[15] Holder, L. B., D. J. Cook, J. Gonzalez, and I. Jonyer.
Structural Pattern Recognition in Graphs, to appear in
Pattern Recognition and String Matching (D. Chen and X.
Cheng, eds.), Kluwer Academic Publishers, 2002.
[16] Kaufman, Leonard, Peter J. Rousseeuw. Finding
Groups in Data: An Introduction to Cluster Analysis,
John Wiley & Sons, Inc. 1990.
[17] Knorr, Edwin M., Raymond T. Ng. Finding
Aggregate Proximity Relationships and Commonalities in
Spatial Data Mining. IEEE Trans. Knowledge and Data
Engineering. 1996.
[18] Kolatch, Erica. Clustering Algorithms for Spatial
Databases: A Survey. Department of Computer Science,
University of Maryland, College Park. 2001.
[19] Koperski, Krzysztof, Jiawei Han. Discovery of
Spatial Association Rules in Geographic Information
Databases. Proceedings of the 4th International
Symposium on Spatial Databases (SSD 95),
Springer-Verlag, Berlin. 1995.
[20] Koperski, Krzysztof, Jiawei Han, Nebojsa
Stefanovic. An efficient Two-Step Method for
Classification of Spatial Data. Proceedings of the
Symposium on Spatial Data Handling (SDH 98),
Vancouver, Canada. 1998.
[21] Laurini, Robert, D. Thompson. Fundamentals of
Spatial Information Systems, Academic Press. 1992.
[22] Lu, Wei, Jiawei Han, Beng Chin Ooi. Discovery of
General Knowledge in Large Spatial Databases.
Proceedings of Far East Workshop on Geographic
Information Systems, Singapore. 1993.
[23] Ng, Raymond T., Jiawei Han. Efficient and Effective
Clustering Methods for Spatial Data Mining. Proceedings
of the 20th Very Large Databases Conference (VLDB 94),
Santiago, Chile. 1994.
[24] Pech Palacio, Manuel. Thesis submitted for the degree
of Master of Science with Specialty in Computer
Systems. Department of Computer Systems
Engineering. Universidad de las Américas,
Puebla. May 2002.
[25] Pech Palacio, Manuel, David Sol, Jesús González.
Adaptation and Use of Spatial and Non-Spatial Data
Mining. Proceedings of GEOPRO 2002, Instituto Politécnico
Nacional. 2002.
[26] Sheikholeslami, Gholamhosein, Surojit Chatterjee
and Aidong Zhang. WaveCluster: A Multi-Resolution
Clustering Approach for Very Large Spatial Databases.
Proceedings of the 24th Very Large Databases Conference
(VLDB 98), New York, NY. 1998.
[27] Subdue System. University of Texas at Arlington.
Internet site, last visited February 2003.
http://ailab.uta.edu/subdue/.
[28] Zhang, Tian, Raghu Ramakrishnan, Miron Livny.
BIRCH: An Efficient Data Clustering Method for Very
Large Databases. Proceedings of the 1996 ACM
SIGMOD International Conference on Management of
Data, Montreal, Canada. 1996.