3 Spatial Data Modelling


Spatial Data Modelling: Entity definition, Raster data structures, Vector data structures,
Modelling surfaces, Modelling networks, Layer-based approach


Raster and Vector Data Structures
Like any complex data structure, raster and vector data come in many realizations that vary in
complexity, use, appearance, format, and file size. Although distinctly different, the two
data structures share two characteristics: (1) they visually represent real-world features, and
(2) they are positioned and oriented within the real world. Data that satisfy both characteristics
are geographic data and can interoperate with other geographic data sources within a GIS.
Raster data structures characterize continuous data (such as imagery) and are exceptionally strong
where boundaries and point information are not well defined. Raster data are organized as a pixel grid,
whereby each pixel or cell is a feature capable of retaining properties and attributes. These pixels
approximate pictures and images in an impressionistic way, with each small, single-theme cell
contributing to a greater whole. A raster image can further vary in file format, color
representation, resolution (pixel size, or number of pixels per unit area), and potential properties.
Vector data structures, by contrast, characterize discrete data (such as roads, pipelines, and
topographic features) and are exceptionally strong where distinct boundaries and point information are
well defined. This data structure is built on ordered two- and three-dimensional coordinates
([x,y] and [x,y,z], respectively). Features are represented as geometric shapes defined through single or
grouped coordinates on a set grid.
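
The contrast between the two structures can be made concrete with a minimal sketch. The cell values, feature names, and dictionary layout below are illustrative assumptions, not a format used by any particular GIS; the point is only that a raster is addressed by row and column, while a vector feature is a set of ordered coordinates with attributes attached.

```python
# Raster: a grid of cells, each holding one value.
# The values are hypothetical land-cover codes chosen for illustration.
raster = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 3, 3, 2],
]
n_rows = len(raster)
n_cols = len(raster[0])

# A cell is addressed by (row, column), not by a real-world coordinate.
value_at_r1_c2 = raster[1][2]

# Vector: each feature is built from ordered coordinates plus attributes.
# A point feature with a single [x, y] coordinate:
well = {"geometry": (3.0, 4.5), "attributes": {"name": "Well A"}}

# A line feature with two [x, y, z] coordinates:
pipeline = {
    "geometry": [(0.0, 0.0, 10.0), (5.0, 2.0, 12.5)],
    "attributes": {"name": "Pipe 1"},
}
```

In the raster, every cell exists whether or not anything of interest is there; in the vector structure, only the features themselves are stored.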

Raster versus vector data representation


Raster data offer a truly simple data structure: a grid of rows and columns.
This simple grid structure allows for easy analysis of a single raster image, as well as analysis among
multiple images. Raster modeling is also much easier to implement due to the single-value cell
structure and relatively simple software programming. These raster advantages are
overshadowed only by the raster model's native weaknesses. Disadvantages of raster data include general spatial
inaccuracies and misrepresentations, low resolution, and massive data sets that require significant
processing capability. The lack of accurate topology is also a major raster-based limitation. Similarly,
vector data offer their own modeling strengths. Vector data are spatially
accurate and support a higher resolution than the raster data model. The ability to represent
topology, or feature relationships, is a definite advantage, as are the minimal data storage
requirements.

S4 GIS Univ Kerala

Although seemingly ideal, vector data have weaknesses. Because of the complex data structure,
vector data require greater processing capability. With this comes the need for
better, faster workstations to keep data processing times below what typical computers manage. As a result,
the cost of running a vector GIS and promptly processing complex geographic data sets can become
high.
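
The ease of raster analysis among multiple images, noted earlier in this section, comes from the shared grid: two rasters with the same rows and columns can be combined cell by cell. The sketch below assumes two small hypothetical mask rasters (1 = condition met, 0 = not met); the function name `overlay` and the sample layers are illustrative only.

```python
def overlay(raster_a, raster_b, combine):
    """Combine two rasters of identical shape, one cell at a time."""
    same_rows = len(raster_a) == len(raster_b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(raster_a, raster_b))
    if not (same_rows and same_cols):
        raise ValueError("rasters must have the same number of rows and columns")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]

# Hypothetical single-theme layers on the same 2 x 2 grid.
wet = [[1, 0], [1, 1]]
flat = [[1, 1], [0, 1]]

# A cell qualifies only where both conditions hold.
wet_and_flat = overlay(wet, flat, lambda a, b: a * b)
```

Because each cell holds a single value, the whole analysis reduces to one nested loop, which is the "relatively simple software programming" the raster model allows.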


Vector Feature Geometry


It is often necessary to depict real objects as features on a map and to designate object positions within a
GIS. These features can range from the simple (linear transmission lines) to the complex (a
multibranched, nonlinear river). To detail objects as features in a GIS, we must choose the best data
model. In most cases, features are defined using the vector data model. Given the nature of raster data,
raster features are ambiguous in shape and general in position. For applications where specific form or
position is not needed, raster may prove easiest. Possible applications where raster data can be
used to define features include nonprecise maps, such as visitor maps of an amusement park, and
macroscopic markers on an overview map, such as a colored box for approximate position. Alternatively,
given the accurate, positional nature of vector data, features are best represented by coordinates and
geometry. Real-world objects can be represented as individual geometric shapes, or groups of shapes, called
feature geometries. In any geospatial platform, there are three primary types of feature geometries:
points, lines, and polygons. As a subset of these three primary types there exists a fourth geometric
feature called a polyline.
A point is an individual position defined as a vector x-y-z coordinate or as one raster pixel. A line is
two connected points with two distinct coordinates in the vector model, or a linear block of pixels in the raster
model. A polygon is a group of vector coordinates connected in sequence, or a group of
pixels forming the object's general shape. Lastly, the subset polyline (known also as an arc) is a
connected string of vector points or raster pixels. Often a polyline involves two lines sharing the same
point or pixel (vertex). Vector features are constructed on ordered sets of vertices, whereby these
vertices reside in the design plane at known x, y, z locations. The positions of these
vertices are recorded in the vector file, often in a binary encoding. The vertices are objects and can be
grouped to form features that exist as independent entities. This is in contrast to the raster model, where
the entire image is an object. In raster, features must first be sampled and then represented as image
pixels, which only visually approximate the features. Raster images have no
independent vector coordinates, the actual building blocks of feature-driven spatial information.
Vector data for feature geometry comprise positional coordinates and, for some geometries, other position-defining
data, such as inside/outside and left/right.
The point depicts a discrete position in space, defined by an individual geospatial coordinate. The line
is defined by two geospatial coordinates. The third open-ended feature is the polyline, which in its simplest
form comprises two vector lines sharing a common point (vertex). Interestingly enough, polylines are defined not only by
geospatial coordinates, but also by left and right characteristics that indicate whether other features lie
directly left or right of the polyline. Growing in complexity, the remaining two geospatial feature
geometries are closed features and depict specific areas. The polygon involves numerous vector points
that are connected in sequence. Polygons are defined with inside and outside characteristics that
indicate whether any other geospatial feature geometries lie inside the polygon's boundary.
The most complex of the geospatial feature geometries is the doughnut, or polygonal hole. A doughnut
comprises a polygon within another polygon. Polygon 1 forms the boundary area and polygon 2 serves
as the cutout within polygon 1. Think of a doughnut, whereby polygon 1 is the outermost doughnut
boundary and polygon 2 is the cutout in the doughnut center (doughnut hole). Each polygon carries its own
inside and outside characteristics. To uniformly structure, manage, and
manipulate feature geometries like the ones just discussed, a GIS stores vector data in geospatial
feature file formats.
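
The inside/outside characteristic of polygons and doughnuts can be sketched as a point-in-polygon test. The version below uses the common ray-casting technique on plain lists of (x, y) vertices; this is one standard way to answer the question, not necessarily how any given GIS implements it, and the square rings are made-up sample data.

```python
def point_in_ring(point, ring):
    """Ray-casting test: is the point inside the closed ring of (x, y) vertices?"""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x position where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside

def point_in_doughnut(point, outer, hole):
    """Inside the doughnut means inside polygon 1 but outside polygon 2."""
    return point_in_ring(point, outer) and not point_in_ring(point, hole)

outer = [(0, 0), (10, 0), (10, 10), (0, 10)]  # polygon 1: boundary
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]       # polygon 2: cutout

in_body = point_in_doughnut((2, 2), outer, hole)
in_hole = point_in_doughnut((5, 5), outer, hole)
outside = point_in_doughnut((12, 5), outer, hole)
```

Here `in_body` is true, while `in_hole` and `outside` are false: a position in the cutout is inside polygon 1 but is excluded because it is also inside polygon 2.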


Common feature file formats, such as the shapefile developed by Environmental Systems Research
Institute (ESRI) or the TAB file developed by MapInfo, are basic geometric containers compatible with
an overwhelming majority of GISs. These geographic dataset types store nontopological vector (or
coordinate) geometry in the form of real-world spatial features, as well as links to attribute information
for these respective objects.
The Esri shapefile, or simply a shapefile, is a popular geospatial vector data format for geographic
information system software. It is developed and regulated by Esri as a (mostly) open specification for
data interoperability among Esri and other GIS software products. Shapefiles spatially describe vector
features: points, lines, and polygons, representing, for example, water wells, rivers, and lakes. Each
item usually has attributes that describe it, such as name or temperature. A shapefile is a digital vector
storage format for storing geometric location and associated attribute information. This format lacks the
capacity to store topological information. The shapefile format was introduced with ArcView GIS
version 2 in the early 1990s. It is now possible to read and write shapefiles using a variety of free and
paid programs.
Shapefiles are simple because they store only the primitive geometric data types of points, lines, and
polygons. These shapes are of limited use without attributes to specify what they represent. Therefore, a
table of records stores properties/attributes for each primitive shape in the shapefile. Shapes
(points/lines/polygons) together with data attributes can create a great many representations of
geographic data, and these representations in turn support powerful and accurate computations.
While the term "shapefile" is quite common, a "shapefile" is actually a set of several files. Three
individual files are mandatory to store the core data that comprise a shapefile: .shp, .shx, and .dbf. The
actual shapefile relates specifically to .shp files but alone is incomplete for distribution, as the other
supporting files are required.
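
Since a shapefile is only complete when all three mandatory files are present, a quick completeness check is easy to sketch. The function name `missing_components` and the sample file names are hypothetical; the check works on a list of names and does not read any file contents.

```python
from pathlib import PurePath

# The three mandatory component files named above.
MANDATORY_EXTENSIONS = {".shp", ".shx", ".dbf"}

def missing_components(filenames, base_name):
    """Return the mandatory extensions not found for the given base name."""
    present = {
        PurePath(name).suffix.lower()
        for name in filenames
        if PurePath(name).stem == base_name
    }
    return sorted(MANDATORY_EXTENSIONS - present)

# Hypothetical directory listing.
files = ["wells.shp", "wells.dbf", "wells.prj", "rivers.shp"]

wells_missing = missing_components(files, "wells")
rivers_missing = missing_components(files, "rivers")
```

With this listing, `wells` lacks its `.shx` file and `rivers` lacks both `.dbf` and `.shx`, so neither data set is ready for distribution.
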
Coverage
The coverage format was designed by ESRI for ArcInfo. It is an implementation of the vector topological
data model and a closed file format. Each coverage is a directory containing numerous files that store
feature geometry, projection, registration, etc. Attribute data are stored in a separate INFO directory,
which holds all attribute data for all coverages in its parent directory.


Raster Image Structures


Images such as aerial photographs and base maps are, by nature, a continuous array of data.
As discussed, the raster model provides the best solution for continuous data. Because images lack
specified feature boundaries, point locations, and multiple distinct objects, vector data are not intrinsically fit to
handle image information. A vector data representation of such an image produces a complex and
often massive structure. The raster format is therefore much more desirable for
applications requiring the use of photographs or digital scans. Raster imagery captures highly
complex geometries quite easily with the use of attributed image pixels. The essence and value of raster
image structures are that raster represents real-world information with native visual properties, a feat
vector data cannot reproduce.
Because raster images are physically continuous by nature, a reasonably accurate transformation
process can be applied to fit raster data onto the real world. In brief, images are rasterized, or digitally
transformed to raster data, through a matrix of pixels. Photographs are typically scanned at a set
image resolution defined in pixels per inch (ppi), more commonly known as dots per inch (dpi). With
the use of a GIS, this transformation into a raster image can enable users to characterize
existing geographic data material that previously had no attributes. In fact, solutions produced from this
process result in alternative, often unexplored, geographic data sets. There are times when an existing
raster image is not at the desired resolution for the present application and an image adjustment is
necessary. Manipulating resolution within raster images is a technique to control and
manage the amount of raster data to be processed and the overall image quality. As mentioned earlier,
raster data resolution is the result of the number of row and column pixels used to define an image.
Raster image dataset size is directly proportional to the number of raster pixels used to define an image.
The rule of thumb is: the more pixels in the grid, the higher the image resolution, quality, and dataset
size; the fewer pixels in the grid area, the lower the image resolution, quality, and dataset size.
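
The rule of thumb can be checked with simple arithmetic. The sketch below assumes a hypothetical 10 x 8 inch photograph and an uncompressed image with one byte per pixel; real file sizes depend on the format, color depth, and compression.

```python
def scan_pixels(width_in, height_in, ppi):
    """Rows, columns, and total pixels for a scan at the given pixels per inch."""
    cols = round(width_in * ppi)
    rows = round(height_in * ppi)
    return rows, cols, rows * cols

def raw_size_bytes(rows, cols, bytes_per_pixel=1):
    """Uncompressed size: directly proportional to the number of pixels."""
    return rows * cols * bytes_per_pixel

# The same hypothetical photograph scanned at two resolutions.
rows_hi, cols_hi, pixels_hi = scan_pixels(10, 8, 300)
rows_lo, cols_lo, pixels_lo = scan_pixels(10, 8, 150)

size_ratio = raw_size_bytes(rows_hi, cols_hi) / raw_size_bytes(rows_lo, cols_lo)
```

At 300 ppi the scan has 2400 rows and 3000 columns (7,200,000 pixels); at 150 ppi it has 1200 rows and 1500 columns (1,800,000 pixels). Halving the resolution in each direction therefore quarters the pixel count and the raw dataset size.
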
Modifying an image's resolution produces varying results within the image. You can easily transform a
high resolution raster into a low resolution raster without badly degrading image quality. The image size
often remains unchanged through an intelligent interpolation that transforms a grouping of pixels into
one pixel. However, it is not always easy to reverse this transformation. Converting a low resolution raster into a
high resolution form is not always feasible. To retain moderate image quality, the high resolution raster
must take on a reduced, fractional size relative to the original low resolution raster. If image size remains
unchanged, the new high resolution image will be relatively unusable and indistinguishable in content.
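
The one-way nature of this transformation can be illustrated with a small sketch. It assumes block averaging as the interpolation that merges a group of pixels into one, and nearest-neighbour repetition for the reverse step; both are common choices picked here for illustration, and the 4 x 4 grid is made-up data.

```python
def downsample(raster, factor):
    """Average each factor x factor block of cells into a single cell."""
    rows, cols = len(raster), len(raster[0])
    if rows % factor or cols % factor:
        raise ValueError("raster dimensions must be divisible by factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                raster[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def upsample(raster, factor):
    """Nearest neighbour: repeat each cell factor times in both directions."""
    out = []
    for row in raster:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

fine = [
    [0, 4, 8, 8],
    [4, 8, 8, 8],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]
coarse = downsample(fine, 2)      # 4 x 4 grid reduced to 2 x 2
restored = upsample(coarse, 2)    # back to 4 x 4, but detail is not recovered
```

The top-left block of `fine` (0, 4, 4, 8) averages to 4.0, and upsampling can only repeat that 4.0 four times. The variation inside the block is gone, which is why going from low to high resolution cannot restore the original content.
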
Care should always be taken when modifying the resolution of raster images. Geographic
information obtained from these kinds of transformations may be of critical relevance to future
projects and may unexpectedly provide solutions to problems not yet known. Many times these unforeseen
issues involve complications arising from the presence of precious natural resources on a project.
The results of these transformations are a fundamental building block for
geospatial systems and open the way to other advanced GIS techniques.
