Lecture 3 Data Entry and Preparation
A. Arko-Adjei
Department of Geomatic Engineering
KNUST, Kumasi, Ghana
March 2013
Course content
• Introduction to GIS
• Spatial data types and representation
• Data input and methods of data capture
• Spatial referencing
• Fundamentals of remote sensing
• Sensors and platforms
• Image data characteristics and image interpretation
• Remote sensing applications
2
Lecture overview
• Spatial data input
• Examples of existing data
• Spatial data formats
• Digitizing
• Data preparation
• Clearinghouse
3
Spatial Data Input
4
Spatial Data Input
Direct spatial data capture
•Direct observation of the relevant geographic phenomena.
•This can be done through
• ground-based field surveys (GPS, theodolites, tapes, etc.)
• remote sensing
• photogrammetry
5
Spatial Data Input
Direct spatial data capture
•Ground-based techniques remain the most important source
for reliable data in many cases.
•Data that is captured directly from the environment is
known as primary data.
•The core concern with primary data is to know its properties:
the process by which it was captured, the parameters of any
instruments used, and the rigor with which quality
requirements were observed.
6
Spatial Data Input
• Remotely sensed imagery is usually not fit for immediate use,
as various sources of error and distortion may have been
present; the imagery should first be corrected for these errors.
• In practice, it is not always feasible to obtain spatial data by
direct spatial data capture.
• Cost and available time may be a hindrance, or previous
projects may already have acquired data that fits the
current project’s purpose.
• The increasing availability of geographic data, and the
growing pressure on organisations to perform more
efficiently, have made it common practice to use data from
existing sources, such as existing maps and digital data sets.
7
Indirect spatial data capture
• Spatial data can also be sourced indirectly
• This type of data is known as secondary data
• Sources of secondary data
• data derived from existing paper maps through scanning
• data digitized from a satellite image
• processed data purchased from data capture firms or
international agencies, and so on.
8
Examples of existing data sets
9
Spatial data formats
10
Spatial data formats
11
Digitising
• A traditional method of obtaining spatial data is through
digitizing existing paper maps.
• Digitising is the conversion of an analogue map into a
digital vector map
• It is a cost-effective method of data capture
• Before adopting this approach, one must be aware that
positional errors already in the paper map will further
accumulate, and one must be willing to accept these errors.
• A number of digitising techniques exist
12
Digitising process
Input
document
Vectorization
Process
Spatial
Database
13
On-tablet manual Digitising
Digitising tablet
14
On-screen manual Digitising
Scanned image
Cursor
15
On-screen and on-tablet digitizing
• In both approaches, an operator follows the map’s features
(mostly lines) with a mouse device, thereby tracing the
lines, and storing location coordinates relative to a number
of previously defined control points.
• The function of these points is to ‘lock’ a coordinate
system onto the digitized data:
• Control points on the map have known coordinates, and by
digitizing them we tell the system implicitly where all
other digitized locations are.
• At least three control points are needed, but preferably
more should be digitized to allow a check on the positional
errors made.
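The idea of ‘locking’ a coordinate system onto digitized data can be sketched as a least-squares affine fit. This is only an illustration, assuming an affine model (X = a·x + b·y + c, Y = d·x + e·y + f); the function names and the control point values are hypothetical and not taken from any particular GIS package. With more than three control points, the residuals give the check on positional errors mentioned above.

```python
def solve_3x3(m, v):
    """Solve the 3x3 system m * p = v by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def fit_affine(tablet_pts, map_pts):
    """Least-squares affine transform from digitizer to map coordinates."""
    if len(tablet_pts) < 3:
        raise ValueError("at least three control points are needed")
    rows = [(x, y, 1.0) for x, y in tablet_pts]
    # Normal equations: (A^T A) p = A^T b, solved once for X and once for Y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):  # k = 0 fits X, k = 1 fits Y
        atb = [sum(r[i] * m[k] for r, m in zip(rows, map_pts)) for i in range(3)]
        params.append(solve_3x3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, pt):
    """Transform one digitized point into map coordinates."""
    (a, b, c), (d, e, f) = params
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


def residuals(params, tablet_pts, map_pts):
    """Distance between each transformed control point and its known position."""
    out = []
    for t, m in zip(tablet_pts, map_pts):
        X, Y = apply_affine(params, t)
        out.append(((X - m[0]) ** 2 + (Y - m[1]) ** 2) ** 0.5)
    return out


# Hypothetical control points: digitizer units and known map coordinates
tablet = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
ground = [(1000.0, 2000.0), (1500.0, 2000.0), (1500.0, 2500.0), (1000.0, 2500.0)]

params = fit_affine(tablet, ground)
print(apply_affine(params, (5.0, 5.0)))
print(max(residuals(params, tablet, ground)))
```

Three control points determine the six parameters exactly, so the residuals are then zero by construction; only a fourth or further point reveals digitizing errors.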
16
Map registration using control points
17
On-screen versus Manual Digitising
18
Digitising of features
Points
Lines/Arcs
  ID   X,Y coordinates
  1    2,4: 3,3: 4,2: 5,2: 6,1: 7,2.5
  2    1,4: 2,3: 3,2: 4,1
Polygons
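As a small illustration, the coordinate strings of the two digitized lines on this slide can be read into lists of vertices. The helper names `parse_line` and `length` are hypothetical, and the coordinates are treated as plain planar units.

```python
import math


def parse_line(text):
    """Turn 'x,y: x,y: ...' into a list of (x, y) float tuples."""
    points = []
    for pair in text.split(":"):
        x, y = pair.strip().split(",")
        points.append((float(x), float(y)))
    return points


def length(points):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# The two lines/arcs listed on the slide, keyed by their ID
lines = {
    1: parse_line("2,4 : 3,3: 4,2: 5,2: 6,1: 7,2.5"),
    2: parse_line("1,4: 2,3: 3,2: 4,1"),
}

for line_id, pts in lines.items():
    print(line_id, len(pts), round(length(pts), 3))
```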
Copyright: A. Arko-Adjei 19
Semi-automatic and automatic digitizing
• Another set of techniques also works from a scanned
image of the original map
• These techniques use the GIS to find features in the image
• These techniques are known as semi-automatic or
automatic digitizing, depending on how much operator
interaction is required.
• This procedure is less labour-intensive, but can only be
applied on relatively simple sources.
• If vector data is to be derived from this procedure, a
process known as vectorization follows the scanning
process.
20
Scanning processes
• The basic principle is a light source which illuminates
the document, and a sensor which measures the
intensity of the reflected or transmitted light
• Scanners are of various types with various resolutions
• The minimum required resolution depends on the details
in the map and the digitising technique
• 200-300 dpi for manual on-screen map digitising
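To see what a scanning resolution means in practice, the size of one pixel can be worked out on the map sheet and on the ground. This is a sketch only; the 1:50,000 map scale is an assumed example and the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def pixel_size_on_map_mm(dpi):
    """Width of one scanned pixel on the paper map, in millimetres."""
    return MM_PER_INCH / dpi


def pixel_size_on_ground_m(dpi, scale_denominator):
    """Width of one scanned pixel on the ground, in metres, for a 1:scale map."""
    return pixel_size_on_map_mm(dpi) * scale_denominator / 1000.0


# Assumed example: a 1:50,000 map sheet scanned at 200 and 300 dpi
for dpi in (200, 300):
    print(dpi,
          round(pixel_size_on_map_mm(dpi), 3),
          round(pixel_size_on_ground_m(dpi, 50000), 2))
```

At 200 dpi a pixel is 0.127 mm wide on paper, so fine line work thinner than this cannot be resolved reliably.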
21
Scanner Output
• The scanner output is only a digital copy of the
source document resolved into a matrix of cells
(pixels)
• Data are not structured into classified and coded
objects
• To obtain this, the data have to be vectorized and
further structured
22
Vectorization
• Vectorization is the conversion of raster data to vector data
• The process converts the pixel values of the scanned
document to points, lines and polygons with attributes
equivalent to the pixel values
• The data has to be structured after vectorization
• Splitting lines to form line segments and nodes
• Joining line segments to form objects
• Object coding
23
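A very reduced sketch of raster-to-vector conversion is line following: the ‘on’ pixels of a scanned line are ordered into a list of vertices. The example assumes a single, open, one-pixel-wide line without branches; real vectorization software must also handle thinning, branching and noise. The function name `trace_line` and the tiny grid are hypothetical.

```python
def trace_line(grid):
    """Order the '#' cells of a thin, non-branching line into (x, y) vertices.

    x is the column index and y is the row index of each pixel.
    """
    cells = {(r, c)
             for r, row in enumerate(grid)
             for c, ch in enumerate(row) if ch == "#"}

    def neighbours(cell):
        # 8-connected neighbours that are also part of the line
        r, c = cell
        return [(r + dr, c + dc)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0) and (r + dr, c + dc) in cells]

    # End pixels of an open line have exactly one neighbour
    ends = sorted(cell for cell in cells if len(neighbours(cell)) == 1)
    if len(ends) != 2:
        raise ValueError("expected a single open, non-branching line")

    path, visited = [ends[0]], {ends[0]}
    while True:
        candidates = [n for n in neighbours(path[-1]) if n not in visited]
        if not candidates:
            break
        path.append(candidates[0])
        visited.add(candidates[0])

    if len(path) != len(cells):
        raise ValueError("line could not be traced in one pass")
    return [(c, r) for r, c in path]


# Tiny stand-in for a scanned, thresholded image
grid = [
    "....",
    "#...",
    ".#..",
    "..##",
]
print(trace_line(grid))
```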
Selecting a digitising technique
• The technique to be preferred depends on
• quality of the map sheet
• contents of the map sheet
24
Selecting a digitising technique
25
Digitising errors
• Some common digitizing errors that occur in GIS
include
• undershoots: a line stops short of the feature it
should connect to, for example leaving a polygon unclosed
• overshoots: a line goes beyond the entity it was
supposed to connect to.
• rubber sheeting: an adjustment based on the movement
of known control points to new locations, for example
when comparing accurate ground survey data with
aerial survey data
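One simple way to look for undershoots and overshoots is to list dangling nodes, that is, line endpoints that no other line end shares. The sketch below assumes each line is a list of (x, y) vertices; the function name `dangling_nodes` and the sample square are hypothetical. Note that it also flags legitimate dead ends, so the result is a list of candidates for the operator to check.

```python
from collections import Counter


def dangling_nodes(lines, ndigits=6):
    """Return endpoints that are used by exactly one line end."""
    counts = Counter()
    for line in lines:
        for x, y in (line[0], line[-1]):
            counts[(round(x, ndigits), round(y, ndigits))] += 1
    return sorted(node for node, n in counts.items() if n == 1)


# Hypothetical digitized square whose last side stops short of the start point
lines = [
    [(0.0, 0.0), (4.0, 0.0)],
    [(4.0, 0.0), (4.0, 3.0)],
    [(4.0, 3.0), (0.0, 3.0)],
    [(0.0, 3.0), (0.0, 0.2)],  # undershoot: should end at (0.0, 0.0)
]
print(dangling_nodes(lines))
```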
Digitising errors
• Some common digitizing errors that occur in GIS
include
• Edge matching: distorted edges between maps digitized
at different times or with different coordinate systems
fail to line up. Causes include changes in humidity,
inherent digitizer tablet inaccuracies, and missing or
overlapping map coverage
Data preparation
• Spatial data preparation consists of editing data that is
to be entered into the GIS database
• Vector data may require a lot of time-consuming
editing, such as the trimming of overshoots of lines at
intersections, deleting duplicate lines, closing gaps in
lines, and generating polygons
• Data may need to be vectorized or rasterized to match
existing data sets
• Additionally, processing includes associating attribute
data with objects through either manual input or
reading digital attribute files into the GIS
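Associating attribute data with digitized objects by reading a digital attribute file can be sketched as a join on a shared ID. The feature IDs, column names and attribute values below are invented for illustration, and an in-memory string stands in for a real file.

```python
import csv
import io

# Hypothetical digitized objects, keyed by the ID assigned during digitizing
features = {
    1: {"geometry": [(2, 4), (3, 3), (4, 2)]},
    2: {"geometry": [(1, 4), (2, 3), (3, 2)]},
}

# Stand-in for a digital attribute file (invented content)
attribute_file = io.StringIO(
    "id,name,class\n"
    "1,Main Road,road\n"
    "2,Stream A,river\n"
    "3,Unused,road\n"
)


def join_attributes(features, handle, key="id"):
    """Attach each attribute row to the feature with the same ID.

    Returns the IDs found in the file that match no digitized feature.
    """
    unmatched = []
    for row in csv.DictReader(handle):
        fid = int(row.pop(key))
        if fid in features:
            features[fid]["attributes"] = row
        else:
            unmatched.append(fid)
    return unmatched


unmatched = join_attributes(features, attribute_file)
print(features[1]["attributes"])
print(unmatched)
```

Reporting unmatched IDs is a deliberate choice here: rows without a matching object usually point to a missed feature or a coding error that needs editing.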
28
Clearinghouse
• A clearinghouse is a distributed set of computer servers
that connects, over a network, the systems that produce,
manage and use spatial data.
• The availability of metadata (data descriptions) allows
users to determine what spatial data exists, where to find
the data they need, evaluate the usefulness of the data for
their applications, and obtain or order the data as
economically as possible.
• Each data provider describes available data and provides
these metadata over the network. Besides metadata, the
data provider also offers access to the geographic data itself.
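How metadata lets a user find out what data exists can be sketched as a search over metadata records by keyword and area of interest. The records, provider names and bounding boxes below are invented, and the record layout is an assumption rather than any particular metadata standard.

```python
# Hypothetical metadata records; bbox = (min_x, min_y, max_x, max_y) in degrees
catalogue = [
    {"title": "Road network", "provider": "Provider A",
     "keywords": {"roads", "transport"}, "bbox": (-2.0, 6.0, -1.0, 7.0)},
    {"title": "Land cover", "provider": "Provider B",
     "keywords": {"land cover", "vegetation"}, "bbox": (-3.5, 4.5, 1.5, 11.5)},
    {"title": "Coastal roads", "provider": "Provider C",
     "keywords": {"roads"}, "bbox": (-1.0, 4.5, 1.0, 5.5)},
]


def bbox_intersects(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalogue, keyword, area):
    """Titles of records that carry the keyword and cover part of the area."""
    return [rec["title"] for rec in catalogue
            if keyword in rec["keywords"] and bbox_intersects(rec["bbox"], area)]


# Assumed area of interest
area_of_interest = (-1.8, 6.5, -1.5, 6.9)
print(search(catalogue, "roads", area_of_interest))
```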
29