Unit 3 Attribute Validation


UNIT 3: ATTRIBUTE VALIDATION: DOMAINS, SUBTYPES AND RELATIONSHIP CLASSES

Introduction and Learning Objectives

The Geodatabase integrates the vast flexibility of the ArcInfo coverage and
incorporates a variety of new features, which ultimately make geospatial editing less
time-consuming and more intuitive. An adequate control of the integrity of data is very important
for the design of a database as well as for performing analyses.

This chapter is one of the most important in this course, as the student must learn the main
attribute validation rules in order to assign behaviour to the features. The tools to be used in this
chapter, ArcGIS Subtypes, Domains and Relationship Classes, are all oriented to achieve this goal.

Learning Outcomes

Upon successful completion of this didactic unit the student will:

- Understand Geodatabase behaviour and its advantages.
- Recognize the benefits of editing with validation rules and achieve efficient editing.
- Create and use Subtypes and Domains.
- Know some applied examples of the use of these tools.

Relevance to the overall course and relation to other topics covered


In this unit the student will learn the main procedures of attribute validation: Domains, Subtypes
and Relationship Classes. The definition of topology rules is covered in the next chapter. This will
enable the student to grant behaviour to features as a first approach to the definition of a data
model.

The Geodatabase can enforce the integrity of attributes through Domains and validation rules.
Indicating the relationships between objects and assigning values to predetermined datasets
ensures that the information in the Geodatabase is as accurate as possible and that the deletion
of data in one part of the Geodatabase does not have a negative effect on data in another part
(Viljoen et al., 2006).

The student will be able to see these advantages firsthand while editing features with a defined
behavior.

Attribute quality is improved through the use of Subtypes in combination with Domains and
Relationship Classes that access records from related tables during feature capture (Stanton et al.,
2005).
BASIC CONCEPTS AND DEFINITIONS
• Domain: Set of valid values for an attribute; it can be a list of coded values or a valid range.
• Subtype: Subtypes are a subset of features in a feature class, or objects in a table, that
share the same attributes. They are used as a method to categorize your data. Subtypes are
implemented by creating coded values and, therefore, must be associated with fields of
the data type short or long integer.
• Relate: Relating tables simply defines a relationship between two tables. The associated
data isn't appended to the layer's attribute table like it is with a join. Instead, you can access
the related data when you work with the layer's attributes.
• Join: Tool that joins a table or layer to a layer (or a table to a table) based on a common
field. The records in the input layer or table view are matched to the records in the join table
view when the values of the Input Field and the join field are equal. The join is
temporary (as is the layer) and only lasts for the duration of the session.
• Relationship Class: Relationship Classes manage the associations between objects in one
class and objects in another. Objects at either end of the relationship can be features with
geometry or records in a table.

Concepts about Attribute Validation

Subtypes: according to the GIS dictionary from ESRI Support “In Geodatabases, a subset of
features in a feature class or objects in a table that share the same attributes. For example, the streets
in a streets feature class could be categorized into three Subtypes: local streets, collector streets, and
arterial streets.
Creating Subtypes can be more efficient than creating many feature classes or tables in a geodatabase.
For example, a geodatabase with a dozen feature classes that have Subtypes will perform better than
a geodatabase with a hundred feature classes. Subtypes also make editing data faster and more
accurate because default attribute values and Domains can be set up. For example, a local street
subtype could be created and defined so that whenever this type of street is added to the feature
class, its speed limit attribute is automatically set to 35 miles per hour”.

Subtypes are a subset of features in a feature class, or objects in a table, that share the same
attributes. They are used as a method to categorize your data. Subtypes are implemented by
creating coded values and, therefore, must be associated with fields of the data type short
or long integer. It is important to remember that the value of an attribute with Subtypes is stored
as a code, not as a description (though ArcGIS displays the description of the corresponding
code); therefore, a feature subtype must always be defined on an Integer-type field.

Figure 1: Feature classes without Subtypes and with Subtypes.

A major benefit of Subtypes is the possibility of assigning default values to certain Subtypes
of an existing feature class, which reduces the amount of editing. It is also possible to
generate Subtypes in an existing feature class by creating them in ArcCatalog and defining them in
ArcMap, as we will see later on. Another important advantage appears when editing the objects, as it
is possible to edit based on predefined codes and to edit Subtypes separately without affecting the
rest of the feature class.

By default, when adding a feature with defined Subtypes into ArcMap every subtype acquires a
symbology of its own (Figure 1).

Using Subtypes has its advantages:

- Adds intelligence to features.
- New features can be added directly into a subtype.
- Automatic attribution.
- Attribute validation.

Domains: According to the GIS dictionary from ESRI Support, “In a geodatabase, a mechanism
for enforcing data integrity. Attribute Domains define what values are allowed in a field in a
feature class or nonspatial attribute table. If the features or nonspatial objects have been
grouped into Subtypes, different attribute Domains can be assigned to each of the Subtypes”.

Therefore, Domains are established as valid values for a particular attribute. These valid values can
be defined in two different ways: coded Domains and range Domains. A range domain establishes a
value range that the attribute can adopt, while the coded domain establishes a series of fixed
values that the attribute can take.

Working with Domains allows us to validate attributes. This validation is carried out
while editing: values entered for the feature are checked against the values that were
predefined for each attribute.

Working with Domains brings the following advantages:

- Prevent and locate attribute errors.
- Maintain consistent coding schemes.
- Facilitate data entry during editing.
- Are copied during data conversion.
- Reduce database size.

Relationship Class: According to the GIS dictionary from ESRI Support, “An association or link between
two objects in a database. Relationships can exist between spatial objects (features), between
nonspatial objects (rows in a table), or between spatial and nonspatial objects. An item in the
geodatabase that stores information about a relationship. A Relationship Class is visible as an item
in the ArcCatalog tree or contents view.”

A Relationship Class represents a step forward compared to the traditional Joins and Relates.
Joins and Relates between tables are not persistent and have no effect on the data;
furthermore, they do not require a geodatabase. A Relationship Class, by contrast, creates a
persistent relation between tables, since it stores the information needed to link the participating
classes (Figure 2).

A classic example of a Relationship Class is the relation between the roads of an area and their
respective attributes, such as regions and road districts. These are related in a way that makes it
easy to recognize, for example, the road type of each road by region, and edits made in one of the
Relationship Class tables affect the other Relationship Class tables.

Figure 2: Relationship Class between two tables in ArcCatalog.

These three elements (Subtypes, Domains, and Relationship Classes), which provide behaviour to the
geodatabase, are called attribute validation rules and are available in ArcGIS. There are also spatial
validation rules, represented by geometric networks and by the topology rules that we will
see later on. All these tools are distinctive features of the geodatabase.

Editing with attribute validation rules brings large advantages compared to traditional editing.
Sets of features with common attributes (Subtypes) are managed more effectively. Valid values
for attributes are listed automatically (coded Domains). Attribute errors can be located (Domains
and Subtypes), default values can be assigned to the features, and spatial errors can be located.
In summary, editing with attribute validation rules enforces valid relations and connections and
makes editing more efficient.

The validation rule tools are the focus of this chapter; they are available in both ArcMap and
ArcCatalog (Figure 3):

Figure 3: Tools for attribute validation in ArcToolbox.

Creating Domains, Subtypes and Relationship Classes in ArcGIS
Creating Subtypes:
Subtypes can be created either from ArcMap or from ArcCatalog. It is fundamental to
remember that Subtypes are defined at feature class level and always on an Integer-type field.

Therefore, in order to define a subtype, the first step is to create an Integer-type field in the attribute
table of the feature class. To create Subtypes, we will work with the feature class “parcels”, which
shows some parcels in a fictitious municipality, and define Subtypes based on the land use of
each parcel.

Land use has been categorized as industry, commercial, residential and municipal. These land uses
are currently associated with the parcels through a text-type field. Consequently, it is necessary
to create a short Integer-type field beforehand to hold the codes of all four land uses.

Once the short integer field “use_code” is created, we associate it with the Subtypes in the
Subtypes tab of the Feature Class properties menu (Figure 4).

Figure 4: Subtypes tab in the Feature Class property menu (ArcCatalog).

The codes for all four land uses must be determined first in order for the Subtypes to store any
information. In this case we will have to edit the subtype field (use_code) based on its
equivalence with the field “Use” (Table 1).

Table 1: Use / use_code equivalence.

Use          use_code
commercial   0
industry     1
municipal    2
residential  3
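
The equivalence in Table 1 can be sketched in a few lines of plain Python. This is only an illustration of the mapping, not an ArcGIS call: the row layout and the `parcel_id` key are assumptions made for the example, while `Use` and `use_code` are the field names used in the text.

```python
# Equivalence from Table 1: text land use -> integer subtype code.
USE_TO_CODE = {"commercial": 0, "industry": 1, "municipal": 2, "residential": 3}


def assign_use_codes(rows):
    """Return copies of the rows with use_code filled in from the Use field."""
    coded = []
    for row in rows:
        key = row["Use"].strip().lower()  # tolerate case and padding differences
        coded.append({**row, "use_code": USE_TO_CODE[key]})
    return coded


# Hypothetical attribute rows; parcel_id is an assumed key, not from the source.
parcels = [
    {"parcel_id": 1, "Use": "Industry"},
    {"parcel_id": 2, "Use": "residential"},
]
coded_parcels = assign_use_codes(parcels)
```

A land use that is missing from the mapping raises a `KeyError`, which is a convenient way to catch misspelled values in the text field before the Subtypes are defined.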

Once the subtype field is edited, it is useful to assign the land use “municipal” as the default
subtype, so that when new land is incorporated into the municipal area it automatically takes this
default type.

To make this assignment we must go to the Feature Class properties menu, as seen in
Figure 4, or use the “Set Default Subtype” tool in ArcToolbox:

Data Management Tools> Subtypes> Set Default Subtype

We now have the subtype ready for editing. Note that when a feature class with defined
Subtypes is added to ArcMap, the symbology is displayed in categories (Figure 5), each subtype code
being a category with its own legend entry.
Creating Domains

Domains are defined at Geodatabase level, in contrast to Subtypes, which are defined at Feature Class
level. Domains can be applied either to a field of a feature class or to a particular subtype, which
makes these tools, working together, well suited to the proper management and standardization of data.

As mentioned in the description, Domains can be of either Coded Values or Range type. A Coded
Values domain is a list of valid values for an attribute or subtype; during editing, either in
the attribute table or in the attribute editor window, it is displayed as a drop-down list of valid
values for that attribute or subtype. A Range domain sets the minimum and maximum values for an
attribute and is useful for locating errors in attributes that are already defined, or for
automatically validating features during the editing session.

To create Domains, we will use the same Feature Class, “Parcels”, which already has defined Subtypes.

The following Domains will be added, and then we will associate them with the previously defined
Subtypes. To create the Domains, go to the Domains tab in the properties of the Geodatabase, since
Domains are defined at Geodatabase level and can be applied to any feature class
contained within it.

Figure 6: Workspace Domain Properties Window.

To define a domain, you must know the following settings in the properties menu
associated with the Geodatabase:

Field Type: “text”.
Domain Type: It can be either range or coded values; we will discuss this
issue later in the course.
Maximum and minimum: If the domain type were Range, we would have to
set maximum and minimum values. This is not the case here; keep the default value of 0.
Split Policy: This option determines how the attributes are handled when an object is
divided by any tool. The available options are:

o Default Value: When dividing the feature, both new output objects will have
the default value of the domain.
o Duplicate: The new objects will inherit the value of the attribute from the input
feature.
o Geometry ratio: The new attribute values are calculated in proportion to the geometry
of the objects produced by the split (e.g. area).

Merge Policy: This option determines how the attributes are handled when two geographical
objects are merged. The effect on the attributes is similar to the previous policy. The
available options are:

o Default value: When merging two features, the attribute of the output feature
will take the default value.
o Sum values: When merging two features, the attribute of the output feature
will take the combined value of the input features; this is only valid if the field
is of a numeric type.
o Geometry Weighted: The attribute of the output feature is the average of the input
attribute values, weighted by the geometry of the input features.
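
The arithmetic behind the split and merge policies can be illustrated with a small plain-Python sketch. It does not call any ArcGIS API; the function names and the policy keywords (`"default"`, `"duplicate"`, `"geometry_ratio"`, `"sum"`, `"geometry_weighted"`) are labels invented for this example, and the sizes stand for areas or lengths.

```python
def split_attribute(value, part_sizes, policy, default=None):
    """Return the attribute values of the pieces produced by splitting a feature."""
    if policy == "default":
        return [default for _ in part_sizes]
    if policy == "duplicate":
        return [value for _ in part_sizes]
    if policy == "geometry_ratio":
        total = sum(part_sizes)
        # Each piece receives a share proportional to its size (e.g. area).
        return [value * size / total for size in part_sizes]
    raise ValueError(f"unknown split policy: {policy}")


def merge_attribute(values, sizes, policy, default=None):
    """Return the attribute value of the feature produced by merging features."""
    if policy == "default":
        return default
    if policy == "sum":
        return sum(values)
    if policy == "geometry_weighted":
        total = sum(sizes)
        # Average of the input values, weighted by the size of each input.
        return sum(v * s for v, s in zip(values, sizes)) / total
    raise ValueError(f"unknown merge policy: {policy}")
```

For instance, splitting a parcel whose attribute is 100 into pieces of area 30 and 10 with the geometry ratio policy gives 75 and 25, while merging values 10 and 20 from features of area 1 and 3 with the geometry weighted policy gives 17.5.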

We will now present an example of how to define a domain. We will not apply the policy options
(“merge policy” and “split policy”), keeping both set to "default value".
Now, let's create a domain for each of the previously defined Subtypes with the
following structure (table 2); they are all text-type:

Subtype      Domain       Domain code  Description
Residential  Residential  0            single family
                          1            multi family
Industry     Industry     0            processing
                          1            storage
                          2            garage
Commercial   Commercial   0            services
                          1            shops
                          2            office
                          3            gas station
Municipal    Municipal    0            schools
                          1            vacant lots
                          2            athletic facilities

Go to the “Domains” tab in the Geodatabase properties menu to introduce the settings shown in
Figure 7:
Figure 7: Creating Domains in the Geodatabase.

The next step is to apply a domain to each subtype. A domain can also be applied to a single
attribute so that it only accepts certain values. Here we choose to apply the Domains to the
Subtypes in order to see the two tools working in combination. For this we need to go back to the
“Subtypes” tab in the properties menu of the Feature Class (Figure 8):

Figure 8: Applying Domains to Subtypes.

In this way the subtype "municipal", to which we assign the domain of the same name, will only
accept the values schools, vacant lots or athletic facilities; no other value can be entered
for that subtype. The same applies to the other Subtypes.
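
The combined effect of Subtypes and coded Domains can be modelled as a lookup from subtype code to the set of values it accepts. The sketch below is plain Python built from Table 1 and the domain table above; it is not the ArcGIS implementation, and the function names are assumptions made for illustration.

```python
# Subtype codes from Table 1 and the Domain assigned to each Subtype.
SUBTYPE_DOMAIN = {0: "Commercial", 1: "Industry", 2: "Municipal", 3: "Residential"}

# Coded-value Domains: text-type code -> description.
DOMAINS = {
    "Residential": {"0": "single family", "1": "multi family"},
    "Industry": {"0": "processing", "1": "storage", "2": "garage"},
    "Commercial": {"0": "services", "1": "shops", "2": "office", "3": "gas station"},
    "Municipal": {"0": "schools", "1": "vacant lots", "2": "athletic facilities"},
}


def allowed_values(use_code):
    """Return the coded values accepted by the Subtype with this code."""
    return DOMAINS[SUBTYPE_DOMAIN[use_code]]


def is_valid(use_code, domain_code):
    """True if the domain code is allowed for the given Subtype."""
    return domain_code in allowed_values(use_code)


def describe(use_code, domain_code):
    """Return the description shown to the editor instead of the stored code."""
    return allowed_values(use_code)[domain_code]
```

With these tables, code "2" is valid for the municipal subtype (athletic facilities) but not for the residential subtype, whose domain only defines codes "0" and "1".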
Creating a Relationship Class

A Relationship Class is a persistent link between records in two tables, origin and destination
tables. It is an element in the Geodatabase that stores information needed to link participating
tables. The tables could be single tables or tables of a Feature Class.

Relationship Classes provide a persistent connection between tables without the need for the tables
to be physically joined. They allow the editing of related records and provide greater integrity to
the data during the editing session: if an item is removed or added in the origin table, the change
also affects the destination table.

There are many interesting examples of this application such as:

- Tables of parcels and owners.
- Polygons of vegetation and dominant species.
- Supply network with an associated maintenance table.

Creating a Relationship Class in ArcCatalog is simple. It can be located either at the same level
as a Feature Dataset or within one.

To show how to create a Relationship Class we will go through an example with one Feature Class
named “Parcels”, which contains 5 polygons that represent parcels, and a table named “Owners”,
which holds the data regarding the owners of those parcels. We will create a Relationship Class
taking the attribute table of “Parcels” as the origin and “Owners” as the destination.

To establish the relationship, the prerequisite is that the same attribute is found in the origin table
(primary key) and in the destination table (foreign key). It is important to remember that
when you import records into another table or feature class, new ObjectID values are assigned and
any relationships based on the original ObjectID values are lost, so it is not advisable to use the
ObjectID attribute as the key for the Relationship Class.

Follow the path to create a new Relationship Class:

A menu will be displayed where you can set the origin table (“Parcels”) and the destination table
(“Owners”). The type of Relationship Class must be defined here (Figure 9):

With the option “Simple relationship”, if we delete a record from the origin table, the related
record is not deleted in the destination table; its key takes the value NULL instead. With the
option “Composite relationship”, the deletion is propagated, and the related records in the
destination table are deleted as well. Figure 10 summarizes this concept:

Figure 10: Effects of eliminating records when using different types of Relationship Class.

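
The difference between the two relationship types on deletion can be sketched in plain Python. This is a simplified model, not ArcGIS code: it assumes a one-to-many layout in which each hypothetical owner row carries a `parcel_id` foreign key pointing at its parcel, and all keys and names are invented for the example.

```python
def delete_origin(parcels, owners, parcel_id, composite):
    """Delete one origin row and propagate the change to the destination rows."""
    remaining_parcels = [p for p in parcels if p["parcel_id"] != parcel_id]
    if composite:
        # Composite: related destination rows are deleted with their origin.
        remaining_owners = [o for o in owners if o["parcel_id"] != parcel_id]
    else:
        # Simple: destination rows survive, but their foreign key becomes NULL.
        remaining_owners = [
            {**o, "parcel_id": None} if o["parcel_id"] == parcel_id else o
            for o in owners
        ]
    return remaining_parcels, remaining_owners


# Hypothetical sample data.
parcels = [{"parcel_id": 1}, {"parcel_id": 2}]
owners = [
    {"owner": "A", "parcel_id": 1},
    {"owner": "B", "parcel_id": 2},
]

simple_parcels, simple_owners = delete_origin(parcels, owners, 1, composite=False)
composite_parcels, composite_owners = delete_origin(parcels, owners, 1, composite=True)
```

After deleting parcel 1, the simple case still holds both owner rows (owner A with an empty key), whereas the composite case keeps only owner B.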
In the following window we introduce the labels for the relationship as it is traversed from the
origin to the destination table or feature class and vice versa, and the direction in which
messages are propagated, if at all, when editing a record that affects others.

The next window refers to the cardinality of the Relationship Class (Figure 11); this can be one
to one, one to many, or many to many, as shown in the following snapshot from the tool itself:

Figure 11: Cardinality in Relationship Class.


In this case, the correct cardinality for the relationship between “Parcels” and
“Owners” is “many to many”, given that several parcels can belong to the same owner and a
parcel can belong to several owners.
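
A many-to-many relationship cannot be held by a single foreign key column; in a geodatabase it is stored through an intermediate table of key pairs. The following plain-Python sketch shows that idea only; the keys `P1`, `O1`, etc. and the function names are hypothetical.

```python
# Intermediate relationship table: one row per (parcel key, owner key) pair.
relationship_rows = [
    ("P1", "O1"),
    ("P1", "O2"),  # parcel P1 has two owners
    ("P2", "O1"),  # owner O1 holds two parcels
]


def owners_of(parcel_key):
    """Traverse the relationship from origin (parcel) to destination (owners)."""
    return sorted(owner for parcel, owner in relationship_rows if parcel == parcel_key)


def parcels_of(owner_key):
    """Traverse the relationship from destination (owner) back to origin (parcels)."""
    return sorted(parcel for parcel, owner in relationship_rows if owner == owner_key)
```

Each extra attribute of the relationship itself, such as a share of ownership, would become an additional column of this intermediate table.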

In the next option the user decides whether or not to add any additional attributes to the table
that stores the Relationship Class.

Finally, the user must define the common fields in each of the tables (primary key field) and the
name that each field will take in the Relationship Class table. The last window shows a summary of
the definition of the new Relationship Class.

We have briefly seen the process of creating Domains, Subtypes and Relationship Classes in
ArcCatalog; we will now show the tools that are available in ArcMap.

To create Subtypes go to the ArcToolbox:

Data Management Tools> Subtypes

The following describes the functionality of each tool found in this Toolbox:

- Add Subtype: Adds a Subtype to a feature class. The input table to which the subtype will be
  added must be defined, as well as the code and description of the Subtype.
- Remove Subtype.
- Set Default Subtype.
- Set Subtype Field: The field must be integer-type (long or short integer).

To create Domains go to the ArcToolbox:

Data Management Tools> Domains

The following describes the functionality of each tool found in this Toolbox (Figure 3:
Tools for attribute validation):

- Add Coded Value To Domain: The Domain must be of the coded values type. The input
  Geodatabase, the name of the Domain, the code of the new value and its description
  need to be specified.
- Assign Domain To Field: The Domain must have been previously defined.
- Create Domain: Creates the Domain in the working Geodatabase without assigning its values.
  Input data for this tool: the working Geodatabase, the name of the Domain and its
  description, as well as the field type on which the Domain is defined. It also allows
  setting the “Domain Type” and the “merge policy” and “split policy” options.
- Delete Coded Value From Domain: Only valid for Domains of the “coded values” type.
- Delete Domain.
- Domain To Table: Yields as output a table from the domain of a field. Input data for this
  tool: the working Geodatabase, the name of the domain, the name of the field in the table
  that will store the output code values ("Code Field"), and the name of the field that will
  hold the description of the codes in the domain ("Description Field").
- Remove Domain From Field.
- Set Value For Range Domain: Assigns the maximum and minimum values for a range-type
  domain; only the Geodatabase, the domain, and both values need to be defined.
- Table To Domain: Converts a table into a domain. The table must have unique values for
  the operation to work properly. Input data for this tool: the input table, the fields that
  contain the code and its description, the working Geodatabase and the name of the domain.
  Optionally, the description of the Domain and the Update Option can be defined: if the
  Domain already exists, the new values can be added to those already defined (Append) or
  replace them with the ones in the table (Replace).
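
The Append and Replace update options described for Table To Domain can be illustrated with a plain-Python sketch. This is not the ArcGIS tool: the function signature is invented for the example, and the behaviour on a code that already exists during Append (the incoming row wins) is an assumption, not something stated in the source.

```python
def table_to_domain(existing, rows, code_field, description_field, update_option="APPEND"):
    """Build a coded-value domain (code -> description) from table rows."""
    codes = [row[code_field] for row in rows]
    if len(codes) != len(set(codes)):
        raise ValueError("the table must contain unique code values")
    new_values = {row[code_field]: row[description_field] for row in rows}
    if update_option == "REPLACE":
        # Discard the values already defined and keep only those in the table.
        return new_values
    if update_option == "APPEND":
        # Keep the values already defined and add those in the table.
        merged = dict(existing)
        merged.update(new_values)  # assumption: an incoming duplicate code wins
        return merged
    raise ValueError(f"unknown update option: {update_option}")


# Hypothetical existing domain and input table.
existing_domain = {"0": "services"}
table_rows = [{"code": "1", "desc": "shops"}, {"code": "2", "desc": "office"}]

appended = table_to_domain(existing_domain, table_rows, "code", "desc", "APPEND")
replaced = table_to_domain(existing_domain, table_rows, "code", "desc", "REPLACE")
```

The uniqueness check mirrors the requirement that the input table must hold unique values, which is why the practical demo dissolves the feature class on the land use field before converting the table.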

To work with Relationship Classes go to the ArcToolbox:

Data Management Tools> Relationship Class

Create Relationship Class: The settings are similar to those of ArcCatalog, such as Origin and
Destination tables, type of Relationship Class (simple or composite), cardinality, origin
primary key and origin foreign key.

Table to Relationship Class: Creates an attributed Relationship Class from the Origin,
Destination and Relationship tables. Settings to be defined to use this tool: Origin and
Destination tables, Output Relationship Class, Type of Relationship, forward path label,
backward path label, Message Direction, Cardinality, Relationship Table, Attribute
Fields in the Relationship Table, origin primary key and origin foreign key.
Practical application (Demo): Create a Domain in ArcMap from a map of land use:

In this example we will create a Domain from a Feature Class named land_use that has no Domain
defined. The working Geodatabase is Landuse.mdb.

The very first step is to use the Dissolve tool on the field “use” of the Feature Class table so
that each record represents one land use (Figure 12):

Figure 12: Attribute table of “land_use_dissolve”

Then we will perform the conversion from table to Domain using the output table after the
dissolve operation (Figure 14). Follow the path:

Data Management Tools> Domains> Table to Domain

The last step is to assign the defined Domain to the field “use”:

Data Management Tools> Domains> Assign Domain to Field


Figure 14: Tools to create a domain from a table and assign it to a field.

Editing with Domains, Subtypes and Relationship Classes.

In this section we will show the benefits of editing with attribute validation rules versus
traditional editing. Since these validation rules apply to attributes, the main focus of this
section is on attribute editing, while the chapter about Topology will focus on the editing of
features.

Editing Using Subtypes:


First, when we add a feature class with defined subtypes to ArcMap, the symbology is displayed by
categories, with each category representing a single subtype. In the same way, at the start of an
editing session on a feature class with defined subtypes, a separate target can be selected for each
subtype of the feature class (Figure 15).

Figure 15: Target in editing with Subtypes

Similarly, in the attribute editor, when we introduce a value in the field that holds the Subtype,
a drop-down list shows the available attribute values (Figure 16). The tool does not display the
code for each subtype; it shows its description.

Figure 16: Editor Dialog box in a feature with Subtypes defined

This "confinement" of the values that can be assigned enforces homogeneity, and therefore the
integrity of the attributes, preventing errors during the editing session, since the user cannot
enter any other value in the field, e.g. by keyboard input.

Another application, using the field calculator, is to edit a field that holds a Subtype based on
its code (Figure 17). Although the table displays the description, editing by code is very useful
because it speeds up editing when descriptions are long, and during multi-user editing sessions
editing an integer code ensures that the fields remain homogeneous when the information edited by
several users is gathered.

On the other hand, if we create a feature in a layer with defined subtypes, the subtype attribute
will take the value that has been defined as the default Subtype.

Editing using range Domains:

Editing with range-type Domains has the great advantage that features can be validated whether
they were edited previously or are being edited at the moment. If we import an existing feature
class into the Geodatabase and assign it a range-type Domain, then after adding it to ArcMap it is
possible to check whether all records fall within the defined range of values; this can be done
using the editor tool “Validate Features”. Follow the path:
Editor > Validate Features
To validate features it is necessary to start an editing session and select all the features to be
validated.

When the validation is complete and some records do not meet the defined range, the software
displays an error message and those features remain selected, while the ones that do have a proper
value are no longer selected.

Figure 18 shows an example of this error message when one feature exceeds the defined voltage
value for a municipal electric power line, which must be between 220 and 380:
Figure 18: Validating range Domains.
This can be very useful to automatically validate features that have been previously defined.

Note that a range Domain does not prevent values outside the defined range from being entered
during editing, and such values can even be stored. To find them, it is necessary to run the
validation tool so that the program flags those features.
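
The behaviour of range validation on a selection can be mimicked with a short plain-Python sketch. It is not the Validate Features tool itself: the function name, the `voltage` field and the sample rows are assumptions for illustration, using the 220 to 380 range from the power line example.

```python
def validate_features(selected, field, minimum, maximum):
    """Return the features that fail the range check and therefore stay selected."""
    return [f for f in selected if not (minimum <= f[field] <= maximum)]


# Hypothetical power line features; both range limits are treated as valid.
power_lines = [
    {"id": 1, "voltage": 220},
    {"id": 2, "voltage": 400},  # outside the range, but stored anyway
    {"id": 3, "voltage": 380},
]

still_selected = validate_features(power_lines, "voltage", 220, 380)
invalid_ids = [f["id"] for f in still_selected]
```

Only the feature with a value of 400 remains in the selection, which matches the description above: valid features are deselected and invalid ones stay selected for correction.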

Editing using Coded Domains:

The Attributes dialog box for editing with coded-type Domains is similar to the one
for editing with Subtypes (Figures 16 and 19): the attribute shows a drop-down list with the valid
values to choose from, so the Domain list provides built-in validation.

Editing with Relationship Class

As mentioned before, when editing composite Relationship Classes, deleting a record
in the origin table also deletes the related object in the destination layer.

This “consistent” relationship cannot be achieved with the “Joins and Relates” tools alone, which
only link the tables and have no effect on editing. Nevertheless, it is important to point out that
if the Relationship Class is defined in ArcCatalog and the participating objects are added to
ArcMap, the tables do not appear to be linked by default; it is necessary to create a “Join”
between them so that creating or deleting records in one of the tables is reflected in
the other one.

Relationship Classes help you access objects while you're editing. You can select an object,
then use the Attributes dialog box or table to find all related objects. Once you have navigated to
the related object, you can edit its attributes. Regardless of how deeply chained, all the related
classes are available for editing.
Because Relationship Classes are stored in the geodatabase, they can be managed with versions.
Versions allow multiple users to edit the features or records in a relationship at the same time.

The following table shows the differences between the three types of table link: Joins, Relates and
Relationship Classes (Table 3).

Clearly, the Geodatabase with Relationship Classes offers greater potential than
Joins and Relates.
Conclusions and Summary
This chapter serves as a first approximation to the Data Model, where we will learn how to
provide behaviour to the features and create templates for later use.

In this chapter we have covered the main procedures to validate attributes and learned how to use
the available tools to create and edit features in both ArcMap and ArcCatalog.

This kind of work requires a proper definition of a data structure in order to optimize the workflow
and the editing of data to avoid future conflict within the data.

Recommended activities

- Exercise/lab 1: Define a structure to validate attributes.

- Exercise/lab 2: Geodatabase editing using defined validation rules.

- Theoretical exercise: A true/false test that summarizes the key concepts of the chapter.
Bibliography:
Heather I. Stanton, Stephanie O'Meara, James R. Chappell, Anne R. Poole, Gregory Mack, and
Georgia Hybels, 2005. Ensuring Data Quality using Topology and Attribute Validation in the
Geodatabase. U.S. Geological Survey Open-File Report 2005-1428.

S.J. Viljoen, E. Pretorius, C.H. Wessels and O.J. Gericke (2006). Creating a catchment management
information system (cmis): moving from a database to a geodatabase. Water Institute of Southern
Africa Biennial Conference and Exhibition (WISA 2006).

Additional Readings:

David Arctur, Michael Zeiler. (2004). Designing Geodatabases: Case Studies in GIS.
Environmental Systems Research Institute. Redlands, Calif.

Dawn J. Wright, Michael J. Blongewicz, Patrick N. Halpin, Jane Lubchenco, Joe Breman.
(2007). ArcMarine. Environmental Systems Research Institute. Redlands, Calif.

ESRI web help ArcGIS Desktop 9 (2007) from:
http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=welcome

Michael Zeiler. (1999) Modeling Our World: The ESRI Guide to Geodatabase Design.
Environmental Systems Research Institute. Redlands, Calif.

Nancy Von Meyer. (2004) GIS and Land Records: The ArcGIS Parcel Data Model.
Environmental Systems Research Institute. Redlands, Calif.

Keywords:

Attribute validation
Domain
Subtype
Relations
