DB2 V8.1 Family Fundamentals Certification Prep:: Part 4 of 6, Working With DB2 UDB Data
DB2 V8.1 Family Fundamentals Certification Prep:: Part 4 of 6, Working With DB2 UDB Data
DB2 V8.1 Family Fundamentals Certification Prep:: Part 4 of 6, Working With DB2 UDB Data
27 Feb 2003
This tutorial introduces you to Structured Query Language (SQL), and helps to give
you a good understanding of how DB2 Universal Database uses SQL to define
database objects (such as tables, views, or indexes) and to manipulate data in a
relational database. This tutorial is the fourth in a series of six tutorials that you can
use to help prepare for the DB2 UDB V8.1 Family Fundamentals Certification (Exam
700).
Section 1. Introduction
This tutorial is the fourth in a series of six tutorials that you can use to help prepare
for the DB2 V 8.1 Family Fundamentals Certification (Exam 700). The material in
this tutorial primarily covers the objectives in Section 4 of the exam, which is entitled
"Working with DB2 UDB Data." You can view these objectives at:
http://www.ibm.com/certify/tests/obj700.shtml.
DB2 installation is not covered in this tutorial. If you haven't already done so, you
can download and install a copy of IBM DB2 Universal Database, Enterprise Server
Edition. Installing DB2 will help you understand many of the concepts that are tested
on the DB2 UDB V8.1 Family Fundamentals Certification exam. The installation
process is documented in the Quick Beginnings books, which can be found at the
DB2 Technical Support Web site under the Technical Information heading.
All of the examples in this tutorial are based on the SAMPLE database, which comes
with DB2 Universal Database. As noted previously, you can download a trial version
of IBM DB2 Universal Database; however, the examples can be understood without
access to the product, as sample output is provided in most cases.
Most SQL statements contain one or more of the following language elements:
• Data types: The data type of a value determines how DB2 interprets that
value. DB2 supports a large number of built-in data types, and also
supports user-defined data types (UDTs). For information about DB2's
built-in data types, see Data types.
Data types
The built-in data types can be classified as numeric, character string, graphic string,
binary string, or datetime. There is also a special data type named DATALINK. A
DATALINK value contains a logical reference to a file stored outside of the
database.
• DATE: A DATE is a three-part value (year, month, and day). The range of
the year part is 0001 to 9999. The range of the month part is 1 to 12. The
range of the day part is 1 to n, where the value of n depends on the
month. The length of a DATE column is 10 bytes.
day part is 1 to n, where the value of n depends on the month. The range
of the hour part is 0 to 24. The range of both the minute part and the
second part is 0 to 59. The range of the microsecond part is 000000 to
999999. If the hour is 24, the minute value, the second value, and the
microsecond value are all zero. The length of a TIMESTAMP column is
26 bytes.
Special registers
A special register is a storage area that is defined for an application process by the
database manager and is used to store information that can be referenced in SQL
statements. DB2 Universal Database currently supports the following special
registers:
To illustrate how useful special registers can be, consider the SALES table, which
has a column named Sales_Date. The following statement inserts a new row into the
SALES table, setting the sales date to the value of the CURRENT DATE special
register:
For more information about the INSERT statement, see Using the INSERT
statement to add new rows to tables or views.
One set of system catalog views belongs to the SYSCAT schema; these views are
not directly updatable. A second set of views, representing a subset of views
belonging to the SYSCAT schema, contain statistical information that is used by the
optimizer. The optimizer is a component of the SQL compiler that chooses an
access plan for a DML statement by modeling the execution cost of alternative
access plans and choosing the one with the lowest estimated cost. Views belonging
to the SYSSTAT schema contain some updatable columns.
The system catalog views can be queried to obtain useful information about a
database. For example, the following statement uses the NOT LIKE predicate to
return the name of each user-defined table with an entry in SYSCAT.TABLES, along
with the number of columns in each of those tables, and table status (N = normal; C
= check pending):
CL_SCHED 4 N
DEPARTMENT 5 N
EMP_ACT 6 C
For more information about the SELECT statement, see Using the SELECT
statement to retrieve data from database tables.
Authorization
DCL is a subset of SQL. It is used to provide access control to database objects.
There are two levels of security to control access to database objects. The first level,
which controls access to the DB2 instance, is managed by the operating system.
This level is called authentication, and involves verifying a user's identity through a
valid user ID and password. The second level of security controls access to a
database on the server.
To access a database on the server, you must establish a connection between your
DB2 client and the database using the SQL CONNECT statement. The syntax of this
statement allows you to specify a user ID and password, which DB2 will use to
authenticate you. You can also request a password change by supplying your user
ID, old password, and new password twice. For example:
CONNECT TO sample USER shaman USING mypassword NEW newpassword CONFIRM newpassword
If you don't specify a user ID and password with the CONNECT statement, DB2 may
use the ID and password you logged on with at the client for authentication, or it may
prompt you to supply a user ID and password.
The following message tells you that you have made a successful connection:
• SYSADM: This authority level gives a user the ability to run utilities, issue
database and database manager commands, and control database
objects throughout the database manager instance. In addition to all of
the capabilities granted to the other authorities, users with SYSADM
authority can migrate a database, modify the database manager
configuration file, and grant DBADM authority.
Privileges are specific rights that can be granted to users, allowing them to work with
If you create an object, you have full access to that object. This is known as having
CONTROL privilege on the object. A user with CONTROL privilege on an object can
let other users have access to the object, and can give other users permission to
grant privileges on the object. Privileges can be granted or revoked using the SQL
GRANT or REVOKE statement.
To grant privileges on database objects, you must have SYSADM authority, DBADM
authority, CONTROL privilege, or have the WITH GRANT OPTION (a selectable
option on the GRANT statement) on that object. You must have SYSADM or DBADM
authority to grant CONTROL privilege to another user. You must have SYSADM
authority to grant DBADM authority.
For more extensive treatment of DB2 security, see the second tutorial in this series.
Schemas
A schema is a collection of named objects, such as tables, views, triggers, and
functions. Schemas provide a logical classification of objects in the database. A
schema name is used as the first part of a two-part object name. Consider for
example the name SMITH.STAFF. In this example, the fully qualified name of the
STAFF table includes the schema name, SMITH, to distinguish it from any other
table named STAFF in the system catalog.
The schema itself is a database object. A schema can be explicitly created using the
CREATE SCHEMA statement; it can also be implicitly created when another object is
created, if the user creating the object has IMPLICIT_SCHEMA privilege.
Schemas have privileges associated with them. This allows the schema owner to
control which users have the privilege to create, alter, and drop objects in the
schema. A schema owner is initially given all of these privileges on the schema, with
the ability to grant them to others. An implicitly created schema is owned by the
system, and all users are initially given the privilege to create objects in that schema.
A user with SYSADM or DBADM authority can change the privileges held by users
on any schema, even one that was implicitly created.
Specifying the keyword PUBLIC grants the privilege to all users. Some operating
systems allow users and groups to have the same name. In such cases, you can
specify the optional keyword USER or GROUP to distinguish them. For example:
GRANT INSERT, DELETE ON TABLE staff TO USER rosita WITH GRANT OPTION
In this example, the WITH GRANT OPTION gives Rosita the ability to grant the
INSERT or DELETE privilege to other users. Of course, if another user already has
SYSADM authority, DBADM authority on the database that contains the STAFF
table, or CONTROL privilege on the STAFF table, that user already has the authority
to grant the INSERT or the DELETE privilege on the STAFF table.
Specifying the keyword PUBLIC revokes the privilege from all users. Some
operating systems allow users and groups to have the same name. In such cases,
you can specify the optional keyword USER or GROUP to distinguish them.
In the following example, the optional keyword ALL is used to revoke all privileges
held by Joanna on the STAFF table:
Even if you've revoked privileges on an object from a group, you can't be sure you've
revoked privileges from each member of that group. If any members of the group
had previously been granted privileges as individuals or as members of another
group, they retain these privileges unless you issue additional REVOKE statements.
• Buffer pools
• Event monitors
• Functions
• Indexes
• Schemas
• Stored procedures
• Tables
• tablespaces
• Triggers
• Views
The system catalog is updated whenever you create a database object.
Consider the CREATE TABLE statement. This statement has a large number of
options that let you precisely define the kind of table you want to create. In its
simplest form, the CREATE TABLE statement requires only that you specify one or
more columns and associated data types. Consider this example:
This statement specifies the creation of a five-column table called ORG. The ORG
table is actually part of the SAMPLE database that comes with DB2. Each column
represents an attribute of the organization. Data pertaining to these attributes can be
collected and stored in the table. The first column, DeptNumb, cannot have NULL
values, because this column will represent a unique key for the table: if each value in
this column is unique, a specific value can be used to uniquely identify a row in the
table.
option).
Temporary tables must be explicitly (or will be implicitly) qualified by the schema
name SESSION, because each session that defines a declared table has its own
(possibly unique) description of that temporary table.
• Buffer pools
• Tables
• tablespaces
• Views
You cannot alter an index. If you want to change an index, you must drop it and then
create a new one with a different definition.
• Buffer pools
• Event monitors
• Functions
• Indexes
• Schemas
• Stored procedures
• Tables
• tablespaces
• Triggers
• Views
The DROP statement removes object definitions from the system catalog and
therefore from the database itself. Here is an example of the DROP TABLE
statement:
COMM
SALARY
YEARS
JOB
DEPT
NAME
ID
18357.50
7
Mgr
20
Sanders
-10
18171.25
8
Sales
Pernal
20
612.45
17506.75
5
Mgr
38
Marenghi
-30
To restrict the number of rows in a result set, use the FETCH FIRST clause. For
example:
You can retrieve specific columns from a table by specifying a select list of column
names separated by commas. For example:
Use the DISTINCT clause to eliminate duplicate rows in a result set. For example:
Without the AS clause, the derived column would have been named 2, indicating that
it is the second column in the result set.
• Find the names of staff members whose salaries are greater than
$20,000:
• List the name, job title, and salary of staff members who are not
managers and whose salary is greater than $20,000:
In this example, the percent sign (%) is a wild card character that
represents a string of zero or more characters.
A subquery is a SELECT statement that appears within the WHERE clause of a main
query, and feeds its result set to that WHERE clause. For example:
A correlation name is defined in the FROM clause of a query, and can serve as a
convenient short name for a table. Correlation names also eliminate ambiguous
references to identical column names from different tables. For example:
You can sort the result set in descending order by specifying DESC in the ORDER BY
clause:
The simplest join is one in which there are no specified conditions. For example:
This statement returns all combinations of rows from the ORG table and the STAFF
table. The first three columns come from the ORG tables, and the last four columns
come from the STAFF table. Such a result set (the cross product of the two tables) is
not very useful. What is needed is a join condition to refine the result set. For
example, here is a query that is designed to identify staff members who are
managers:
...
FROM org INNER JOIN staff
ON manager = id
...
The keyword ON specifies the join conditions for the tables being joined. Remember,
DeptNumb and DeptName are columns in the ORG table, and Manager_ID and
Manager are based on columns (ID and Name) in the STAFF table. The result set
for the inner join consists of rows that have matching values for the Manager and ID
columns in the left table (ORG) and the right table (STAFF), respectively. (When you
perform a join on two tables, you arbitrarily designate one table to be the left table
and the other to be the right.)
Outer joins return rows that are generated by an inner join operation, plus rows that
would not be returned by the inner join operation. There are three types of outer
joins:
• A left outer join includes the inner join plus the rows from the left table that
are not returned by the inner join. This type of join uses the LEFT OUTER
JOIN (or LEFT JOIN ) operator in the FROM clause.
• A right outer join includes the inner join plus the rows from the right table
that are not returned by the inner join. This type of join uses the RIGHT
OUTER JOIN (or RIGHT JOIN) operator in the FROM clause.
• A full outer join includes the inner join plus the rows from both the left
table and the right table that are not returned by the inner join. This type
of join uses the FULL OUTER JOIN (or FULL JOIN) operator in the
FROM clause.
More complex queries can be constructed to answer more difficult questions. The
following query is designed to generate a list of employees who are responsible for
projects, identifying those employees who are also managers by listing the
departments that they manage:
The first outer join gets the name of any project for which the employee is
responsible; this outer join is enclosed by parentheses and is resolved first. The
second outer join gets the name of the employee's department if that employee is a
manager.
This statement returns a list of sales dates from the SALES table. The SALES table
in the SAMPLE database contains sales data, including the number of successful
transactions by a particular sales person on a particular date. There is typically more
than one record per date. The GROUP BY clause groups the data by sales date, and
the MAX function (see Using functions to transform data ) in this example returns the
maximum number of sales recorded for each sales date.
Here, the YEAR function is used to return the year portion of date values, and the
SUM function is used to return the total in each set of grouped sales figures. The
grouping sets list specifies how the data is to be grouped, or aggregated. A pair of
empty parentheses is added to the grouping sets list to get a grand total in the result
set. The statement returns the following:
A statement that is almost identical to the previous one, but that specifies the
ROLLUP clause, or the CUBE clause instead of the GROUPING SETS clause, returns
a result set that provides a more detailed perspective on the data, it may, for
instance, provide summaries by location or time.
The HAVING clause is often used with a GROUP BY clause to retrieve results for
groups that satisfy only a specific condition. A HAVING clause can contain one or
more predicates that compare some property of the group with another property of
the group or a constant. For example:
This statement returns a list of salespeople whose sales totals exceed 25.
column.
• Use a VALUES clause to specify column data for one or more rows. For
example:
Or the equivalent:
INSERT INTO staff (id, name, dept, job, years, salary, comm)
VALUES
(1212,'Cerny',20,'Sales',3,90000.00,30000.00),
(1213,'Wolfrum',20,'Sales',2,90000.00,10000.00)
UPDATE staff
SET dept = 51, salary = 70000
WHERE id = 750
Or the equivalent:
UPDATE staff
SET (dept, salary) = (51, 70000)
WHERE id = 750
If you don't specify a WHERE clause, DB2 updates each row in the table or view!
If you don't specify a WHERE clause, DB2 deletes all the rows in the table or view!
This db2 command specifies the -td option flag, which tells the command-line
processor to define and to use @ as the statement termination character; the -v
option flag, which tells the command-line processor to echo command text to
standard output; and the --f option flag, which tells the command-line processor to
read command input from the specified file instead of from standard input.
and returns an output parameter called sql_state. The procedure body consists of a
single SELECT statement that returns the name and the total sales figures for each
salesperson whose total sales exceed the specified quota.
Most SQL procedures will accept at least one input parameter. In our example, the
input parameter contains a value (quota) that is used in the SELECT statement
contained in the procedure body.
Many SQL procedures will return at least one output parameter. Our example
includes an output parameter (sql_state) that is used to report the success or failure
of the SQL procedure. DB2 returns an SQLSTATE value in response to conditions
that could be the result of an SQL statement. Because the returned SQLCODE or
SQLSTATE value pertains to the last SQL statement issued in the procedure body,
and accessing the values alters the subsequent values of these variables (because
an SQL statement is used to access them), the SQLCODE or SQLSTATE value should
be assigned to and returned through a locally defined variable (such as the sql_state
variable in our example).
The parameter list for an SQL procedure can specify zero or more parameters, each
of which can be one of three possible types:
1. Declaring the number of result sets that the SQL procedure will return in
the DYNAMIC RESULT SETS clause.
2. Declaring a cursor in the procedure body (using the WITH RETURN FOR
clause) for each result set that will be returned. A cursor is a named
control structure that is used by an application program to point to a
specific row within an ordered set of rows. A cursor is used to retrieve
rows from a set.
3. Opening the cursor for each result set that will be returned.
Variables must be declared at the beginning of the SQL procedure body. To declare
a variable, you must assign a unique identifier to and specify an SQL data type for
the variable and, optionally, assign an initial value to the variable.
• The FOR structure executes a block of code for each row of a table.
• The LOOP clause executes a block of code multiple times until a LEAVE,
ITERATE, or GOTO statement transfers control outside of the loop.
• The RETURN clause returns control from the SQL procedure to the caller.
To successfully create SQL procedures, you must have installed the DB2
Application Development Client on the database server. (See the first tutorial in this
series for more on the Application Development Client.) DB2 converts SQL
procedure statements into an equivalent C application by using embedded SQL; this
means that before you can create any SQL procedures, you must also install and
configure a supported C compiler on the database server. If your database server
runs on a Windows operating system, but the environment variables for the C
compiler are not system variables, you must update the
DB2_SQLROUTINE_COMPILER_PATH DB2 registry variable on the database server:
• For Microsoft Visual Studio, update the DB2 registry variable with the path
for the vcvars32.bat file.
• For IBM VisualAge for C++, update the DB2 registry variable with the path
for the setenv.bat file.
If your database server runs on a UNIX-based system, DB2 generates an
executable script file ( $HOME/sqllib/function/routine/sr_cpath ) the first
time you compile a stored procedure. This script contains the default values for the
compiler environment variables on your operating system.
Do not include double quotation marks if you are using the command line
processor (CLP) in interactive input mode, characterized by the db2 =>
input prompt.
In this example, a value of 25 for the input parameter quota is passed to the SQL
procedure, as well as a question mark (?) place-holder for the output parameter
sql_state. The procedure returns the name and the total sales figures for each
salesperson whose total sales exceed the specified quota (25). The following is
sample output returned by this statement:
SQL_STATE: 00000
SALES_PERSON TOTAL_SALES
GOUNOT 50
LEE 91
"SALES_STATUS" RETURN_STATUS: "0"
Client applications written in any supported language can call SQL procedures.
Section 7. Summary
Summary
This tutorial was designed to introduce you to Structured Query Language (SQL),
and to some of the ways that DB2 Universal Database uses SQL to define database
objects and to manipulate data in relational databases. This tutorial has covered the
fundamentals of SQL, including SQL language elements, Data Control Language
(DCL), Data Definition Language (DDL), Data Manipulation Language (DML), and
SQL procedures.
Resources
• Good places to find additional information about DB2 Universal Database and
SQL are:
• IBM DB2 Universal Database SQL Reference, Vol. 1, Version 8 ,
SC09-4844-00; IBM DB2 Universal Database SQL Reference, Vol. 2,
Version 8 , SC09-4845-00. International Business Machines Corporation,
2002.
• DB2: The Complete Reference . Melnyk, Roman B., and Paul Z.
Zikopoulos, Osborne/McGraw-Hill, 2001.
• Check out the other parts of the DB2 V8.1 Family Fundamentals Certification
Prep series:
• DB2 V8.1 Family Fundamentals Certification Prep, Part 1 of 6: DB2
Planning
• DB2 V8.1 Family Fundamentals Certification Prep, Part 2 of 6: DB2
Security
• DB2 V8.1 Family Fundamentals Certification Prep, Part 3 of 6: Accessing
DB2 UDB Data
• DB2 V8.1 Family Fundamentals Certification Prep, Part 5 of 6: Working
with DB2 UDB Objects
• DB2 V8.1 Family Fundamentals Certification Prep, Part 6 of 6: Data
Concurrency