IBM Informix Performance Guide
Informix Product Family
Version 11.70
SC27-3544-06
Note
Before using this information and the product it supports, read the information in “Notices” on page C-1.
Reclaiming unused space within an extent . . . . . . . . . . . . . . . . . . . . . . . 6-26
Managing extent deallocation with the TRUNCATE keyword . . . . . . . . . . . . . . . . . 6-27
Defragment partitions to merge extents . . . . . . . . . . . . . . . . . . . . . . . . 6-28
Storing multiple table fragments in a single dbspace . . . . . . . . . . . . . . . . . . . . . 6-29
Displaying a list of table and index partitions . . . . . . . . . . . . . . . . . . . . . . . 6-29
Changing tables to improve performance . . . . . . . . . . . . . . . . . . . . . . . . . 6-29
Loading and unloading tables . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-29
Dropping indexes for table-update efficiency . . . . . . . . . . . . . . . . . . . . . . 6-32
Creating and enabling referential constraints efficiently . . . . . . . . . . . . . . . . . . . 6-32
Attaching or detaching fragments . . . . . . . . . . . . . . . . . . . . . . . . . . 6-34
Altering a table definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-35
Denormalize the data model to improve performance . . . . . . . . . . . . . . . . . . . . 6-42
Shortening rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-42
Expelling long strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-43
Splitting wide tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-44
Redundant data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-45
Reduce disk space in tables with variable length rows . . . . . . . . . . . . . . . . . . . . 6-46
Reduce disk space by compressing tables and fragments. . . . . . . . . . . . . . . . . . . . 6-46
Indexes for evaluating a filter . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-21
Effect of PDQ on the query plan . . . . . . . . . . . . . . . . . . . . . . . . . . 10-22
Effect of OPTCOMPIND on the query plan . . . . . . . . . . . . . . . . . . . . . . . 10-22
Effect of available memory on the query plan . . . . . . . . . . . . . . . . . . . . . . 10-23
Time costs of a query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-23
Memory-activity costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-24
Sort-time costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-24
Row-reading costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-25
Sequential access costs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-26
Nonsequential access costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-26
Index lookup costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-26
In-place ALTER TABLE costs . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-27
View costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-27
Small-table costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-28
Data-mismatch costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-28
Encrypted-value costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-29
GLS functionality costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-29
Network-access costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-29
Optimization when SQL is within an SPL routine. . . . . . . . . . . . . . . . . . . . . . 10-31
SQL optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-31
Execution of an SPL routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-33
SPL routine executable format stored in UDR cache . . . . . . . . . . . . . . . . . . . . 10-33
Trigger execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-34
Performance implications for triggers . . . . . . . . . . . . . . . . . . . . . . . . . 10-35
Optimization goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-35
Optimize queries for user-defined data types . . . . . . . . . . . . . . . . . . . . . . . 13-38
Parallel UDRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-38
Selectivity and cost functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-39
User-defined statistics for UDTs . . . . . . . . . . . . . . . . . . . . . . . . . . 13-40
Negator functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-40
Optimize queries with the SQL statement cache . . . . . . . . . . . . . . . . . . . . . . 13-40
When to use the SQL statement cache . . . . . . . . . . . . . . . . . . . . . . . . 13-41
Using the SQL statement cache . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-41
Monitoring memory usage for each session . . . . . . . . . . . . . . . . . . . . . . . 13-43
Monitoring usage of the SQL statement cache . . . . . . . . . . . . . . . . . . . . . . 13-46
Monitor sessions and threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-47
Monitor sessions and threads with onstat commands . . . . . . . . . . . . . . . . . . . 13-48
Monitor sessions and threads with ON-Monitor (UNIX) . . . . . . . . . . . . . . . . . . 13-53
Monitor sessions and threads with SMI tables . . . . . . . . . . . . . . . . . . . . . . 13-54
Monitor transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-55
Display information about transactions . . . . . . . . . . . . . . . . . . . . . . . . 13-55
Display information about transaction locks . . . . . . . . . . . . . . . . . . . . . . 13-56
Display statistics on user sessions . . . . . . . . . . . . . . . . . . . . . . . . . . 13-57
Display statistics on sessions executing SQL statements. . . . . . . . . . . . . . . . . . . 13-58
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
Privacy policy considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
Introduction
This introduction provides an overview of the information in this publication and
describes the conventions it uses.
Information in this publication can help you perform the following tasks:
v Monitor system resources that are critical to performance
v Identify database activities that affect these critical resources
v Identify and monitor queries that are critical to performance
v Use the database server utilities (especially onperf, ISA, and onstat) for
performance monitoring and tuning
v Eliminate performance bottlenecks by:
– Balancing the load on system resources
– Adjusting the configuration parameters or environment variables of your
database server
– Adjusting the arrangement of your data
– Allocating resources for decision-support queries
– Creating indexes to speed up retrieval of your data
Types of users
This publication is written for the following users:
v Database administrators
v Database server administrators
v Database-application programmers
v Performance engineers
Software dependencies
This publication assumes that you are using IBM Informix, Version 11.70.
The IBM Informix OLE DB Provider follows the ISO string formats for date, time,
and money, as defined by the Microsoft OLE DB standards. You can override that
default by setting an Informix environment variable or registry entry, such as
DBDATE.
The examples in this publication are written with the assumption that you are
using one of these locales: en_us.8859-1 (ISO 8859-1) on UNIX platforms or
en_us.1252 (Microsoft 1252) in Windows environments. These locales support U.S.
English format conventions for displaying and entering date, time, number, and
currency values. They also support the ISO 8859-1 code set (on UNIX and Linux)
and the Microsoft 1252 code set (on Windows).
You can specify another locale if you plan to use characters from other locales in
your data or your SQL identifiers, or if you want to conform to other collation
rules for character data.
For instructions about how to specify locales, additional syntax, and other
considerations related to GLS locales, see the IBM Informix GLS User's Guide.
Demonstration databases
The DB-Access utility, which is provided with your IBM Informix database server
products, includes one or more of the following demonstration databases:
v The stores_demo database illustrates a relational schema with information about
a fictitious wholesale sporting-goods distributor. Many examples in IBM
Informix publications are based on the stores_demo database.
v The superstores_demo database illustrates an object-relational schema. The
superstores_demo database contains examples of extended data types, type and
table inheritance, and user-defined routines.
For information about how to create and populate the demonstration databases,
see the IBM Informix DB-Access User's Guide. For descriptions of the databases and
their contents, see the IBM Informix Guide to SQL: Reference.
The scripts that you use to install the demonstration databases are in the
$INFORMIXDIR/bin directory on UNIX platforms and in the %INFORMIXDIR%\bin
directory in Windows environments.
The following changes and enhancements are relevant to this publication. For a
complete list of what's new in this release, see the release notes or the information
center at http://pic.dhe.ibm.com/infocenter/idshelp/v117/topic/com.ibm.po.doc/
new_features.htm.
Table 1. What's new in the IBM Informix Performance Guide for version 11.70.xC8

Overview: Dynamic private memory caches for CPU virtual processors
Reference: "CPU virtual processor private memory caches" on page 3-24
Table 2. What's new in the IBM Informix Performance Guide for version 11.70.xC4

Overview: Data sampling for update statistics operations. If you have a large
index with more than 100,000 leaf pages, you can generate index statistics based
on sampling when you run UPDATE STATISTICS statements in LOW mode. Gathering
index statistics from sampled data can increase the speed of the update
statistics operations. To enable sampling, set the USTLOW_SAMPLE configuration
parameter or the USTLOW_SAMPLE option of the SET ENVIRONMENT statement.
Reference: "Data sampling during update statistics operations" on page 13-18
Table 3. What's new in the IBM Informix Performance Guide for version 11.70.xC3

Overview: Automatic read-ahead operations. You can enable the database server to
use read-ahead operations automatically to improve performance. Most queries can
benefit from processing the query while asynchronously retrieving the data
required by the query. The database server can automatically use asynchronous
operations for data, or it can avoid them if the data for the query is already
cached. Use the AUTO_READAHEAD configuration parameter to configure automatic
read-ahead operations for all queries, and use the SET ENVIRONMENT
AUTO_READAHEAD statement to configure automatic read-ahead operations for a
particular session.
Reference: "Configuration parameters that affect table I/O" on page 5-28;
AUTO_READAHEAD configuration parameter (Administrator's Reference)
Table 5. What's new in the IBM Informix Performance Guide for version 11.70.xC1

Overview: Less root node contention with forest of trees indexes. If you have
many concurrent users who routinely experience delays due to root node
contention, you might improve query performance if you convert your B-tree index
to a forest of trees index. A forest of trees index is similar to a B-tree
index, but has multiple root nodes and potentially fewer levels. You create
forest of trees indexes with the new HASH ON clause of the CREATE INDEX
statement of SQL.
Reference: "Forest of trees indexes" on page 7-2; "Improve query performance
with a forest of trees index" on page 7-13; CREATE INDEX statement (SQL Syntax);
HASH ON clause (SQL Syntax)

Overview: Automatic light scans on tables
Reference: "Light scans" on page 5-27
Overview: Automatically add CPU virtual processors. When the database server
starts, it checks that the number of CPU virtual processors is at least half the
number of CPU processors on the database server computer. This ratio of CPU
processors to CPU virtual processors is a recommended minimum to ensure that the
database server performs optimally in most situations. If necessary, the
database server automatically increases the number of CPU virtual processors to
half the number of CPU processors.
Reference: "Automatic addition of CPU virtual processors" on page 3-20
Overview: Automatically optimize data storage
Reference: "Defragment partitions to merge extents" on page 6-28
Overview: Large pages for non-message shared memory segments that reside in
physical memory are now enabled by default on Linux platforms. Previously, large
pages were supported only on AIX® and Solaris systems. The use of large pages
can provide performance benefits in large memory configurations. To enable or
disable support for large pages, use the IFX_LARGE_PAGES environment variable.
Reference: "Virtual portion of shared memory" on page 4-2
Overview: Improving performance by reducing buffer reads. If you enable the new
BATCHEDREAD_INDEX configuration parameter, the optimizer automatically chooses
to fetch a set of keys from an index buffer, reducing the number of times a
buffer is read.
Reference: "Improve performance by adding or removing indexes" on page 13-20;
BATCHEDREAD_INDEX configuration parameter (Administrator's Reference)
Overview: Alerts for tables with in-place alter operations
Reference: "In-place alter" on page 6-35
If only SQL statements are listed in the example, they are not delimited by
semicolons. For instance, you might see the code in the following example:
CONNECT TO stores_demo
...
...
COMMIT WORK
DISCONNECT CURRENT
To use this SQL code for a specific product, you must apply the syntax rules for
that product. For example, if you are using an SQL API, you must use EXEC SQL
at the start of each statement and a semicolon (or other appropriate delimiter) at
the end of the statement. If you are using DB-Access, you must delimit multiple
statements with semicolons.
Tip: Ellipsis points in a code example indicate that more code would be added in
a full application, but it is not necessary to show it to describe the concept that is
being discussed.
Additional documentation
Documentation about this release of IBM Informix products is available in various
formats.
IBM Informix SQL-based products are fully compliant with SQL-92 Entry Level
(published as ANSI X3.135-1992), which is identical to ISO 9075:1992. In addition,
many features of IBM Informix database servers comply with the SQL-92
Intermediate and Full Level and X/Open SQL Common Applications Environment
(CAE) standards.
Feedback from all methods is monitored by the team that maintains the user
documentation. The feedback methods are reserved for reporting errors and
omissions in the documentation.
Chapter 1. Performance basics
Performance measurement and tuning issues and methods are relevant to daily
database server administration and query execution.
These topics:
v Describe a basic approach for performance measurement and tuning
v Provide guidelines for a quick start to obtain acceptable initial performance on a
small database
v Describe roles in maintaining good performance
By recognizing problems early, you can prevent them from affecting users
significantly. Early indications of a performance problem are often vague; users
might report that the system seems sluggish. Users might complain that they
cannot get all their work done, that transactions take too long to complete, that
queries take too long to process, or that the application slows down at certain
times during the day.
To determine the nature of the problem, you must measure the actual use of
system resources and evaluate the results.
To optimize performance:
1. Establish performance objectives.
2. Take regular measurements of resource utilization and database activity.
3. Identify symptoms of performance problems: disproportionate utilization of
CPU, memory, or disks.
Performance goals
When you plan for measuring and tuning performance, you should consider
performance goals and determine which goals are the most important.
Many considerations go into establishing performance goals for the database server
and the applications that it supports. Be clear and consistent about articulating
performance goals and priorities, so that you can provide realistic and
consistent expectations about performance. The answers to questions about your
workload, priorities, and user expectations can help you set realistic
performance goals for your resources and your mix of applications.
Measurements of performance
You can use throughput, response time, cost per transaction, and resource
utilization measures to evaluate performance.
Throughput, response time, and cost per transaction are described in the topics
that follow.
Resource utilization can have one of two meanings, depending on the context. The
term can refer to the amount of a resource that a particular operation requires or
uses, or it can refer to the current load on a particular system component. The term
is used in the former sense to compare approaches for accomplishing a given task.
For instance, if a given sort operation requires 10 megabytes of disk space, its
resource utilization is greater than another sort operation that requires only 5
megabytes of disk space. The term is used in the latter sense to refer, for instance,
to the number of CPU cycles that are devoted to a particular query during a
specific time interval.
For a discussion about the performance impact of different load levels on various
system components, see “Resource utilization and performance” on page 1-7.
Throughput
Throughput measures the overall performance of the system. For transaction
processing systems, throughput is typically measured in transactions per second
(TPS) or transactions per minute (TPM).
If you need more immediate feedback, you can use onstat -p to gather an estimate.
You can use the SET LOG statement to set the logging mode to unbuffered for the
databases that contain tables of interest. You can also use the trusted auditing
facility in the database server to record successful COMMIT WORK events or other
events of interest in an audit log file. Using the auditing facility can increase the
overhead involved in processing any audited event, which can reduce overall
throughput.
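For example, the following statements (a sketch that assumes the stores_demo
demonstration database, created with logging) switch the current database to
unbuffered logging:
DATABASE stores_demo
SET LOG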
Related information:
Auditing data security (Security Guide)
Because every database application has its own particular workload, you cannot
use TPC benchmarks to predict the throughput for your application. The actual
throughput that you achieve depends largely on your application.
The response time for a typical Informix application includes the following
sequence of actions. Each action requires a certain amount of time. The response
time does not include the time that it takes for the user to think of and enter a
query or request:
1. The application forwards a query to the database server.
2. The database server performs query optimization and retrieves any
user-defined routines (UDRs). UDRs include both SPL routines and external
routines.
3. The database server retrieves, adds, or updates the appropriate records and
performs disk I/O operations directly related to the query.
4. The database server performs any background I/O operations, such as logging
and page cleaning, that occur during the period in which the query or
transaction is still pending.
5. The database server returns a result to the application.
6. The application displays the information or issues a confirmation and then
issues a new prompt to the user.
Figure 1-1 contains a diagram that shows how the actions just described in steps 1
through 6 contribute to the overall response time.
[Figure 1-1 is a diagram that traces a query (a SELECT on a customer table) from
the application through database server processing and background I/O to the
rows, such as custno 1234, custname XYZLTD, that are returned and displayed.]
However, you can decrease the response time for a specific query, at the expense of
overall throughput, by allocating a disproportionate amount of resources to that
query. Conversely, you can maintain overall throughput by restricting the resources
that the database allocates to a large query.
Response-time measurement
To measure the response time for a query or application, you can use the timing
commands and performance monitoring and timing functions that your operating
system provides.
Your operating system typically has a utility that you can use to time a command.
You can often use this timing utility to measure the response times to SQL
statements that a DB-Access command file issues.
UNIX Only
If you have a command file that performs a standard set of SQL
statements, you can use the time command on many systems to obtain an
accurate timing for those commands.
The following example shows the output of the UNIX time command:
time commands.dba
...
4.3 real 1.5 user 1.3 sys
The time output lists the amount of elapsed time (real), the user CPU time,
and the system CPU time. If you use the C shell, the first three columns of
output from the C shell time command show the user, system, and elapsed
times, respectively. In general, an application often performs poorly when
the proportion of system CPU time exceeds one-third of the total elapsed
time.
The time command gathers timing information about your application. You
can use this command to invoke an instance of your application, perform a
database operation, and then exit to obtain timing figures, as the following
example illustrates:
time sqlapp
(enter SQL command through sqlapp, then exit)
10.1 real 6.4 user 3.7 sys
You can use a script to run the same test repeatedly, which allows you to
obtain comparable results under different conditions. You can also obtain
estimates of your average response time by dividing the elapsed time for
the script by the number of database operations that the script performs.
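A minimal wrapper script for such repeated tests might look like the
following sketch; the stores_demo database and the commands.dba command
file are placeholders for your own:
#!/bin/sh
# Time five identical passes of the same DB-Access command file.
for i in 1 2 3 4 5
do
    time dbaccess stores_demo commands.dba
done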
Operating systems usually have a performance monitor that you can use to
measure response time for a query or process.
Windows Only
You can often use the Performance Logs and Alerts that the Windows
operating system supplies to measure the following times:
v User time
v Processor time
Most programming languages have a library function for the time of day. If you
have access to the source code, you can insert pairs of calls to this function to
measure the elapsed time between specific actions.
ESQL/C Only
For example, if the application is written in IBM Informix ESQL/C, you
can use the dtcurrent() function to obtain the current time. To measure
response time, you can call dtcurrent() to report the time at the start of a
transaction and again to report the time when the transaction commits.
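The following fragment is a minimal sketch of that approach, assuming the
standard ESQL/C datetime library; the transaction itself and all error
handling are omitted:
#include <stdio.h>
#include <datetime.h>

void time_transaction(void)
{
    dtime_t start, end;
    char buf[32];

    /* Use an hour-to-fraction(5) qualifier for both values. */
    start.dt_qual = TU_DTENCODE(TU_HOUR, TU_F5);
    end.dt_qual = start.dt_qual;

    dtcurrent(&start);   /* time at the start of the transaction */
    /* EXEC SQL ... run the transaction ... COMMIT WORK */
    dtcurrent(&end);     /* time when the transaction commits */

    dttoasc(&start, buf);
    printf("start:  %s\n", buf);
    dttoasc(&end, buf);
    printf("commit: %s\n", buf);
}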
Although this measure is useful for planning and evaluation, it is seldom relevant
to the daily issues of achieving optimum performance.
You must take regular measurements of the workload and performance of your
system to predict peak loads and compare performance measurements at different
points in your usage cycle. Regular measurements help you to develop an overall
performance profile for your database server applications. This profile is critical in
determining how to improve performance reliably.
How you measure resource utilization depends on the tools that your operating
system provides for reporting system activity and resource utilization. After you
identify a resource that seems overused, you can use the performance-monitoring
utilities that the database server provides to gather data and make inferences about
the database activities that might account for the load on that component. You can
adjust your database server configuration or your operating system to reduce those
database activities or spread them among other components. In some cases, you
might need to provide additional hardware resources to resolve a performance
bottleneck.
Resource utilization
Whenever a system resource, such as a CPU or a particular disk, is occupied by a
transaction or query, the resource is unavailable for processing other requests.
Pending requests must wait for the resources to become available before they can
complete.
When a component is too busy to keep up with all its requests, the overused
component becomes a bottleneck in the flow of activity. The higher the percentage
of time that the resource is occupied, the longer each operation must wait for its
turn.
You can use the following formula to estimate the service time for a request based
on the overall utilization of the component that services the request. The expected
service time includes the time that is spent both waiting for and using the resource
in question. Think of service time as that portion of the response time accounted
for by a single component within your computer, as the following formula shows:
S = P/(1-U)
S is the expected service time.
P is the processing time that the operation requires after it obtains the
resource.
U is the utilization for the resource (expressed as a decimal).
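For example, if an operation requires 1 second of processing time on a resource
that is 75 percent utilized, the expected service time stretches to four times
the processing time:
S = 1/(1 - .75)
  = 1/.25
  = 4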
[Figure 1-2 is a graph: the y-axis shows elapsed time (as a multiple of
processing time) in minutes, from 0 to 12; the x-axis shows resource utilization
(%) from 0 to 100. The curve rises gradually at low utilization and climbs
steeply as utilization approaches 100 percent.]
Figure 1-2. Service Time for a Single Component as a Function of Resource Utilization
If the average response time for a typical transaction soars from 2 or 3 seconds to
10 seconds or more, users are certain to notice and complain.
When you consider resource utilization, also consider whether increasing the page
size of a standard or temporary dbspace is beneficial in your environment. If you
want a longer key length than is available for the default page size of a standard
or temporary dbspace, you can increase the page size.
CPU utilization
Estimates of CPU utilization and response time can help you determine if you
need to eliminate or reschedule some activities.
You can use the resource-utilization formula in the previous topic (“Resource
utilization” on page 1-8) to estimate the response time for a heavily loaded CPU.
However, high utilization for the CPU does not always indicate a performance
problem. The CPU performs all calculations that are needed to process
transactions. The more transaction-related calculations that it performs within a
given period, the higher the throughput will be for that period. As long as
transaction throughput is high and seems to remain proportional to CPU
utilization, a high CPU utilization indicates that the computer is being used to the
fullest advantage.
On the other hand, when CPU utilization is high but transaction throughput does
not keep pace, the CPU is either processing transactions inefficiently or it is
engaged in activity not directly related to transaction processing. In the
latter case, CPU cycles are being diverted to internal housekeeping tasks such
as memory management.
If the response time for transactions increases to such an extent that delays become
unacceptable, the processor might be swamped; the transaction load might be too
high for the computer to manage. Slow response time can also indicate that the
CPU is processing transactions inefficiently or that CPU cycles are being diverted.
When CPU utilization is high, a detailed analysis of the activities that the database
server performs can reveal any sources of inefficiency that might be present due to
improper configuration. For information about analyzing database server activity,
see “Database server tools” on page 2-3.
Memory utilization
Memory is not managed as a single component, such as a CPU or disk, but as a
collection of small components called pages.
The size of a typical page in memory can range from 1 to 8 kilobytes, depending
on your operating system. A computer with 64 megabytes of memory and a page
size of 2 kilobytes contains approximately 32,000 pages.
When the operating system needs to allocate memory for use by a process, it
scavenges any unused pages within memory that it can find. If no free pages exist,
the memory-management system has to choose pages that other processes are still
using and that seem least likely to be needed in the short run. CPU cycles are
required to select those pages. The process of locating such pages is called a page
scan. CPU utilization increases when a page scan is required.
Eventually, page images that have been copied to the swap disk must be brought
back in for use by the processes that require them. If there are still too few free
pages, more must be paged out to make room. As memory comes under increasing
demand and paging activity increases, this activity can reach a point at which the
CPU is almost fully occupied with paging activity. A system in this condition is
said to be thrashing. When a computer is thrashing, all useful work comes to a halt.
Although the principle for estimating the service time for memory is the same as
that described in “Resource utilization and performance” on page 1-7, you use a
different formula to estimate the performance impact of memory utilization than
you do for other system components.
You can use the following formula to calculate the expected paging delay for a
given CPU utilization level and paging rate:
PD = (C/(1-U)) * R * T
PD is the paging delay.
C is the CPU service time for a transaction.
U is the CPU utilization (expressed as a decimal).
R is the paging-out rate.
T is the service time for the swap device.
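For example, if a transaction requires .5 seconds of CPU service time, the CPU
is 60 percent utilized, pages are written out at a rate of 10 per second, and
the swap device has a service time of .03 seconds, the expected paging delay is:
PD = (.5/(1 - .6)) * 10 * .03
   = 1.25 * .3
   = .375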
As paging increases, CPU utilization also increases, and these increases are
compounded. If a paging rate of 10 per second accounts for 5 percent of CPU
utilization, increasing the paging rate to 20 per second might increase CPU
utilization by an additional 5 percent. Further increases in paging lead to even
sharper increases in CPU utilization, until the expected service time for CPU
requests becomes unacceptable.
Disk utilization
Because transfer rates vary among disks, most operating systems do not report
disk utilization directly. Instead, they report the number of data transfers per
second (in operating-system memory-page-size units).
Because each disk acts as a single resource, you can use the following basic
formula to estimate the service time, which is described in detail in “Resource
utilization” on page 1-8:
S = P/(1-U)
To compare the load on disks with similar access times, simply compare the
average number of transfers per second.
If you know the access time for a given disk, you can use the number of transfers
per second that the operating system reports to calculate utilization for the disk. To
do so, multiply the average number of transfers per second by the access time for
the disk as listed by the disk manufacturer. Depending on how your data is laid
The following example shows how to calculate the utilization for a disk with a
30-millisecond access time and an average of 10 transfer requests per second:
U = (A * 1.2) * X
= (.03 * 1.2) * 10
= .36
U is the resource utilization (in this case, of a disk).
A is the access time (in seconds) that the manufacturer lists.
X is the number of transfers per second that your operating system reports.
You can use the utilization to estimate the processing time at the disk for a
transaction that requires a given number of disk transfers. To calculate the
processing time at the disk, multiply the number of disk transfers by the average
access time. Include an extra 20 percent to account for access-time variability:
P = D (A * 1.2)
P is the processing time at the disk.
D is the number of disk transfers.
A is the access time (in seconds) that the manufacturer lists.
For example, you can calculate the processing time for a transaction that requires
20 disk transfers from a 30-millisecond disk as follows:
P = 20 (.03 * 1.2)
= 20 * .036
= .72
Use the processing time and utilization values that you calculated to estimate the
expected service time for I/O at the particular disk, as the following example
shows:
S = P/(1-U)
= .72 / (1 - .36)
= .72 / .64
= 1.13
You must consider these factors when you attempt to identify performance
problems or make adjustments to your system:
v Hardware resources
As discussed earlier in this chapter, hardware resources include the CPU,
physical memory, and disk I/O subsystems.
v Operating-system configuration
The database server depends on the operating system to provide low-level
access to devices, process scheduling, interprocess communication, and other
vital services.
The database server administrator usually coordinates the activities of all users to
ensure that system performance meets overall expectations. For example, the
operating-system administrator might need to reconfigure the operating system to
increase the amount of shared memory. Bringing down the operating system to
install the new configuration requires bringing the database server down. The
database server administrator must schedule this downtime and notify all affected
users when the system will be unavailable.
This chapter also contains cross-references to topics about how to interpret the
results of performance monitoring.
The kinds of data that you need to collect depend on the kinds of applications that
you run on your system. The causes of performance problems on OLTP (online
transaction processing) systems are different from the causes of problems on
systems that are used primarily for DSS query applications. Systems with mixed
use provide a greater performance-tuning challenge and require a sophisticated
analysis of performance-problem causes.
To alter certain database server characteristics, you must bring down the database
server, which can affect your production system. Some configuration adjustments
can unintentionally decrease performance or cause other negative side effects.
When performance problems relate to backup operations, you might also examine
the number or transfer rates for tape drives. You might need to alter the layout or
fragmentation of your tables to reduce the impact of backup operations. For
information about disk layout and table fragmentation, see Chapter 6, “Table
performance considerations,” on page 6-1 and Chapter 7, “Indexes and index
performance considerations,” on page 7-1.
Determine whether you want to set the configuration parameters that help
maintain server performance by automatically adjusting properties of the database
server while it is running, for example:
v AUTO_AIOVPS: Adds AIO virtual processors when I/O workload increases.
v AUTO_CKPTS: Increases the frequency of checkpoints to avoid transaction
blocking.
If a history is not available, you must start tracking performance after a problem
arises, and you might not be able to tell when and how the problem began. Trying
to identify problems after the fact significantly delays resolution of a performance
problem.
To build a performance history and profile of your system, take regular snapshots
of resource-utilization information.
For example, if you chart the CPU utilization, paging-out rate, and the I/O transfer
rates for the various disks on your system, you can begin to identify peak-use
levels, peak-use intervals, and heavily loaded resources.
If you monitor fragment use, you can determine whether your fragmentation
scheme is correctly configured. Monitor other resource use as appropriate for your
database server configuration and the applications that run on it.
You can also use performance-monitoring tools with a graphical interface to
monitor critical aspects of performance as queries and transactions are
performed.
Operating-system tools
The database server relies on the operating system of the host computer to provide
access to system resources such as the CPU, memory, and various unbuffered disk
I/O interfaces and files. Each operating system has its own set of utilities for
reporting how system resources are used.
UNIX Only
The following table lists some UNIX utilities that monitor system resources.
To capture the status of system resources at regular intervals, use scheduling tools
that are available with your host operating system (for example, cron) as part of
your performance monitoring system.
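For example, a crontab entry like the following (a sketch; the capture script
and the log path are hypothetical) saves a snapshot of system and database
server activity every 15 minutes:
# Run a script that captures vmstat and onstat -p output every 15 minutes.
0,15,30,45 * * * * /usr/local/bin/capture_perf.sh >> /var/log/perf_history.log 2>&1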
Windows Only
You can often use the Performance Logs and Alerts that the Windows operating
system supplies to monitor resources such as processor, memory, cache, threads,
and processes. The Performance Logs and Alerts also provide charts, alerts,
reports, and the ability to save information to log files for later analysis.
For more information about how to use the Performance Logs and Alerts, consult
your operating-system manuals.
The database server tools and utilities that you can use for performance
monitoring include:
v IBM OpenAdmin Tool (OAT) for Informix
v The onstat utility
v The onlog utility
v The oncheck utility
v The ON-Monitor utility (on UNIX only)
v The onperf utility (on UNIX only)
v DB-Access and the system-monitoring interface (SMI), which you can use to
monitor performance from within your application
v SQL administration API commands
You can use onstat, onlog, or oncheck commands invoked by the cron scheduling
facility to capture performance-related information at regular intervals and build a
historical performance profile of your database server application. The following
sections describe these utilities.
You can use SQL SELECT statements to query the system-monitoring interface
(SMI) from within your application.
The SMI tables are a collection of tables and pseudo-tables in the sysmaster
database that contain dynamically updated information about the operation of the
database server. The database server constructs these tables in memory but does
not record them on disk. The onstat utility options obtain information from these
SMI tables.
You can use cron and SQL scripts with DB-Access or onstat utility options to
query SMI tables at regular intervals.
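For example, the following query (a sketch against the sysprofile table in the
sysmaster database; the counter names are a selection) retrieves server-wide
disk and buffer read/write counters:
DATABASE sysmaster

SELECT name, value
FROM sysprofile
WHERE name IN ("dskreads", "bufreads", "dskwrits", "bufwrits")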
Tip: The SMI tables are different from the system catalog tables. System catalog
tables contain permanently stored and updated information about each database
and its tables (sometimes referred to as metadata or a data dictionary).
You can use ON-Monitor to check the current database server configuration.
You can use onperf to display database server activity with the Motif window
manager.
Related concepts:
The onlog utility (Administrator's Reference)
The oncheck Utility (Administrator's Reference)
DB-Access User's Guide (DB-Access Guide)
System catalog tables (SQL Reference)
Chapter 14, “The onperf utility on UNIX,” on page 14-1
The database server CD-ROM distributed with your product includes ISA. For
information on how to install ISA, see the following file on the CD-ROM.
Table 2-1. Operating system file
Operating System File
UNIX /SVR_ADM/README
Windows \SVR_ADM\readme.txt
With ISA, you can use a browser to perform these common database server
administrative tasks:
v Change configuration parameters temporarily or permanently
v Change the database server mode between online and offline and its
intermediate states
v Modify connectivity information in the sqlhosts file
v Check dbspaces, sbspaces, logs, and other objects
v Manage logical and physical logs
v Examine memory use and add and free memory segments
v Read the message log
v Back up and restore dbspaces and sbspaces
v Run various onstat commands to monitor performance
v Enter simple SQL statements and examine database schemas
v Add and remove chunks, dbspaces, and sbspaces
v Examine and manage user sessions
v Examine and manage virtual processors (VPs)
v Use the High-Performance Loader (HPL), dbimport, and dbexport
v Manage Enterprise Replication
v Manage an IBM Informix MaxConnect server
v Use the following utilities: dbaccess, dbschema, onbar, oncheck, ondblog,
oninit, onlog, onmode, onparams, onspaces, and onstat
You also can enter any Informix utility, UNIX shell command, or Windows
command (for example, oncheck -cd; ls -l).
For a complete list of all onstat options, use the onstat -- command. For a
complete display of all the information that onstat gathers, use the onstat -a
command.
The following table lists some of the onstat commands that display general
performance-related information.
Table 2-2. onstat commands that display performance information

onstat -p
       Displays a performance profile that includes the number of reads and
       writes, the number of times that a resource was requested but was not
       available, and other miscellaneous information.

onstat -b
       Displays information about buffers currently in use.

onstat -l
       Displays information about the physical and logical logs.

onstat -x
       Displays information about transactions, including the thread identifier
       of the user who owns the transaction.

onstat -u
       Displays a user activity profile that provides information about user
       threads, including the thread owner's session ID and login name.

onstat -R
       Displays information about buffer pools, including information about
       buffer pool page size.

onstat -F
       Displays page-cleaning statistics that include the number of writes of
       each type that flushes pages to disk.

onstat -g
       Requires an additional argument that specifies the information to be
       displayed.
One of the most useful commands for monitoring system resources is onstat -g
and its many options.
Related reference:
onstat -g monitoring options (Administrator's Reference)
Use the following onstat -g command arguments to monitor disk I/O utilization.
For a detailed case study that uses various onstat outputs, see Appendix A, “Case
studies and examples,” on page A-1.
The oncheck utility provides the following options and information that apply to
contiguous space and extents.
Option Information
-pB Blobspace simple large object (TEXT or BYTE data)
For information about how to use this option to determine the efficiency of
blobpage size, see “Determine blobpage fullness with oncheck -pB output”
on page 5-18.
-pe Chunks and extents
For information about how to use this option to monitor extents, see
“Checking for extent interleaving” on page 6-24 and “Eliminating interleaved
extents” on page 6-24.
-pk Index key values.
For information about how to improve the performance of this option, see
“Improving performance for index checks” on page 7-20.
-pK Index keys and row IDs
For information about how to improve the performance of this option, see
“Improving performance for index checks” on page 7-20.
-pl Index-leaf key values
For information about how to improve the performance of this option, see
“Improving performance for index checks” on page 7-20.
-pL Index-leaf key values and row IDs
For information about how to improve the performance of this option, see
“Improving performance for index checks” on page 7-20.
-pp Pages by table or fragment
For information about how to use this option to monitor space, see
“Considering the upper limit on extents” on page 6-24.
For information about how to use this option to monitor extents, see
“Considering the upper limit on extents” on page 6-24.
-pr Root reserved pages
For information about how to use this option, see “Estimating tables with
fixed-length rows” on page 6-5.
-ps Space used by smart large objects and metadata in sbspace.
-pS Space used by smart large objects and metadata in sbspace and storage
characteristics
For information about how to use this option to monitor space, see
“Monitoring sbspaces” on page 6-13.
-pt Space used by table or fragment
For information about how to use this option to monitor space, see
“Estimating table size” on page 6-5.
-pT Space used by table, including indexes
For information about how to use this option to monitor space, see
“Performance of in-place alters for DDL operations” on page 6-40.
For more information about using oncheck to monitor space, see “Estimating table
size” on page 6-5. For more information about concurrency during oncheck
execution, see “Improving performance for index checks” on page 7-20.
Related concepts:
The oncheck Utility (Administrator's Reference)
Monitor transactions
You can use the onlog and onstat utilities to monitor transactions.
The onlog utility can take input from selected log files, the entire logical log, or a
backup tape of previous log files.
Use onlog with caution when you read logical-log files still on disk, because
attempting to read unreleased log files stops other database activity. For greatest
safety, back up the logical-log files first and then read the contents of the backup
files. With proper care, you can use the onlog -n option to restrict onlog only to
logical-log files that have been released.
Related reference:
The onstat utility (Administrator's Reference)
To monitor database server activity, you can view the number of active sessions
and the amount of resources that they are using.
Use the following command arguments to get memory information for each
session.
For more information, see “Display the query plan” on page 13-1.
Multiple database server instances that run on the same host computer perform
poorly when compared with a single database server instance that manages
multiple databases. Multiple database server instances cannot balance their loads
as effectively as a single database server. Avoid multiple residency for production
environments in which performance is critical.
Each instance of the database server requires the following semaphore sets:
v One set for each group of up to 100 virtual processors (VPs) that are started with
the database server
v One set for each additional VP that you might add dynamically while the
database server is running
v One set for each group of 100 or fewer user sessions connected through the
shared-memory communication interface
Tip: For best performance, allocate enough semaphores for double the number of
ipcshm connections that you expect. Use the NETTYPE configuration parameter to
configure database server poll threads for this doubled number of connections.
Some operating systems require that you configure a maximum total number of
semaphores across all sets, which the SEMMNS operating-system configuration
parameter typically specifies. Use the following formula to calculate the total
number of semaphores that each instance of the database server requires:
SEMMNS = init_vps + added_vps + (2 * shmem_users) + concurrent_utils
init_vps
is the number of virtual processors (VPs) that are started with the database
server. This number includes CPU, PIO, LIO, AIO, SHM, TLI, SOC, and
ADM VPs. The minimum value is 15.
added_vps
is the number of VPs that you intend to add dynamically.
shmem_users
is the number of shared-memory connections that you allow for this
instance of the database server.
concurrent_utils
is the number of concurrent database server utilities that can connect to
this instance. It is suggested that you allow for a minimum of six utility
connections: two for ON-Archive or ON-Bar and four for other utilities
such as onstat and oncheck.
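For example, an instance that starts with the minimum of 15 VPs, might add 5 VPs
dynamically, allows 100 shared-memory connections, and reserves the suggested
six utility connections requires:
SEMMNS = 15 + 5 + (2 * 100) + 6
       = 226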
If you use software packages that require semaphores, the SEMMNI configuration
parameter must include the total number of semaphore sets that the database
server and your other software packages require. You must set the SEMMSL
configuration parameter to the largest number of semaphores per set that any of
your software packages require. For systems that require the SEMMNS
configuration parameter, multiply SEMMNI by the value of SEMMSL to calculate
an acceptable value.
Related concepts:
“Configuring poll threads” on page 3-13
The number of open file descriptors that each instance of the database server needs
depends on the number of chunks in your database, the number of VPs that you
run, and the number of network connections that your database server instance
must support.
Use the following formula to calculate the number of file descriptors that your
instance of the database server requires:
NFILES = (chunks * NUMAIOVPS) + NUMBER_of_CPU_VPS + net_connections
chunks is the number of chunks to be configured.
net_connections
is the number of network connections that you specify in either of the
following places:
Each open file descriptor is about the same length as an integer within the kernel.
Allocating extra file descriptors is an inexpensive way to allow for growth in the
number of chunks or connections on your system.
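For example, an instance with 20 chunks, 6 AIO VPs, 4 CPU VPs, and 200 network
connections requires:
NFILES = (20 * 6) + 4 + 200
       = 324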
Insufficient physical memory for the overall system load can lead to thrashing, as
“Memory utilization” on page 1-10 describes. Insufficient memory for the database
server can result in excessive buffer-management activity. For more information
about configuring memory, see “Configuring UNIX shared memory” on page 4-6.
Informix runs in the background. For best performance, give the same priority to
foreground and background applications.
The configuration of memory in the operating system can impact other resources,
including CPU and I/O. Insufficient physical memory for the overall system load
can lead to thrashing, as “Memory utilization” on page 1-10 describes. Insufficient
memory for Informix can result in excessive buffer-management activity. When you
set the Virtual Memory values in the System icon on the Control Panel, ensure
that you have enough paging space for the total amount of physical memory.
A client can also use the SET PDQPRIORITY statement in SQL to set a value for
PDQ priority. The actual percentage allocated to any query is subject to the factor
that the MAX_PDQPRIORITY configuration parameter sets. For more information
about how to limit resources that can be allocated to a query, see “Limiting PDQ
resources in queries” on page 3-11.
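For example, a client session can request up to half of the available PDQ
resources (subject to the MAX_PDQPRIORITY cap) with the following statement:
SET PDQPRIORITY 50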
To execute user-defined routines (UDRs), you can define a new class of virtual
processors to isolate UDR execution from other transactions that execute on the
CPU virtual processors. Typically you write user-defined routines to support
user-defined data types.
If you do not want a user-defined routine to affect the normal processing of user
queries in the CPU class, you can use the CREATE FUNCTION statement to assign
the routine to a user-defined class of virtual processors. The class name that you
specify in the VPCLASS configuration parameter must match the name specified in
the CLASS modifier of the CREATE FUNCTION statement.
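The following sketch shows this pairing; the class name udr, the function, and
the shared-object path are illustrative only. In the onconfig file:
VPCLASS udr,num=2

In the CREATE FUNCTION statement:
CREATE FUNCTION area(radius FLOAT)
   RETURNING FLOAT
   WITH (CLASS = "udr")
   EXTERNAL NAME "/usr/local/lib/circle.so(area)"
   LANGUAGE C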
When the database server starts, the number of CPU VPs is automatically
increased to half the number of CPU processors on the database server computer,
unless the SINGLE_CPU_VP configuration parameter is enabled. However, you
might want to change the number of CPU VPs based on your performance needs.
When you use this setting, one processor is available to run the database
server utilities or the client application.
Multiprocessor computers that are not primarily database servers
For multiprocessor systems that you do not use primarily to support
database servers, you can start with somewhat fewer CPU VPs to allow for
other activities on the system and then gradually add more if necessary.
Multi-core or hardware multithreading computers with logical CPUs
For multiprocessor systems that use multi-core processors or hardware
multithreading to support more logical CPUs than physical processors, you
can assign the number of CPU VPs according to the number of logical
CPU VPs available for that purpose. The amount of processing that an
additional logical CPU can provide might be only a fraction of what a
dedicated physical processor can support.
On systems where multi-core processors are installed, the optimal configuration
in most cases is the same as for systems with a number of individual processors
equal to the total number of cores.
Your database server distribution includes a machine notes file that contains
information about whether your version of the database server supports this
feature. For information about where to find this machine notes file, see the
Introduction to this guide.
Specify the noage option of VPCLASS if your operating system supports this
feature.
Related reference:
VPCLASS configuration parameter (Administrator's Reference)
You can use processor affinity for the purposes that the following sections describe.
Related reference:
VPCLASS configuration parameter (Administrator's Reference)
You can use processor affinity to distribute the computation impact of CPU virtual
processors (VPs) and other processes. On computers that are dedicated to the
database server, assigning CPU VPs to all but one of the CPUs achieves maximum
CPU utilization.
On computers that support both database server and client applications, you can
bind applications to certain CPUs through the operating system. By doing so, you
effectively reserve the remaining CPUs for use by database server CPU VPs, which
you bind to the remaining CPUs with the VPCLASS configuration parameter. Set
the aff option of the VPCLASS configuration parameter to the numbers of the
CPUs on which to bind CPU VPs. For example, the following VPCLASS setting
assigns CPU VPs to processors 4 to 7:
VPCLASS cpu,num=4,aff=(4-7)
When specifying a range of processors, you can also specify an incremental value
with the range that indicates which CPUs in the range should be assigned to the
virtual processors. For example, you can specify that the virtual processors are
assigned to every other CPU in the range 0-6, starting with CPU 0.
VPCLASS cpu,num=4,aff=(0-6/2)
When you specify more than one value or range, the values and ranges do not
have to be incremental or in any particular order. For example you can specify
aff=(8,12,7-9,0-6/2).
The database server assigns CPU virtual processors to CPUs in a circular pattern,
starting with the first processor number that you specify in the aff option. If you
specify a larger number of CPU virtual processors than physical CPUs, the
database server continues to assign CPU virtual processors starting with the first
CPU. For example, suppose you specify the following VPCLASS settings:
VPCLASS cpu,num=8,aff=(4-7)
The first four CPU VPs are assigned to processors 4, 5, 6, and 7, and the
database server assigns the remaining four CPU VPs to processors 4, 5, 6, and 7
again.
On a system that runs database server and client (or other) applications, you can
bind asynchronous I/O (AIO) VPs to the same CPUs to which you bind other
application processes through the operating system. In this way, you isolate client
applications and database I/O operations from the CPU VPs.
This isolation can be especially helpful when client processes are used for data
entry or other operations that require waiting for user input. Because AIO VP
activity usually comes in quick bursts followed by idle periods waiting for the
disk, you can often interweave client and I/O operations without their unduly
impacting each other.
Binding a CPU VP to a processor does not prevent other processes from running
on that processor. Application (or other) processes that you do not bind to a CPU
are free to run on any available processor. On a computer that is dedicated to the
database server, you can leave AIO VPs free to run on any processor, which
reduces delays on database operations that are waiting for I/O. Increasing the
priority of AIO VPs can further improve performance by ensuring that data is
processed quickly once it arrives from disk.
The database server assigns CPU VPs to CPUs serially, starting with the CPU
number you specify in this parameter. You might want to avoid assigning CPU
VPs to a certain CPU that has a specialized hardware or operating-system function
(such as interrupt handling).
If your operating system does not support kernel asynchronous I/O (KAIO), the
database server uses AIO virtual processors (VPs) to manage all database I/O
requests.
If the VPCLASS configuration parameter does not specify the number of AIO VPs
to start in the onconfig file, then the setting of the AUTO_AIOVPS configuration
parameter controls the number of AIO VPs:
v If AUTO_AIOVPS is set to 1 (on), the number of AIO VPs initially started is
equal to the number of AIO chunks, up to a maximum of 128.
The recommended number of AIO virtual processors depends on how many disks
your configuration supports. If KAIO is not implemented on your platform, you
should allocate one AIO virtual processor for each disk that contains database
tables. You can add an additional AIO virtual processor for each chunk that the
database server accesses frequently.
You can use the AUTO_AIOVPS configuration parameter to enable the database
server to automatically increase the number of AIO virtual processors and
page-cleaner threads when the server detects that AIO virtual processors are not
keeping up with the I/O workload.
The machine notes file for your version of the database server indicates whether
the operating system supports KAIO. If KAIO is supported, the machine notes
describe how to enable KAIO on your specific operating system.
If your operating system supports KAIO, the CPU VPs make asynchronous I/O
requests to the operating system instead of AIO virtual processors. In this case,
configure only one AIO virtual processor, plus two additional AIO virtual
processors for every file chunk that does not use KAIO.
If you use cooked files and if you enable direct I/O using the DIRECT_IO
configuration parameter, you can reduce the number of AIO virtual processors. If
the database server implements KAIO and if direct I/O is enabled, the database
server attempts to use KAIO, so you probably do not need more than one AIO
virtual processor. However, temporary dbspaces do not use direct I/O; if you have
temporary dbspaces, you will probably need more than one AIO virtual processor.
Even when direct I/O is enabled with the DIRECT_IO configuration parameter, if
the file system does not support either direct I/O or KAIO, you still must allocate
two additional AIO virtual processors for every active dbspace chunk that is not
using KAIO.
The goal in allocating AIO virtual processors is to allocate enough of them so that
the lengths of the I/O request queues are kept short (that is, the queues have as
few I/O requests in them as possible). When the I/O request queues remain
consistently short, I/O requests are processed as fast as they occur. Use the onstat
-g ioq command to monitor the length of the I/O queues for the AIO virtual
processors.
Allocate enough AIO VPs to accommodate the peak number of I/O requests.
Generally, allocating a few extra AIO VPs is not detrimental. To start additional
AIO VPs while the database server is in online mode, use the onmode -p
command. You cannot drop AIO VPs in online mode.
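For example, a command of the following form starts two additional AIO VPs
while the server is online (the count of 2 is illustrative):
onmode -p +2 aio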
Related reference:
AUTO_AIOVPS configuration parameter (Administrator's Reference)
VPCLASS configuration parameter (Administrator's Reference)
The number of CPU VPs is used as a factor in determining the number of scan
threads for a query. Queries perform best when the number of scan threads is a
multiple (or factor) of the number of CPU VPs. Adding or removing a CPU VP can
improve performance for a large query because it produces an equal distribution of
scan threads among CPU VPs. For instance, if you have 6 CPU VPs and scan 10
table fragments, you might see a faster response time if you reduce the number of
CPU VPs to 5, which divides evenly into 10. You can use onstat -g ath to monitor
the number of scan threads per CPU VP or use onstat -g ses to focus on a
particular session.
Related reference:
MULTIPROCESSOR configuration parameter (Administrator's Reference)
Important: If you set the SINGLE_CPU_VP parameter to 1, the value of the num
option of the VPCLASS configuration parameter must also be 1.
Note: The database server treats user-defined virtual-processor classes (that is, VPs
defined with VPCLASS) as if they were CPU VPs. Thus, if you set
SINGLE_CPU_VP to nonzero, you cannot create any user-defined classes.
When you set the SINGLE_CPU_VP parameter to 1, you cannot add CPU VPs
while the database server is in online mode.
Related reference:
SINGLE_CPU_VP configuration parameter (Administrator's Reference)
VPCLASS configuration parameter (Administrator's Reference)
For a DSS query, you should set the value of OPTCOMPIND to 2 or 1, and you
should be sure that the isolation level is not set to Repeatable Read. For an OLTP
query, you could set the value to 0 or 1 with the isolation level not set to
Repeatable Read.
The value that you enter using the SET ENVIRONMENT OPTCOMPIND
command takes precedence over the default setting specified in the ONCONFIG
file. The default OPTCOMPIND setting is restored when the current session
terminates. No other user sessions are affected by SET ENVIRONMENT
OPTCOMPIND statements that you execute.
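For example, a DSS session might select the recommended setting for the duration
of the session with the following statement:
SET ENVIRONMENT OPTCOMPIND '2';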
Related concepts:
OPTCOMPIND Environment Option (SQL Syntax)
For more information about how to control the use of PDQ resources, see “The
allocation of resources for parallel database queries” on page 12-7.
Related reference:
MAX_PDQPRIORITY configuration parameter (Administrator's Reference)
To calculate the number of scan threads allocated to a query, use the following
formula:
scan_threads = min (nfrags, (DS_MAX_SCANS * pdqpriority / 100
* MAX_PDQPRIORITY / 100) )
nfrags is the number of fragments in the table with the largest number of
fragments.
pdqpriority
is the PDQ priority value set by either the PDQPRIORITY environment
variable or the SQL statement SET PDQPRIORITY.
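For example, suppose the largest table in a query has 8 fragments, DS_MAX_SCANS
is 10, the session PDQ priority is 80, and MAX_PDQPRIORITY is 50 (all values
illustrative):
scan_threads = min (8, (10 * 80 / 100 * 50 / 100)) = min (8, 4) = 4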
Reducing the number of scan threads can reduce the time that a large query waits
in the ready queue, particularly when many large queries are submitted
concurrently. However, if the number of scan threads is less than nfrags, the query
takes longer once it is underway.
You typically include a separate NETTYPE parameter for each connection type that
is associated with a dbservername. You list dbservernames in the
DBSERVERNAME and DBSERVERALIASES configuration parameters. You
associate connection types with dbservernames in the sqlhosts information. For
details about connection types and the sqlhosts information, see the connectivity
information in your IBM Informix Administrator's Guide.
Related reference:
“UNIX semaphore parameters” on page 3-1
NETTYPE configuration parameter (Administrator's Reference)
NETTYPE entries are required for connection types that are used for outgoing
communication only, even if those connection types are not listed in the sqlhosts
information.
UNIX Only
The following protocols apply to UNIX platforms:
v IPCSHM
v TLITCP
v IPCSTR
v SOCTCP
v TLIIMC
v SOCIMC
v SQLMUX
v SOCSSL
Windows Only
The following protocols apply to Windows platforms:
v SOCTCP
v IPCNMP
v SQLMUX
v SOCSSL
A poll thread can support 1024 or perhaps more connections. If the FASTPOLL
configuration parameter is enabled, you might be able to configure fewer poll
threads, but should test the performance to determine the optimal configuration
for your environment.
Each NETTYPE entry configures the number of poll threads for a specific
connection type, the number of connections per poll thread, and the
virtual-processor class in which those poll threads run, using the following
comma-separated fields. There can be no white space within or between these
fields.
NETTYPE connection_type,poll_threads,conn_per_thread,vp_class
connection_type
identifies the protocol-interface combination to which the poll threads are
assigned. You typically set this field to match the connection_type field of a
dbservername entry that is in the sqlhosts information.
poll_threads
is the number of poll threads assigned to the connection type.
conn_per_thread
is the number of connections per poll thread. Use the following formula to
calculate this number:
conn_per_thread = connections / poll_threads
connections
is the maximum number of connections that you expect the
indicated connection type to support. For shared-memory
connections (ipcshm), double the number of connections for best
performance.
This field is used only for shared memory connections on
Windows. Other connection methods on Windows ignore this
value.
For shared memory connection, the value of conn_per_thread is the
maximum number of connections per thread. For network
connections, the value of conn_per_thread can be exceeded.
vp_class
is the class of virtual processor that can run the poll threads. Specify CPU
if you have a single poll thread that runs on a CPU VP. For best
performance, specify NET if you require more than one poll thread.
If the value of conn_per_thread exceeds 350 and the number of poll threads for the
current connection type is less than the number of CPU VPs, you can improve
performance by specifying the CPU class, adding poll threads (do not exceed
the number of CPU VPs), and recalculating the value of conn_per_thread. The
default value for conn_per_thread is 50.
For ipcshm, the number of poll threads corresponds to the number of memory
segments. For example, if NETTYPE is set to ipcshm,3,100 and you want one poll
thread instead, set NETTYPE to ipcshm,1,300.
Important: You should carefully distinguish between poll threads for network
connections and poll threads for shared memory connections, which should run
one per CPU virtual processor. TCP connections should only be in network virtual
processors, and you should only have the minimum needed to maintain
responsiveness. Shared memory connections should only be in CPU virtual
processors and should run in every CPU virtual processor.
Related concepts:
“Improve connection performance and scalability” on page 3-16
Related reference:
Informix SQL sessions can migrate across CPU VPs. You can improve the
performance and scalability of network connections on UNIX by using the
NUMFDSERVERS configuration parameter to specify the number of poll threads
to use when distributing a TCP/IP connection across VPs. Specifying
NUMFDSERVERS information is useful if the database server has a high rate of
new connect and disconnect requests or if you find a high amount of contention
for network shared file (NSF) locks.
You should also review and, if necessary, change the information in the NETTYPE
configuration parameter, which defines the number of poll threads for a specific
connection type, the number of connections per poll thread, and the
virtual-processor class in which those poll threads run. You specify NETTYPE
configuration parameter information as follows:
NETTYPE connection_type,poll_threads,conn_per_thread,vp_class
For example, suppose you specify 8 poll threads in the NETTYPE configuration
parameter, as follows:
NETTYPE soctcp,8,300,NET
This entry runs 8 poll threads on network virtual processors, each handling
approximately 300 soctcp connections.
You can use the NS_CACHE configuration parameter to define the maximum
retention time for an individual entry in the host name/IP address cache, the
service cache, the user cache, and the group cache. The server can get information
from the cache faster than it does when querying the operating system.
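For example, an entry of the following form sets a 900-second maximum retention
time for each of the four caches (the values shown are illustrative):
NS_CACHE host=900,service=900,user=900,group=900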
You can improve service for connection requests by using multiple listen threads.
When you specify DBSERVERNAME and DBSERVERALIASES configuration
parameter information for onimcsoc or onsoctcp protocols, you can specify the
number of multiple listen threads for the database server aliases in your sqlhosts
information. The default number of listen threads is 1.
You can use the onstat -g ath command to display information about all threads.
Related concepts:
Name service maximum retention time set in the NS_CACHE configuration
parameter (Administrator's Guide)
“Specifying the number of connections and poll threads” on page 3-14
“Monitor threads with onstat -g ath output” on page 13-49
Related reference:
NETTYPE configuration parameter (Administrator's Reference)
NUMFDSERVERS configuration parameter (Administrator's Reference)
NS_CACHE configuration parameter (Administrator's Reference)
DBSERVERNAME configuration parameter (Administrator's Reference)
DBSERVERALIASES configuration parameter (Administrator's Reference)
Multiple listen threads (Administrator's Guide)
For example, if you have more than 300 concurrent connections with the database
server, you can enable the FASTPOLL configuration parameter for better
performance.
Related reference:
FASTPOLL configuration parameter (Administrator's Reference)
However, you must use this capability with care; the database server dynamically
allocates buffers of the indicated sizes for active connections. Unless you carefully
size buffers, they can use large amounts of memory. For details on how to size
network buffers, see “Network buffer size” on page 3-19.
The database server dynamically allocates network buffers from the global memory
pool for request messages from clients. After the database server processes client
requests, it returns buffers to a common network buffer pool that is shared among
sessions that use SOCTCP, IPCSTR, or TLITCP network connections.
The free network buffer pool can grow during peak activity periods. To prevent
large amounts of unused memory from remaining in these network buffer pools
when network activity is no longer high, the database server returns free buffers
when the number of free buffers reaches specific thresholds.
As the system administrator, you can control the free buffer thresholds and the size
of each buffer with the following methods:
v NETTYPE configuration parameter
v IFX_NETBUF_PVTPOOL_SIZE environment variable
v IFX_NETBUF_SIZE environment variable and b (client buffer size) option in the
sqlhosts information
Network buffers
The database server implements a threshold of free network buffers to prevent
frequent allocations and deallocations of shared memory for the network buffer
pool. This threshold enables the database server to correlate the number of free
network buffers with the number of connections that you specify in the NETTYPE
configuration parameter.
The database server dynamically allocates network buffers for request messages
from clients. After the database server processes client requests, it returns buffers
to the network free-buffer pool.
If the number of free buffers is greater than the threshold, the database server
returns the memory allocated to buffers over the threshold to the global pool.
The database server uses the following formula to calculate the threshold for the
free buffers in the network buffer pool:
free network buffers threshold =
100 + (0.7 * number_connections)
The value for number_connections is the total number of connections that you
specified in the third field of the NETTYPE entry for the different type of network
connections (SOCTCP, IPCSTR, or TLITCP). This formula does not use the
NETTYPE entry for shared memory (IPCSHM).
If you do not specify a value in the third field of the NETTYPE parameter, the
database server uses the default value of 50 connections for each NETTYPE entry
corresponding to the SOCTCP, TLITCP, and IPCSTR protocols.
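For example, if the NETTYPE entries for network connections specify a total of
200 connections (an illustrative value), the threshold is:
free network buffers threshold = 100 + (0.7 * 200) = 240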
For situations in which many connections and sessions are constantly active, these
private network buffers have the following advantages:
v Less contention for the common network buffer pool
v Fewer CPU resources to allocate and deallocate network buffers to and from the
common network buffer pool for each network transfer
Use the onstat utility commands in the following table to monitor the network
buffer usage.
The onstat -g ntu command displays the following format for the q-pvt output
field:
current number / highest number
If the number of free buffers (value in q-pvt field) is consistently 0, you can
perform one of the following actions:
v Increase the number of buffers with the environment variable
IFX_NETBUF_PVTPOOL_SIZE.
v Increase the size of each buffer with the environment variable
IFX_NETBUF_SIZE.
The q-exceeds field indicates the number of times that the threshold for the shared
network free-buffer pool was exceeded. When this threshold is exceeded, the
database server returns the unused network buffers (over this threshold) to the
global memory pool in shared memory. Optimally, this value should be 0 or a low
number so that the server is not allocating or deallocating network buffers from
the global memory pool.
Related reference:
IFX_NETBUF_PVTPOOL_SIZE environment variable (UNIX) (SQL Reference)
IFX_NETBUF_SIZE environment variable (SQL Reference)
Increase the value of IFX_NETBUF_SIZE if you know that clients send greater
than 4-kilobyte packets. Clients send large packets during any of the following
situations:
v Loading a table
v Inserting rows greater than 4 kilobytes
v Sending simple large objects
You can use the following onstat command to see the network buffer size:
onstat -g afr global | grep net
The size field in the output shows the network buffer size in bytes.
Related reference:
Connectivity configuration (Administrator's Guide)
IFX_NETBUF_SIZE environment variable (SQL Reference)
You can use onmode -p or ON-Monitor to start additional VPs for the following
classes while the database server is online: CPU, AIO, PIO, LIO, SHM, TLI, and
SOC. You can drop VPs of the CPU class only while the database server is online.
Adding more VPs can increase the load on CPU resources, so if the NETTYPE
value indicates that an available CPU VP can handle the poll thread, the database
server assigns the poll thread to that CPU VP. If all the CPU VPs have poll threads
assigned to them, the database server adds a second network VP to handle the poll
thread.
During start up, the database server calculates a target number of CPU VPs that
represents an even number equal to or greater than half the number of CPU
processors and compares the target number with the currently allocated number of
CPU VPs. The database server adds the necessary number of CPU VPs to equal the
target number.
If fewer than eight CPU VPs are configured, the server can dynamically add CPU
VPs to a total (configured plus added) of eight.
Use the auto_tune_cpu_vps task in the Scheduler to control the automatic addition
of CPU VPs. To prevent the automatic addition of CPU VPs, disable the
auto_tune_cpu_vps task in the ph_task table in the sysadmin database:
UPDATE ph_task
SET tk_enable = 'F'
WHERE tk_name = 'auto_tune_cpu_vps';
The following table shows possible configurations and how many CPU VPs would
be added automatically in each situation.
Table 3-1. Example of how CPU VPs are automatically added
CPU processors   Target CPU VPs   Allocated CPU VPs   Automatically added CPU VPs
8                4                3                   1
3                2                2                   0
24               8                6                   2
Related concepts:
“Setting the number of CPU VPs” on page 3-5
Use the onstat -g glo command to display information about each virtual processor
that is running and to display cumulative statistics for each virtual-processor class.
Use the onstat -g rea command to determine whether you need to increase the
number of virtual processors.
Related concepts:
“Monitor virtual processors with the onstat -g rea command”
Related reference:
onstat -g glo command: Print global multithreading information
(Administrator's Reference)
Use the onstat -g rea command to monitor the number of threads in the ready
queue.
If the number of threads in the ready queue is growing for a class of virtual
processors (for example, the CPU class), you might have to add more of those
virtual processors to your configuration.
Ready threads:
tid tcb rstcb prty status vp-class name
Related concepts:
“Monitor virtual processors with the onstat -g glo command” on page 3-21
Related reference:
onstat -g rea command: Print ready threads (Administrator's Reference)
Use the onstat -g ioq command to determine whether you need to allocate
additional AIO virtual processors.
The onstat -g ioq command displays the length of the I/O queues under the
column len, as the figure below shows. You can also see the maximum queue length.
onstat -g ioq
onstat -d
Dbspaces
address number flags fchunk nchunks flags owner name
a1de1d8 1 1 1 1 N informix rootdbs
a1df550 2 1 2 1 N informix space1
2 active, 32,678 maximum
Chunks
address chk/dbs offset size free bpages flags pathname
a1de320 1 1 0 75000 66447 PO- /ix/root_chunk
a1df698 2 2 0 500 447 PO- /ix//chunk1
2 active, 32,678 maximum
Each chunk serviced by the AIO virtual processors has one line in the onstat -g ioq
output, identified by the value gfd in the q name column. You can correlate the
line in onstat -g ioq with the actual chunk because the chunks are in the same
order as in the onstat -d output. For example, in the onstat -g ioq output, there are
two gfd queues. The first gfd queue holds requests for root_chunk because it
corresponds to the first chunk shown in the onstat -d output. Likewise, the second
gfd queue holds requests for chunk1 because it corresponds to the second chunk
in the onstat -d output.
If the database server has a mixture of raw devices and cooked files, the gfd
queues correspond only to the cooked files in onstat -d output.
Related reference:
onstat -g ioq command: Print I/O queue information (Administrator's
Reference)
You must connect to the sysmaster database to query the SMI tables. Query the
sysvpprof SMI table to obtain information about the virtual processors that are
currently running. This table contains the following columns.
Column Description
vpid ID number of the virtual processor
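For example, the following minimal query displays the current virtual processors;
the wildcard avoids assuming any column names beyond vpid:
DATABASE sysmaster;
SELECT * FROM sysvpprof;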
All memory allocations that are requested by threads in the database server are
fulfilled by memory pools. When a memory pool has insufficient memory blocks to
satisfy a memory allocation request, blocks are allocated from the global memory
pool. Because all threads use the same global memory pool, contention can occur.
Private memory caches allow each CPU virtual processor to retain its own set of
memory blocks that can be used to bypass the global memory pool. The initial
allocation for private memory caches is from the global memory pool. When the
blocks are freed, they are freed to the private memory cache of a specific CPU
virtual processor. When a memory allocation is requested, the thread first checks
whether the allocation can be satisfied by blocks in the private memory cache.
Otherwise, the thread requests memory from the global memory pool.
To determine whether private memory caches might improve performance for your
database server, run the onstat -g spi command and look for the sh_lock mutex. If
the values from the onstat -g spi command output show contention for the
sh_lock mutex, try creating private memory caches.
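To create private memory caches, set the VP_MEMORY_CACHE_KB configuration
parameter to a nonzero value, which specifies the size of each CPU VP cache in
kilobytes; for example (the size shown is illustrative):
VP_MEMORY_CACHE_KB 800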
You can view statistics about CPU VP private memory caches by running the
onstat -g vpcache command. You can view statistics about memory pools by
running the onstat -g mem command.
Attention: If you have multiple CPU VPs, private memory caches can increase
the amount of memory that the database server uses.
Related reference:
VP_MEMORY_CACHE_KB configuration parameter (Administrator's
Reference)
onstat -g vpcache command: Print CPU VP private memory cache statistics
(Administrator's Reference)
onstat -g mem command: Print pool memory statistics (Administrator's
Reference)
onstat -g spi command: Print spin locks with long spins (Administrator's
Reference)
The following topics describe ways that you might be able to reduce the system
CPU time required to open and close connections.
When a nonthreaded client uses a multiplexed connection, the database server still
creates the same number of user sessions and user threads as with a
nonmultiplexed connection. However, the number of network connections
decreases when you use multiplexed connections. Instead, the database server uses
a multiplex listener thread to allow the multiple database connections to share the
same network connection.
To improve response time for nonthreaded clients, you can use multiplexed
connections to execute SQL queries. The amount of performance improvement
depends on the following factors:
v The decrease in total number of network connections and the resulting decrease
in system CPU time
The usual cause for a large amount of system CPU time is the processing of
system calls for the network connection. Therefore, the maximum decrease in
system CPU time is proportional to the decrease in the total number of network
connections.
v The ratio of this decrease in system CPU time to the user CPU time
If the queries are simple and use little user CPU time, you might experience a
sizable reduction in response time when you use a multiplexed connection. But
if the queries are complex and use a large amount of user CPU time, you might
not experience a performance improvement.
To get an idea of the amounts of system CPU time and user CPU times per
virtual processor, use the onstat -g glo option.
To use multiplexed connections for a nonthreaded client application, you must take
the following steps before you bring up the database server:
1. Define an alias using the DBSERVERALIASES configuration parameter. For
example, specify:
DBSERVERALIASES ids_mux
2. Add an SQLHOSTS entry for the alias using onsqlmux as the nettype entry, which
is the second column in the SQLHOSTS file. For example, specify:
ids_mux onsqlmux ......
The other fields in this entry, the hostname and servicename, must be present,
but they are ignored.
To monitor Informix MaxConnect, use the onstat -g imc command on the database
server computer and use the imcadmin command on the computer where Informix
MaxConnect is located.
Important: Informix MaxConnect and the IBM Informix MaxConnect User's Guide
ship separately from IBM Informix.
You can change the settings of the Informix configuration parameters that directly
affect memory utilization, and you can adjust the settings for different types of
workloads.
Consider the amount of physical memory that is available on your host when you
allocate shared memory for the database server by setting operating-system
configuration parameters. In general, if you increase space for database server
shared memory, you can enhance the performance of your database server. You
must balance the amount of shared memory that is dedicated to the database
server against the memory requirements for VPs and other processes.
Related concepts:
“The Memory Grant Manager” on page 12-6
Shared memory
You must configure adequate shared-memory resources for the database server in
your operating system. Insufficient shared memory can adversely affect
performance.
The database server threads and processes require shared memory to share data by
sharing access to segments of memory.
The shared memory that Informix uses can be divided into the following parts,
each of which has one or more shared memory segments:
v Resident portion
v Virtual portion
v Message portion
The resident and message portions are static; you must allocate sufficient memory
for them before you bring the database server into online mode. (Typically, you
must reboot the operating system to reconfigure shared memory.) The virtual
portion of shared memory for the database server grows dynamically, but you
must still include an adequate initial amount for this portion in your allocation of
operating-system shared memory.
The amount of space that is required is the total that all portions of database server
shared memory need. You specify the total amount of shared memory with the
SHMTOTAL configuration parameter.
The LOCKS configuration parameter specifies the initial size of the lock table. If
the number of locks that sessions allocate exceeds the value of LOCKS, the
database server dynamically increases the size of the lock table. If you expect the
lock table to grow dynamically, set SHMTOTAL to 0. When SHMTOTAL is 0, there
is no limit on total memory (including shared memory) allocation.
Related reference:
LOCKS configuration parameter (Administrator's Reference)
The settings that you use for the LOCKS, LOGBUFF, and PHYSBUFF configuration
parameters help determine the size of the resident portion.
In addition to these configuration parameters, which affect the size of the resident
portion, the RESIDENT configuration parameter can affect memory use. When a
computer supports forced residency and the RESIDENT configuration parameter is
set to a value that locks the resident or resident and virtual portions, the resident
portion is never paged out.
The machine notes file for your database server indicates whether your operating
system supports forced residency.
On AIX, Solaris, and Linux systems that support large pages, the
IFX_LARGE_PAGES environment variable can enable the use of large pages for
non-message shared memory segments that are locked in physical memory. If large
pages are configured by operating system commands and the RESIDENT
configuration parameter specifies that some or all of the resident and virtual
portions of shared memory are locked in physical memory, Informix uses large
pages for the corresponding shared memory segments, provided sufficient large
pages are available. The use of large pages can offer significant performance
benefits in large memory configurations.
Related reference:
“Configuration parameters that affect memory utilization” on page 4-8
IFX_LARGE_PAGES environment variable (SQL Reference)
The virtual portion of shared memory for the database server includes the
following components:
v Large buffers, which are used for large read and write I/O operations
v Sort-space pools
v Active thread-control blocks, stacks, and heaps
v User-session data
v Caches for SQL statements, data-dictionary information, and user-defined
routines
v A global pool for network-interface message buffers and other information
The size of the virtual portion depends primarily on the types of applications and
queries that you are running. Depending on your application, an initial estimate
for the virtual portion might be as low as 100 KB per user or as high as 500 KB per
user, plus an additional 4 megabytes if you intend to use data distributions.
Related tasks:
“Creating data distributions” on page 13-13
Related reference:
“Configuration parameters that affect memory utilization” on page 4-8
IFX_LARGE_PAGES environment variable (SQL Reference)
EXTSHMADD configuration parameter (Administrator's Reference)
SHMADD configuration parameter (Administrator's Reference)
SHMTOTAL configuration parameter (Administrator's Reference)
SHMVIRTSIZE configuration parameter (Administrator's Reference)
If a particular interface is not used, you do not need to include space for it when
you allocate shared memory in the operating system.
The formula for estimating an initial size of the virtual portion of shared memory
is as follows:
shmvirtsize = fixed overhead + shared structures +
(mncs * private structures) +
other buffers
In this formula, mncs is the maximum number of concurrent sessions.
Tip: When the database server is running with a stable workload, you can use
onstat -g seg to obtain a precise value for the actual size of the virtual portion of
shared memory. You can then use the value for shared memory that this command
reports to reconfigure SHMVIRTSIZE.
To specify the size of segments that are added later to the virtual shared memory,
set the SHMADD configuration parameter. Use the EXTSHMADD configuration
parameter to specify the size of virtual-extension segments that are added for
user-defined routines and DataBlade routines.
The following table contains a list of additional topics for estimating the size of
shared structures in memory.
Table 4-1. Information for shared-memory structures
Shared-Memory Structure   More Information
Sort memory               “Estimating memory needed for sorting” on page 7-19
Data-dictionary cache     “Data-dictionary configuration” on page 4-22
Related concepts:
“Session memory” on page 4-38
Related reference:
SHMVIRTSIZE configuration parameter (Administrator's Reference)
NETTYPE configuration parameter (Administrator's Reference)
onstat -g mem command: Print pool memory statistics (Administrator's
Reference)
Estimate the size of the message portion of shared memory, using the following
formula:
msegsize = (10,531 * ipcshm_conn + 50,000)/1024
ipcshm_conn
is the number of connections that can be made using the shared-memory
interface, as determined by the NETTYPE parameter for the ipcshm
protocol.
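For example, if the NETTYPE entry for ipcshm allows 100 shared-memory
connections (an illustrative value):
msegsize = (10,531 * 100 + 50,000)/1024 = approximately 1077 KB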
Related reference:
NETTYPE configuration parameter (Administrator's Reference)
For additional tips on configuring shared memory in the operating system, see the
machine notes file for UNIX or the release notes file for Windows.
Related concepts:
“The SHMADD and EXTSHMADD configuration parameters and memory
utilization” on page 4-17
The database server does not automatically free the shared-memory segments that
it adds during its operations. After memory has been allocated to the database
server virtual portion, the memory remains unavailable for use by other processes
running on the host computer. When the database server runs a large
decision-support query, it might acquire a large amount of shared memory. After
the query completes, the database server no longer requires that shared memory.
However, the shared memory that the database server allocated to service the
query remains assigned to the virtual portion even though it is no longer needed.
The onmode -F command locates and returns unused 8-kilobyte blocks of shared
memory that the database server still holds. Although this command runs only
briefly (one or two seconds), onmode -F dramatically inhibits user activity while it
runs. Systems with multiple CPUs and CPU VPs typically experience less
degradation while this utility runs.
The SHMBASE parameter indicates the starting address for database server shared
memory. When set according to the instructions in the machine notes file or release
notes file, this parameter has no appreciable effect on performance. For the path
name of each file, see the Introduction to this guide.
Table 4-2 lists suggested settings for these parameters or guidelines for setting the
parameters.
For information about estimating the size of the resident portion of shared
memory, see “Estimating the size of the resident portion of shared memory” on
page 4-3. This calculation includes figuring the size of the buffer pool, logical-log
buffer, physical-log buffer, and lock table.
Table 4-2. Guidelines for OLTP and DSS applications

BUFFERPOOL
OLTP: The percentage of physical memory that you need for buffer space depends
on the amount of memory that is available on your system and the amount of
memory that is used for other applications.
DSS: Set to a small buffer value and increase the DS_TOTAL_MEMORY value for
light scans, queries, and sorts. For operations such as index builds that read data
through the buffer pool, configure a larger number of buffers.

DS_TOTAL_MEMORY
OLTP: Set to a value from 20 to 50 percent of the value of SHMTOTAL, in kilobytes.
DSS: Set to a value from 50 to 90 percent of SHMTOTAL.

LOGBUFF
OLTP: The default value for the logical-log buffer size is 64 KB. If you decide to
use a smaller value, the database server generates a message that indicates that
optimal performance might not be obtained. Using a logical-log buffer smaller
than 64 KB impacts performance, not transaction integrity.
DSS: Because database or table logging is usually turned off for DSS applications,
you can set LOGBUFF to 32 KB.

PHYSBUFF
If the RTO_SERVER_RESTART configuration parameter is enabled, use the
512-kilobyte default value for PHYSBUFF.
Related reference:
BUFFERPOOL configuration parameter (Administrator's Reference)
DS_TOTAL_MEMORY configuration parameter (Administrator's Reference)
LOGBUFF configuration parameter (Administrator's Reference)
PHYSBUFF configuration parameter (Administrator's Reference)
RTO_SERVER_RESTART configuration parameter (Administrator's Reference)
You can have multiple buffer pools if you have dbspaces that use different page
sizes. The onconfig configuration file contains a BUFFERPOOL line for each page
size. For example, on a computer with a 2 KB page size, the onconfig file can
contain up to nine lines, including the default specification. When you create a
dbspace with a different page size, a buffer pool for that page size is created
automatically, if it does not exist. A BUFFERPOOL entry for the page size is added
to the onconfig file. The values of the BUFFERPOOL configuration parameter
fields are the same as the default specification.
Increasing the number of buffers increases the likelihood that a needed data page
might already be in memory as the result of a previous request. However,
allocating too many buffers can affect the memory-management system and lead
to excess operating-system paging activity.
The size of the buffer pool has a significant effect on database I/O and transaction
throughput.
The size of the buffer pool is equal to the number of buffers multiplied by the page
size. The percentage of physical memory that you need for buffer space depends
on the amount of memory that you have available on your system and the amount
that is used for other applications. For systems with a large amount of available
physical memory (4 GB or more), buffer space might be as much as 90 percent of
physical memory. For systems with smaller amounts of available physical memory,
buffer space might range from 20 - 25 percent of physical memory.
For example, suppose that your system has a page size of 2 KB and 100 MB of
physical memory. You can set the value in the buffers field to 10,000 - 12,500,
which allocates 20 - 25 MB of memory.
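In the onconfig file, that allocation might look like the following BUFFERPOOL
entry (the LRU settings shown are illustrative):
BUFFERPOOL size=2K,buffers=12500,lrus=8,lru_min_dirty=50.00,lru_max_dirty=60.00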
Calculate all other shared-memory parameters after you specify the size of the
buffer pool.
Note: If you use non-default page sizes, you might need to increase the size of
your physical log. If you frequently update non-default pages, you might need a
150 - 200 percent increase of the physical log size. Some experimentation might be
needed to tune the physical log. You can adjust the size of the physical log as
necessary according to how frequently the filling of the physical log triggers
checkpoints.
You can use onstat -g buf to monitor buffer pool statistics, including the
read-cache rate of the buffer pool. This rate represents the percentage of database
pages that are already present in a shared-memory buffer when a query requests a
page. (If a page is not already present, the database server must copy it into
memory from disk.) If the database server finds the page in the buffer pool, it
spends less time on disk I/O. Therefore, you want a high read-cache rate for good
performance. For OLTP applications where many users read small sets of data, the
goal is to achieve a read cache rate of 95 percent or better.
If the read-cache rate is low, you can repeatedly increase buffers and restart the
database server. As you increase the BUFFERPOOL value of buffers, you reach a
point at which increasing the value no longer produces significant gains in the
read-cache rate, or you reach the upper limit of your operating-system
shared-memory allocation.
Depending upon your situation, you can take one of the following actions to
achieve better performance for applications that use smart large objects:
v If your applications frequently access smart large objects that are 2 KB or 4 KB
in size, use the buffer pool to keep them in memory longer by increasing the
value of the buffers field.
The database server can adjust the size of the quantum dynamically when it grants
memory. To allow for more simultaneous queries with smaller quanta each,
increase the value of the DS_MAX_QUERIES configuration parameter.
Related concepts:
The database server derives a value for DS_TOTAL_MEMORY if you do not set
the DS_TOTAL_MEMORY configuration parameter or if you set this configuration
parameter to an inappropriate value.
Whenever the database server changes the value that you assigned to
DS_TOTAL_MEMORY, it sends the following message to your console:
DS_TOTAL_MEMORY recalculated and changed from old_value Kb
to new_value Kb
When you receive the preceding message, you can use the algorithm to investigate
what values the database server considers inappropriate. You can then take
corrective action based on your investigation.
The following sections document the algorithm that the database server uses to
derive the new value for DS_TOTAL_MEMORY.
In the first part of the algorithm that the database server uses to derive the new
value for the DS_TOTAL_MEMORY configuration parameter, the database server
establishes a minimum amount for decision-support memory.
In the second part of the algorithm that the database server uses to derive the new
value for the DS_TOTAL_MEMORY configuration parameter, the database server
establishes a working value for the amount of decision-support memory.
The database server verifies this amount in the third and final part of the
algorithm.
When SHMTOTAL is set, the database server uses the following formula to
calculate the amount of decision-support memory:
IF DS_TOTAL_MEMORY <= SHMTOTAL - nondecision_support_memory THEN
decision_support_memory = DS_TOTAL_MEMORY
ELSE
decision_support_memory = SHMTOTAL -
nondecision_support_memory
When SHMTOTAL is not set, the database server sets decision-support memory
equal to the value that you specified in DS_TOTAL_MEMORY.
Related reference:
DS_TOTAL_MEMORY configuration parameter (Administrator's Reference)
When SHMTOTAL is set, the database server uses the following formula to
calculate the amount of decision-support memory:
decision_support_memory = SHMTOTAL -
nondecision_support_memory
When the database server finds that you did not set SHMTOTAL, it sets
decision-support memory as in the following example:
decision_support_memory = min_ds_total_memory
In the final part of the algorithm that the database server uses to derive the new
value for the DS_TOTAL_MEMORY configuration parameter, the database server
verifies that the amount of shared memory is greater than min_ds_total_memory and
less than the maximum possible memory space for your computer.
When the database server finds that the derived value for decision-support
memory is less than the value of the min_ds_total_memory variable, it sets
decision-support memory equal to the value of min_ds_total_memory.
When the database server finds that the derived value for decision-support
memory is greater than the maximum possible memory space for your computer, it
sets decision-support memory equal to the maximum possible memory space.
If you log smart large objects, increase the size of the logical-log buffers to prevent
frequent flushing to the logical-log file on disk.
Related reference:
LOGBUFF configuration parameter (Administrator's Reference)
Choose a value for PHYSBUFF that is an even increment of the system page size.
Related reference:
PHYSBUFF configuration parameter (Administrator's Reference)
If the number of locks needed by sessions exceeds the value set in the LOCKS
configuration parameter, the database server attempts to increase the lock table by
doubling its size. Each time that the lock table overflows (when the number of
locks needed is greater than the current size of the lock table), the database server
increases the size of the lock table, up to 99 times. Each time that the database
server increases the size of the lock table, the server attempts to double its size.
However, the server limits each actual increase to no more than the maximum
number of added locks shown in Table 4-3 on page 4-16. After the 99th time that
the database server increases the size of the lock table, the server stops extending
the lock table, and sessions that request additional locks receive an error.
The following table shows the maximum number of locks allowed on 32-bit and
64-bit platforms.
Table 4-3. Maximum number of locks on 32-bit and 64-bit platforms
Platform   Maximum Number     Maximum Number of      Maximum Number of Locks   Maximum Number of
           of Initial Locks   Dynamic Lock Table     Added Per Lock Table      Locks Allowed
                              Extensions             Extension
32-bit     8,000,000          99                     100,000                   8,000,000 + (99 x 100,000)
64-bit     500,000,000        99                     1,000,000                 500,000,000 + (99 x 1,000,000)
To estimate a different value for the LOCKS configuration parameter, estimate the
maximum number of locks that a query needs and multiply this estimate by the
number of concurrent users. You can use the guidelines in the following table to
estimate the number of locks that a query needs.
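For example, if you estimate that a typical query needs at most 2,000 locks and
you expect 100 concurrent users (both values illustrative), set LOCKS to at least:
LOCKS = 2,000 * 100 = 200,000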
The resident portion in the database server contains the buffer pools that are used
for database read and write activity. Performance improves when these buffers
remain in physical memory.
You should set the RESIDENT parameter to 1. If forced residency is not an option
on your computer, the database server issues an error message and ignores this
configuration parameter.
On machines that support 64-bit addressing, you can have a very large buffer pool
and the virtual portion of database server shared memory can also be very large.
The virtual portion contains various memory caches that improve performance of
multiple queries that access the same tables (see “Configure and monitor memory
caches” on page 4-21). To make the virtual portion resident in physical memory in
addition to the resident portion, set the RESIDENT parameter to -1.
If your buffer pool is very large, but your physical memory is not very large, you
can set RESIDENT to a value greater than 1 to indicate the number of memory
segments to stay in physical memory. This specification makes only a subset of the
buffer pool resident.
You can turn residency on or off for the resident portion of shared memory in the
following ways:
v Use the onmode utility to temporarily reverse the state of shared-memory
residency while the database server is online (see the example after this list).
v Change the RESIDENT parameter to turn shared-memory residency on or off the
next time that you start database server shared memory.
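For example, assuming forced residency is currently on, the online approach uses
the following commands:
onmode -n (turns off forced residency until you turn it back on or restart)
onmode -r (turns forced residency back on)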
Related reference:
RESIDENT configuration parameter (Administrator's Reference)
Adding shared memory uses CPU cycles. The larger each increment, the fewer
increments are required, but less memory is available for other processes. Adding
large increments is generally preferred; but when memory is heavily loaded (the
scan rate or paging-out rate is high), smaller increments allow better sharing of
memory resources among competing programs.
The range of values for SHMADD is 1024 through 4294967296 KB for a 64-bit
operating system and 1024 through 524288 KB for a 32-bit operating system. The
following table contains recommendations for setting SHMADD according to the
size of physical memory.
The range of values for EXTSHMADD is the same as the range of values of
SHMADD.
You can usually set the SHMTOTAL configuration parameter to 0, except in the
following cases:
v You must limit the amount of virtual memory that the database server uses for
other applications or other reasons.
v Your operating system runs out of swap space and performs abnormally. In this
case, you can set SHMTOTAL to a value that is a few megabytes less than the
total swap space that is available on your computer.
v You are using automatic low memory management.
Related reference:
SHMTOTAL configuration parameter (Administrator's Reference)
Although the database server adds increments of shared memory to the virtual
portion as needed to process large queries or peak loads, allocation of shared
memory increases time for transaction processing. Therefore, you should set
SHMVIRTSIZE to provide a virtual portion large enough to cover your normal
daily operating requirements. The value of SHMVIRTSIZE can be as large as the
SHMMAX configuration parameter allows.
For an initial setting, it is suggested that you use the larger of the following values:
v 8000
v connections * 350
The connections variable is the number of connections for all network types that are
specified in the sqlhosts information by one or more NETTYPE configuration
parameters. (The database server uses connections * 200 by default.)
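For example, if the sqlhosts information specifies a total of 500 connections (an
illustrative number), connections * 350 = 175,000, which exceeds 8000, so a
reasonable initial setting is:
SHMVIRTSIZE 175000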
Once system utilization reaches a stable workload, you can reconfigure a new
value for SHMVIRTSIZE. As noted in “Freeing shared memory with onmode -F”
on page 4-7, you can instruct the database server to release shared-memory
segments that are no longer in use after a peak workload or large query.
Related reference:
SHMVIRTSIZE configuration parameter (Administrator's Reference)
Example 1:
SHMVIRT_ALLOCSEG 3000, 4
This specifies that if the database server has 3000 kilobytes remaining in virtual
memory and additional kilobytes of memory cannot be allocated, the server raises
an alarm level of 4.
Example 2:
SHMVIRT_ALLOCSEG .8, 4
This specifies that if the database server has twenty percent remaining in virtual
memory and additional kilobytes of memory cannot be allocated, the server raises
an alarm level of 4.
Related reference:
Event Alarm Parameters (Administrator's Reference)
SHMVIRT_ALLOCSEG configuration parameter (Administrator's Reference)
To reduce the amount of shared memory that the database server adds
dynamically, estimate the amount of the stack space required for the average
number of threads that your system runs and include that amount in the value
that you set for the SHMVIRTSIZE configuration parameter.
To estimate the amount of stack space that you require, use the following formula:
stacktotal = STACKSIZE * avg_no_of_threads
avg_no_of_threads
is the average number of threads. You can monitor the number of active
threads at regular intervals to determine this amount. Use onstat -g sts to
check the stack use of threads. A general estimate is between 60 and 70
percent of the total number of connections (specified in the NETTYPE
parameters in your ONCONFIG file), depending on your workload.
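For example, assuming a STACKSIZE of 64 KB and an average of 200 active
threads (both values illustrative):
stacktotal = 64 * 200 = 12,800 KB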
The database server also executes user-defined routines (UDRs) with user threads
that use this stack. Programmers who write user-defined routines should take the
following measures to avoid stack overflow:
v Do not use large automatic arrays.
v Avoid excessively deep calling sequences.
v For C UDRs only: Use mi_call to manage recursive calls.
If you cannot avoid stack overflow with these measures, use the STACK modifier
of the CREATE FUNCTION statement to increase the stack for a particular routine.
Related reference:
STACKSIZE configuration parameter (Administrator's Reference)
The following table lists the main memory caches that have the greatest effect on
performance and how to configure and monitor those caches.

Table 4-4. Main memory caches

Data dictionary (monitor with onstat -g dic)
Stores information about the table definition (such as column names and data
types). Configuration parameters:
DD_HASHSIZE: The maximum number of buckets in the cache.
DD_HASHMAX: The number of tables in each bucket.

Data distribution (monitor with onstat -g dsc)
Stores distribution statistics for a column. Configuration parameter:
DS_POOLSIZE: The maximum number of entries in the cache.

SQL statement cache (monitor with onstat -g ssc)
Stores parsed and optimized SQL statements. Configuration parameters:
STMT_CACHE_NOLIMIT: Prohibit entries into the SQL statement cache when
allocated memory exceeds the value of the STMT_CACHE_SIZE configuration
parameter.
STMT_CACHE_NUMPOOL: The number of memory pools for the SQL statement
cache.
Related concepts:
“SPL routine executable format stored in UDR cache” on page 10-33
Related reference:
onstat -g cac command: Print information about caches (Administrator's
Reference)
Data-dictionary cache
The first time that the database server accesses a table, it retrieves the information
that it needs about the table (such as the column names and data types) from the
system catalog tables on disk. After the database server has accessed the table, it
places that information in the data-dictionary cache in shared memory.
Figure 4-1 shows how the database server uses this cache for multiple users. User 1
accesses the column information for tabid 120 for the first time. The database
server puts the column information in the data-dictionary cache. When user 2, user
3 and user 4 access the same table, the database server does not have to read from
disk to access the data-dictionary information for the table. Instead, it reads the
dictionary information from the data-dictionary cache in memory.
The database server still places pages for system catalog tables in the buffer pool,
as it does all other data and index pages. However, the data-dictionary cache offers
an additional performance advantage, because the data-dictionary information is
organized in a more efficient format that allows fast retrieval.
Data-dictionary configuration
The database server uses a hashing algorithm to store and locate information
within the data-dictionary cache. The DD_HASHSIZE and DD_HASHMAX
configuration parameters control the size of the data-dictionary cache.
For medium to large systems, you can start with the following values for these
configuration parameters:
v DD_HASHSIZE: 503
v DD_HASHMAX: 4
If the bucket reaches the maximum size, the database server uses a least recently
used mechanism to clear entries from the data dictionary.
Related reference:
DD_HASHSIZE configuration parameter (Administrator's Reference)
DD_HASHMAX configuration parameter (Administrator's Reference)
Data-distribution cache
The query optimizer uses distribution statistics generated by the UPDATE
STATISTICS statement in the MEDIUM or HIGH mode to determine the query
plan with the lowest cost. The first time that the optimizer accesses the distribution
statistics for a column, the database server retrieves the statistics from the
sysdistrib system catalog table on disk and places that information in the
data-distribution cache in memory.
Figure 4-2 shows how the database server accesses the data-distribution cache for
multiple users. When the optimizer accesses the column distribution statistics for
User 1 for the first time, the database server puts the distribution statistics in the
data-distribution cache. When the optimizer determines the query plan for user 2,
user 3 and user 4 who access the same column, the database server does not have
to read from disk to access the data-distribution information for the table. Instead,
it reads the distribution statistics from the data-distribution cache in shared
memory.
The database server initially places pages for the sysdistrib system catalog table in
the buffer pool as it does all other data and index pages. However, the
data-distribution cache offers additional performance advantages. It:
v Is organized in a more efficient format
v Is organized to allow fast retrieval
v Bypasses the overhead of the buffer pool management
v Frees more pages in the buffer pool for actual data pages rather than system
catalog pages
v Reduces I/O operations to the system catalog table
Data-distribution configuration
The database server uses a hashing algorithm to store and locate information
within the data-distribution cache. The DS_POOLSIZE configuration parameter
controls the maximum number of entries in the cache, and the DS_HASHSIZE
configuration parameter controls the number of hash buckets.
The following formula determines the number of column distributions that can be
stored in one bucket:
Distributions_per_bucket = DS_POOLSIZE / DS_HASHSIZE
For example, with the default values of 127 for DS_POOLSIZE and 31 for
DS_HASHSIZE, you can potentially store distributions for about 127 columns in
the data-distribution cache. The cache has 31 hash buckets, and each hash bucket
can have an average of 4 entries.
The values that you set for DS_HASHSIZE and DS_POOLSIZE depend on the
following factors:
v The number of columns for which you run the UPDATE STATISTICS statement
in HIGH or MEDIUM mode and that you expect to be used most often in
frequently run queries.
If you do not specify columns when you run UPDATE STATISTICS for a table,
the database server generates distributions for all columns in the table.
If you do not specify columns when you run UPDATE STATISTICS for a table,
the database server generates distributions for all columns in the table.
You can use the values of DD_HASHSIZE and DD_HASHMAX as guidelines for
DS_HASHSIZE and DS_POOLSIZE. The DD_HASHSIZE and DD_HASHMAX
configuration parameters specify the size of the data-dictionary cache, which stores
information and statistics about tables that queries access.
For medium to large systems, you can start with the following values:
– DD_HASHSIZE 503
– DD_HASHMAX 4
– DS_HASHSIZE 503
– DS_POOLSIZE 2000
Monitor these caches by running the onstat -g dic and onstat -g dsc commands to
see the actual usage, and adjust these parameters accordingly.
v The amount of memory available
The amount of memory that is required to store distributions for a column
depends on the level at which you run UPDATE STATISTICS. Distributions for a
single column might require between 1 KB and 2 MB, depending on whether
you specify medium or high mode or enter a finer resolution percentage when
you run UPDATE STATISTICS.
If the size of the data-distribution cache is too small, the following performance
problems can occur:
v The database server uses the DS_POOLSIZE value to determine when to remove
entries from the data-distribution cache. However, if the optimizer needs the
dropped distributions for another query, the database server must reaccess them
from the sysdistrib system catalog table on disk. The additional I/O and buffer
pool operations to access sysdistrib on disk add to the total response time of
the query.
The database server tries to maintain the number of entries in data-distribution
cache at the DS_POOLSIZE value. If the total number of entries reaches within
an internal threshold of DS_POOLSIZE, the database server uses a least recently
used mechanism to remove entries from the data-distribution cache. The number
of entries in a hash bucket can go past this DS_POOLSIZE value, but the
database server eventually reduces the number of entries when memory
requirements drop.
v If DS_HASHSIZE is small and DS_POOLSIZE is large, overflow lists can be long
and require more search time in the cache.
Overflow occurs when a hash bucket already contains an entry. When multiple
distributions hash to the same bucket, the database server maintains an overflow
list to store and retrieve the distributions after the first one.
If DS_HASHSIZE and DS_POOLSIZE are approximately the same size, the
overflow lists might be smaller or even nonexistent, but some memory might be
wasted on empty buckets. However, the amount of unused memory is insignificant
overall.
You might want to change the values of the DS_HASHSIZE and DS_POOLSIZE
configuration parameters in the following situations:
v If the data-distribution cache is full most of the time and commonly used
columns are not listed in the distribution name field, try increasing the values
of the DS_HASHSIZE and DS_POOLSIZE configuration parameters.
v If the total number of entries is much lower than the value of the DS_POOLSIZE
configuration parameter, you can reduce the values of the DS_HASHSIZE and
DS_POOLSIZE configuration parameters.
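As a starting point, the following ONCONFIG fragment is a sketch that applies
the medium-to-large starting values suggested earlier in this section; edits to the
ONCONFIG file take effect after you restart the database server:
# Data-dictionary cache
DD_HASHSIZE 503
DD_HASHMAX 4
# Data-distribution cache
DS_HASHSIZE 503
DS_POOLSIZE 2000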
Related reference:
DD_HASHSIZE configuration parameter (Administrator's Reference)
DD_HASHMAX configuration parameter (Administrator's Reference)
DS_POOLSIZE configuration parameter (Administrator's Reference)
onstat -g dsc command: Print distribution cache information (Administrator's
Reference)
For more information about the effect of the SQL statement cache on the
performance of individual queries, see “Optimize queries with the SQL statement
cache” on page 13-40.
Figure 4-3 on page 4-26 shows how the database server accesses the SQL statement
cache for multiple users.
v When the database server runs an SQL statement for User 1 for the first time,
the database server checks whether the same exact SQL statement is in the SQL
statement cache. If it is not in the cache, the database server parses the
statement, determines the optimal query plan, and runs the statement.
v When User 2 runs the same SQL statement, the database server finds the
statement in the SQL statement cache and does not optimize the statement.
Figure 4-3. Database server actions when using the SQL statement cache
If a session prepares a statement and then executes it many times, the SQL
statement cache does not affect performance, because the statement is optimized
just once during the PREPARE statement.
However, if other sessions also prepare that same statement, or if the first session
prepares the statement several times, the statement cache usually provides a direct
performance benefit, because the database server only calculates the query plan
once. Of course, the original session might gain a small benefit from the statement
cache, even if it only prepares the statement once, because other sessions use less
memory, and the database server does less work for the other sessions.
For more information about how the value of the STMT_CACHE configuration
parameter enables the SQL statement cache, see “Enabling the SQL statement
cache” on page 13-42.
Figure 4-4 shows how the database server uses the values of the pertinent
configuration parameters for the SQL statement cache. Further explanation follows
the figure.
Figure 4-4. Configuration parameters that affect the SQL statement cache
When the database server uses the SQL statement cache for a user, it means the
database server takes the following actions:
v Checks the SQL statement cache first for a match of the SQL statement that the
user is executing
v If the SQL statement matches an entry, executes the statement using the query
memory structures in the SQL statement cache (User 2 in Figure 4-4)
v If the SQL statement does not match an entry, checks whether the statement
qualifies for the cache
For information about what qualifies an SQL statement for the cache, see SQL
statement cache qualifying criteria.
The following parameters affect whether or not the database server inserts the SQL
statement into the cache (User 1 in Figure 4-4 on page 4-27):
v STMT_CACHE_HITS specifies the number of times the statement executes with
an entry in the cache (referred to as hit count). The database server inserts one of
the following entries, depending on the hit count:
– If the value of STMT_CACHE_HITS is 0, inserts a fully cached entry, which
contains the text of the SQL statement plus the query memory structures
– If the value of STMT_CACHE_HITS is not 0 and the statement does not exist
in the cache, inserts a key-only entry that contains the text of the SQL
statement. Subsequent executions of the SQL statement increment the hit
count.
– If the value of STMT_CACHE_HITS is equal to the number of hits for a
key-only entry, adds the query memory structures to make a fully cached
entry.
v STMT_CACHE_SIZE specifies the size of the SQL statement cache, and
STMT_CACHE_NOLIMIT specifies whether or not to limit the memory of the
cache to the value of STMT_CACHE_SIZE. If you do not specify the
STMT_CACHE_SIZE parameter, it defaults to 524288 (512 * 1024) bytes.
The default value for STMT_CACHE_NOLIMIT is 1, which means the database
server will insert entries into the SQL statement cache even though the total
amount of memory might exceed the value of STMT_CACHE_SIZE.
When STMT_CACHE_NOLIMIT is set to 0, the database server inserts the SQL
statement into the cache if the current size of the cache will not exceed the
memory limit.
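The following ONCONFIG sketch ties these parameters together with the
STMT_CACHE parameter mentioned earlier, which controls whether the cache is
enabled at all (the values shown reflect the defaults discussed above; a setting of
STMT_CACHE 1 enables the cache for sessions that request it):
# Enable the SQL statement cache; sessions opt in
STMT_CACHE 1
# Cache size in KB (512 KB = 524288 bytes)
STMT_CACHE_SIZE 512
# Hits required before a statement is fully cached
STMT_CACHE_HITS 1
# Enforce STMT_CACHE_SIZE as a hard limit
STMT_CACHE_NOLIMIT 0
A session can then turn the cache on for itself with the SQL statement SET
STATEMENT CACHE ON.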
Monitor the number of hits on the SQL statement cache to determine if your
workload is using this cache effectively. The following sections describe ways to
monitor the SQL statement cache hits.
To monitor the number of hits in the SQL statement cache, run the onstat -g ssc
command.
The onstat -g ssc command displays fully cached entries in the SQL statement
cache. Figure 4-5 shows sample output for onstat -g ssc.
onstat -g ssc
To monitor the number of times that the database server reads the SQL statement
within the cache, look at the following output columns:
v In the Statement Cache Summary portion of the onstat -g ssc output, the #hits
column shows the value of the STMT_CACHE_HITS configuration parameter.
In Figure 4-5, the #hits column in the Statement Cache Summary portion of the
output has a value of 0, which is the default value of the STMT_CACHE_HITS
configuration parameter.
For a complete description of the output fields that onstat -g ssc displays, see
“SQL statement cache information in onstat -g ssc output” on page 4-36.
To determine how many nonshared entries exist in the SQL statement cache, run
onstat -g ssc all.
The onstat -g ssc all option displays the key-only entries in addition to the fully
cached entries in the SQL statement cache.
You can use one of the following methods to change the STMT_CACHE_HITS
parameter value:
v Update the ONCONFIG file to specify the STMT_CACHE_HITS configuration
parameter. You must restart the database server for the new value to take effect.
You can use a text editor to edit the ONCONFIG file. Then bring down the
database server with the onmode -ky command and restart with the oninit
command.
v Increase the STMT_CACHE_HITS configuration parameter dynamically while
the database server is running:
You can use the following method to reset the STMT_CACHE_HITS value at run
time:
– Issue the onmode -W command. The following example specifies that a
statement must register two hits (that is, be executed twice after its key-only
entry is inserted) before it is fully cached:
onmode -W STMT_CACHE_HITS 2
You can set the size of the SQL statement cache in memory with the
STMT_CACHE_SIZE configuration parameter. The value of the parameter is the
size in kilobytes. If STMT_CACHE_SIZE is not set, the default value is 512
kilobytes.
The onstat -g ssc output shows the value of STMT_CACHE_SIZE in the maxsize
column. In Figure 4-5 on page 4-29, this maxsize column has a value of 524288,
which is the default value (512 * 1024 = 524288).
Use the onstat -g ssc and onstat -g ssc all options to monitor the effectiveness of
the size of the SQL statement cache. If you do not see cache entries for the SQL
statements that applications use most, the SQL statement cache might be too small
or too many unshared SQL statements might occupy the cache. The following sections
describe how to determine these situations.
Related reference:
STMT_CACHE_SIZE configuration parameter (Administrator's Reference)
You can analyze onstat -g ssc all output to determine if the SQL statement cache is
too small. If the size of the cache is too small, you can change it.
When the database server places many queries that are only used once in the
cache, they might replace statements that other applications use often. You can
view onstat -g ssc all output to determine if too many unshared SQL statements
occupy the cache. If so, you can prevent unshared SQL statements from being fully
cached.
Look at the values in the following output columns in the Statement Cache
Entries portion of the onstat -g ssc all output. If you see a lot of entries that have
both of the following values, too many unshared SQL statements occupy the cache:
v flags column value of F in the second position
A value of F in the second position indicates that the statement is currently fully
cached.
v hits column value of 0 or 1
The hits column shows the number of times the SQL statement has been
executed, excluding the first time it is inserted into the cache.
Use the onstat -g ssc option to monitor the current size of the SQL statement
cache. Look at the values in the following output columns of the onstat -g ssc
output:
v The currsize column shows the number of bytes currently allocated in the SQL
statement cache.
In Figure 4-5 on page 4-29, the currsize column has a value of 11264.
v The maxsize column shows the value of STMT_CACHE_SIZE.
In Figure 4-5 on page 4-29, the maxsize column has a value of 524288, which is
the default value (512 * 1024 = 524288).
When the SQL statement cache is full and users are currently executing all
statements within it, any new SQL statements that a user executes can cause the
SQL statement cache to grow beyond the size that STMT_CACHE_SIZE specifies.
When the database server is no longer using an SQL statement within the SQL
statement cache, it frees memory in the SQL statement cache until the size reaches
a threshold of STMT_CACHE_SIZE. However, if thousands of concurrent users are
executing several ad hoc queries, the SQL statement cache can grow very large
before any statements are removed. In such cases, take one of the following
actions:
v Set the STMT_CACHE_NOLIMIT configuration parameter to 0 to prevent
insertions when the cache size exceeds the value of the STMT_CACHE_SIZE
parameter.
v Set the STMT_CACHE_HITS parameter to a value greater than 0 to prevent
caching unshared SQL statements.
You can use one of the following methods to change the STMT_CACHE_NOLIMIT
configuration parameter value:
v Update the ONCONFIG file to specify the STMT_CACHE_NOLIMIT
configuration parameter. You must restart the database server for the new value
to take effect.
v Use the onmode -W command to override the STMT_CACHE_NOLIMIT
configuration parameter dynamically while the database server is running.
onmode -W STMT_CACHE_NOLIMIT 0
If you restart the database server, the value reverts to the value in the ONCONFIG
file. Therefore, if you want the setting to persist across subsequent restarts, modify
the ONCONFIG file.
By default, the database server allocates memory for the SQL statement cache
from a single pool. This one pool can become a bottleneck as the number of users
increases. The STMT_CACHE_NUMPOOL configuration parameter allows you to
configure multiple sscpools.
You can monitor the pools in the SQL statement cache to determine whether:
v The number of SQL statement cache pools is sufficient for your workload.
v The size or limit of the SQL statement cache is not causing excessive memory
management.
Related reference:
STMT_CACHE_NUMPOOL configuration parameter (Administrator's
Reference)
When the SQL statement cache (SSC) is enabled, the database server allocates
memory from the SSC pool for unlinked SQL statements. The default value for the
STMT_CACHE_NUMPOOL configuration parameter is 1. As the number of users
increases, this one SSC pool might become a bottleneck.
The number of longspins on the SSC pool indicates whether or not the SSC pool is
a bottleneck.
Use the onstat -g spi option to monitor the number of longspins on an SSC pool.
The onstat -g spi command displays a list of the resources in the system for which
a wait was required before a latch on the resource could be obtained. During the
wait, the thread spins (or loops), trying to acquire the resource. The onstat -g spi
output displays the number of times a wait (Num Waits column) was required for
the resource and the number of total loops (Num Loops column). The onstat -g spi
output displays only resources that have at least one wait.
Figure 4-6 shows an excerpt of sample output for onstat -g spi. Figure 4-6 indicates
that no waits occurred for any SSC pool (the Name column does not list any SSC
pools).
Use the onstat -g ssc pool option to monitor the usage of each SQL statement
cache (SSC) pool.
The onstat -g ssc pool command displays the size of each pool. The onstat -g ssc
option displays the cumulative size of the SQL statement cache in the currsize
column. This current size is the size of memory allocated from the SSC pools by
the statements that are inserted into the cache. Because not all statements that
allocate memory from the SSC pools are inserted into the cache, the current cache
size could be smaller than the total size of the SSC pools. Normally, the total size
of all SSC pools does not exceed the STMT_CACHE_SIZE value.
Pool Summary:
name class addr totalsize freesize #allocfrag #freefrag
sscpool0 V a7e4020 57344 2352 52 7
Blkpool Summary:
name class addr size #blks
The Pool Summary section of the onstat -g ssc pool output lists the following
information for each pool in the cache.
Column      Description
name        The name of the SQL statement cache (SSC) pool
class       The shared-memory segment type in which the pool has been created.
            For SSC pools, this value is always “V” for the virtual portion of
            shared memory.
addr        The shared-memory address of the SSC pool structure
totalsize   The total size, in bytes, of this SSC pool
freesize    The number of free bytes in this SSC pool
#allocfrag  The number of contiguous areas of memory in this SSC pool that are
            allocated
#freefrag   The number of contiguous areas of memory that are not used in this
            SSC pool
The Blkpool Summary section of the onstat -g ssc pool output lists the following
information for all pools in the cache.
Column      Description
name        The name of the SSC pool
class       The shared-memory segment type in which the pool has been created.
            For SSC pools, this value is always “V” for the virtual portion of
            shared memory.
addr        The shared-memory address of the SSC pool structure
totalsize   The total size, in bytes, of this SSC pool
#blks       The number of 8-kilobyte blocks that make up all the SSC pools
The onstat -g ssc command displays the following information for the SQL
statement cache.
Table 4-5. SQL statement cache information in onstat -g ssc output
Column     Description
#lrus      The number of LRU queues. Multiple LRU queues facilitate concurrent
           lookup and insertion of cache entries.
currsize   The number of bytes currently allocated to entries in the SQL
           statement cache
maxsize    The number of bytes specified in the STMT_CACHE_SIZE
           configuration parameter
poolsize   The cumulative number of bytes for all pools in the SQL statement
           cache. Use the onstat -g ssc pool option to monitor individual pool
           usage.
#hits      The setting of the STMT_CACHE_HITS configuration parameter, which
           specifies the number of times that a query is executed before it is
           inserted into the cache
nolimit    The setting of the STMT_CACHE_NOLIMIT configuration parameter
The onstat -g ssc command lists the following information for each fully cached
entry in the cache. The onstat -g ssc all option lists the following information for
both the fully cached entries and key-only entries.
Column Description
lru The LRU identifier
hash The hash-bucket identifier
Use the following utility options to determine which session and prepared SQL
statements have high memory utilization:
v onstat -g mem
v onstat -g stm
The onstat -g mem option displays memory usage of all sessions. You can find the
session that is using the most memory by looking at the totalsize and freesize
output columns. The following figure shows sample output for onstat -g mem.
This sample output shows the memory utilization of three user sessions with the
values 14, 16, 17 in the names output column.
onstat -g mem
Pool Summary:
name class addr totalsize freesize #allocfrag #freefrag
...
14 V a974020 45056 11960 99 10
16 V a9ea020 90112 10608 159 5
17 V a973020 45056 11304 97 13
...
Blkpool Summary:
name class addr size #blks
mt V a235688 798720 19
global V a232800 0 0
To display the memory allocated by each prepared statement, use the onstat -g stm
option. The following figure shows sample output for onstat -g stm.
onstat -g stm
session 25 --------------------------------------------------
sdblock heapsz statement ('*' = Open cursor)
d36b018 9216 select sum(i) from t where i between -1 and ?
d378018 6240 *select tabname from systables where tabid=7
d36b114 8400 <SPL statement>
The heapsz column in the output in Figure 4-9 shows the amount of memory used
by the statement. An asterisk (*) precedes the statement text if a cursor is open on
the statement. The output does not show the individual SQL statements in an SPL
routine.
To display the memory for only one session, specify the session ID in the onstat -g
stm option. For an example, see “Monitor session memory with onstat -g mem
and onstat -g stm output” on page 13-52.
The data-replication buffer is always the same size as the logical-log buffer.
Memory latches
The database server uses latches to control access to shared memory structures
such as the buffer pool or the memory pools for the SQL statement cache. You can
obtain statistics on latch use and information about specific latches. These statistics
provide a measure of the system activity.
The statistics include the number of times that threads waited to obtain a latch. A
large number of latch waits typically results from a high volume of processing
activity in which the database server is logging most of the transactions.
Information about specific latches includes a listing of all the latches that are held
by a thread and any threads that are waiting for latches. This information allows
you to locate any specific resource contentions that exist.
You, as the database administrator, cannot configure or tune the number of latches.
However, you can increase the number of memory structures on which the
database server places latches to reduce the number of latch waits. For example,
you can tune the number of SQL statement cache memory pools or the number of
SQL statement cache LRU queues. For more information, see “Multiple SQL
statement cache pools” on page 4-34.
Warning: Never stop a database server process that is holding a latch. If you do,
the database server immediately initiates an abort.
Figure 4-10 shows an excerpt of sample onstat -p output that shows the lchwaits
field.
...
ixda-RA idx-RA da-RA logrec-RA RA-pgsused lchwaits
5 0 204 0 148 12
You can compare this address with the user addresses in the onstat -u output to
obtain the user-process identification number.
...
Latches with lock or userthread set
name address lock wait userthread
LRU1 402e90 0 0 6b29d8
bf[34] 4467c0 0 0 6b29d8
...
The latchwts column of the sysprofile table contains the number of times that a
thread waited for a latch.
Encrypted values
An encrypted value uses more storage space than the corresponding plain text
value because all of the information needed to decrypt the value except the
encryption key is stored with the value.
Omitting the hint used with the password can reduce encryption overhead by up
to 50 bytes. If you are using encrypted values, you must make sure that you have
sufficient space available for the values.
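For example, the following sketch (the customer_acct table and its columns are
hypothetical) encrypts a value with a session password; omitting the WITH HINT
clause avoids storing a hint with each value and can save up to 50 bytes per
value, as noted above:
-- Set the session password; the hint is optional and adds storage overhead
SET ENCRYPTION PASSWORD 'k33p_s3cret' WITH HINT 'first pet';
-- The ssn column must be wide enough for the encrypted value, not the plain text
INSERT INTO customer_acct (acct_num, ssn)
    VALUES (1001, ENCRYPT_AES('123-45-6789'));
-- Decrypt by using the same session password
SELECT DECRYPT_CHAR(ssn) FROM customer_acct WHERE acct_num = 1001;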
All the data that resides in a database is stored on disk. The Optical Subsystem
also uses a magnetic disk to access TEXT or BYTE data that is retrieved from
optical media. The speed at which the database server can copy the appropriate
data pages to and from disk determines how well your application performs.
Disks are typically the slowest component in the I/O path for a transaction or
query that runs entirely on one host computer. Network communication can also
introduce delays in client/server applications, but these delays are typically
outside the control of the database server administrator. For information about
actions that the database server administrator can take to improve network
communications, see “Network buffer pools” on page 3-17 and “Connections and
CPU utilization” on page 3-25.
Disks can become overused or saturated when users request pages too often.
Saturation can occur in the following situations:
v You use a disk for multiple purposes, such as for both logging and active
database tables.
v Disparate data resides on the same disk.
v Table extents become interleaved.
Together with round-robin fragmentation, balancing chunks over disks and
controllers can save time and simplify error handling. Placing multiple chunks on
a single disk can improve throughput.
For more information about table placement and layout, see Chapter 6, “Table
performance considerations,” on page 6-1.
When dbspaces reside on raw disk devices (also called character-special devices), the
database server uses unbuffered disk access. A raw disk device transfers data
directly between the database server memory and disk, without an intermediate
copy in operating-system buffers.
To determine the best performance, perform benchmark testing for the dbspace
and table layout on your system.
Direct I/O also allows the use of kernel asynchronous I/O (KAIO), which can
further improve performance. By using direct I/O and KAIO where available, the
performance of cooked files used for dbspace chunks can approach the
performance of raw devices.
If your file system supports direct I/O for the page size used for the dbspace
chunk, the database server operates as follows:
v Does not use direct I/O by default.
v Uses direct I/O if the DIRECT_IO configuration parameter is set to 1.
v Uses KAIO (if the file system supports it) with direct I/O by default.
v Does not use KAIO with direct I/O if the environment variable KAIOOFF is set.
If Informix uses direct I/O for a chunk, and another program tries to open the
chunk file without using direct I/O, the open will normally succeed, but there can
be a performance penalty. The penalty can occur because the file system attempts
to ensure that each open sees the same file data, either by switching to buffered
I/O and not using direct I/O for the duration of the conflicting open, or by
flushing the file system cache before each direct I/O operation and invalidating the
file system cache after each direct write.
Concurrent I/O can be especially beneficial when you have data in a single chunk
file striped across multiple disks.
If Informix uses concurrent I/O for a chunk, and another program (such as an
external backup program) tries to open the same chunk file without using
concurrent I/O, the open operation will fail.
Informix does not use direct or concurrent I/O for cooked files used for temporary
dbspace chunks.
Related reference:
DIRECT_IO configuration parameter (UNIX) (Administrator's Reference)
Prerequisites:
v You must log on as user root or informix.
v Direct I/O or concurrent I/O must be available and the file system must
support direct I/O for the page size used for the dbspace chunk.
To enable concurrent I/O with direct I/O on AIX operating systems, set the
DIRECT_IO configuration parameter to 2.
If you do not want to enable direct I/O or concurrent I/O, set the DIRECT_IO
configuration parameter to 0.
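For reference, the DIRECT_IO settings described above appear in the ONCONFIG
file as follows (choose one value; this is a sketch, not a complete configuration):
# 0 = no direct I/O (default)
# 1 = direct I/O (with KAIO where the file system supports it)
# 2 = direct I/O plus concurrent I/O (AIX only)
DIRECT_IO 2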
Related reference:
DIRECT_IO configuration parameter (UNIX) (Administrator's Reference)
You can confirm the use of direct I/O or concurrent I/O by:
v Displaying onstat -d information.
The onstat -d command displays information that includes a flag that identifies
whether direct I/O, concurrent I/O (on AIX), or neither is used for cooked file
chunks.
v Verifying that the DIRECT_IO configuration parameter is set to 1 (for direct I/O)
or 2 (for concurrent I/O).
To arrive at an appropriate placement strategy for critical data, you must make a
trade-off between the availability of data and maximum logging performance.
The database server also places temporary table and sort files in the root dbspace
by default. You should use the DBSPACETEMP configuration parameter and the
DBSPACETEMP environment variable to assign these tables and files to other
dbspaces. For details, see “Configure dbspaces for temporary tables and sort files”
on page 5-8.
The database server uses different methods to configure various portions of critical
data. To assign an appropriate dbspace for the root dbspace and physical log, set
the appropriate database server configuration parameters. To assign the logical-log
files to an appropriate dbspace, use the onparams utility.
For more information about the configuration parameters that affect each portion
of critical data, see “Configuration parameters that affect critical data” on page 5-8.
Most modern storage devices have excellent mirroring capabilities, and you can
use those devices instead of the mirroring capabilities of the database server.
When mirroring is in effect, two disks are available to handle read requests, and
the database server can process a higher volume of those requests. However, each
write request requires two physical write operations and does not complete until
both physical operations are performed. The write operations are performed in
parallel, but the request does not complete until the slower of the two disks
performs the update. Thus, you experience a slight performance penalty when you
mirror write-intensive dbspaces.
When you place update-intensive tables in other, nonmirrored dbspaces, you can
use the database server backup-and-restore facilities to perform warm restores of
those tables in the event of a disk failure. When the root dbspace is mirrored, the
database server remains online to service other transactions while the failed disk is
being repaired.
When you mirror the root dbspace, always place the first chunk on a different
device than that of the mirror. The MIRRORPATH configuration parameter should
have a different value than ROOTPATH.
Related reference:
MIRRORPATH configuration parameter (Administrator's Reference)
ROOTPATH configuration parameter (Administrator's Reference)
An sbspace is a logical storage unit composed of one or more chunks that store
smart large objects, which consist of CLOB (character large object) or BLOB (binary
large object) data.
The first chunk of an sbspace contains a special set of pages, called metadata, which
is used to locate smart large objects in the sbspace. Additional chunks that are
added to the sbspace can also have metadata pages if you specify them on the
onspaces command when you create the chunk.
Consider mirroring chunks that contain metadata pages for the following reasons:
v Higher availability
Without access to the metadata pages, users cannot access any smart large
objects in the sbspace. If the first chunk of the sbspace contains all of the
metadata pages and the disk that contains that chunk becomes unavailable, you
cannot access a smart large object in the sbspace, even if it resides on a chunk on
a different disk.
With unbuffered and ANSI-compliant logging, the database server requests a flush
of the log buffer to disk for every committed transaction (two when the dbspace is
mirrored). Buffered logging generates far fewer I/O requests than unbuffered or
ANSI-compliant logging.
With buffered logging, the log buffer is written to disk only when it fills and all
the transactions that it contains are completed. You can reduce the frequency of
logical-log I/O even more if you increase the size of your logical-log buffers.
However, buffered logging leaves transactions in any partially filled buffers
vulnerable to loss in the event of a system failure.
Unlike the physical log, you cannot specify an alternative dbspace for logical-log
files in your initial database server configuration. Instead, use the onparams utility
first to add logical-log files to an alternative dbspace and then drop logical-log files
from the root dbspace.
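For example, the following sketch (the dbspace name logdbs, the size, and the log
file number are hypothetical) adds a logical-log file to another dbspace and then
drops one of the logical-log files from the root dbspace:
# Add a 10,000 KB logical-log file to dbspace logdbs
onparams -a -d logdbs -s 10000
# Drop logical-log file number 3 without prompting for confirmation
onparams -d -l 3 -y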
Related reference:
The onparams Utility (Administrator's Reference)
To keep I/O to the physical log at a minimum, you can adjust the checkpoint
interval and the LRU minimum and maximum thresholds. (See “CKPTINTVL and
its effect on checkpoints” on page 5-31 and “BUFFERPOOL and its effect on page
cleaning” on page 5-40.)
You can use the following configuration parameters to configure the root dbspace:
v ROOTNAME
v ROOTOFFSET
v ROOTPATH
v ROOTSIZE
v MIRROR
v MIRRORPATH
v MIRROROFFSET
These parameters determine the location and size of the initial chunk of the root
dbspace and configure mirroring, if any, for that chunk. (If the initial chunk is
mirrored, all other chunks in the root dbspace must also be mirrored). Otherwise,
these parameters have no major impact on performance.
LOGSIZE determines the size of each logical-log file. LOGBUFF determines the
size of the three logical-log buffers that are in shared memory. For more
information about LOGBUFF, see “The LOGBUFF configuration parameter and
memory utilization” on page 4-15.
The following configuration parameters determine the location and size of the
physical log:
v PHYSDBS
v PHYSFILE
Depending on how the temporary space is created, the database server uses the
following default locations for temporary table and sort files when you do not set
DBSPACETEMP:
v The dbspace of the current database, when you create an explicit temporary
table with the TEMP TABLE clause of the CREATE TABLE statement and do not
specify a dbspace for the table either in the IN dbspace clause or in the
FRAGMENT BY clause
v The root dbspace, when you create an explicit temporary table with the INTO
TEMP option of the SELECT statement
Placing temporary tables and sort files in the root dbspace can severely affect I/O
to that dbspace. If the root dbspace is mirrored, you encounter a slight
double-write performance penalty for I/O to the temporary tables and sort files.
You can improve performance with the use of temporary dbspaces that you create
exclusively to store temporary tables and sort files. Use the DBSPACETEMP
configuration parameter and the DBSPACETEMP environment variable to assign
these tables and files to temporary dbspaces.
To create a dbspace for the exclusive use of temporary tables and sort files, use
onspaces -t. For best performance, use the following guidelines:
v If you create more than one temporary dbspace, create each dbspace on a
separate disk to balance the I/O impact.
v Place no more than one temporary dbspace on a single disk.
You cannot mirror a temporary dbspace that you create with onspaces -t.
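For example, the following sketch (device paths and sizes are hypothetical)
creates two temporary dbspaces on separate disks and then lists them in the
DBSPACETEMP configuration parameter:
onspaces -c -d tempdbs1 -t -p /dev/rtempdbs1 -o 0 -s 500000
onspaces -c -d tempdbs2 -t -p /dev/rtempdbs2 -o 0 -s 500000
# In the ONCONFIG file:
DBSPACETEMP tempdbs1,tempdbs2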
Important: In the case of a database with logging, you must include the WITH NO
LOG clause in the SELECT... INTO TEMP statement to place the explicit temporary
tables in the dbspaces listed in the DBSPACETEMP configuration parameter and
the DBSPACETEMP environment variable. Otherwise, the database server stores
the explicit temporary tables in the root dbspace.
Related reference:
If the database server inserts data into a temporary table through a SELECT INTO
TEMP operation that creates the TEMP table, that temporary table uses
round-robin distributed storage. Its fragments are created in the temporary
dbspaces that are listed in the DBSPACETEMP configuration parameter or in the
DBSPACETEMP environment variable. For example, the following query uses
round-robin distributed storage:
SELECT col1 FROM tab1
INTO TEMP temptab1 WITH NO LOG;
You can use the following guidelines to estimate the amount of temporary space to
allocate:
v For OLTP applications, allocate temporary dbspaces that equal at least 10 percent
of the table.
v For DSS applications, allocate temporary dbspaces that equal at least 50 percent
of the table.
A hash join, which works by building a table (the hash table) from the rows in one
of the tables in a join, and then probing it with rows from the other table, can use
a significant amount of memory and can potentially overflow to temporary space
on disk. The hash table size is governed by the size of the table used to build the
hash table (which is often the smaller of the two tables in the join), after applying
any filters, which can reduce the number of rows and possibly reduce the number
of columns.
Hash-join partitions are organized into pages. Each page has a header. The header
and tuples are larger in databases on 64-bit platforms than in builds on 32-bit
platforms. The size of each page is the base page size (2 KB or 4 KB, depending
on the system) unless a single row needs more space, in which case the page size
increases to the smallest multiple of the base page size that can hold the row.
You can use the following formula to estimate the amount of memory that is
required for the hash table in a hash join:
hash_table_size = (32 bytes + row_size_smalltab) * num_rows_smalltab
where row_size_smalltab and num_rows_smalltab refer to the row size and the
number of rows, respectively, in the smaller of the two tables participating in the
hash join.
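For example, if the smaller table holds 100,000 rows that average 64 bytes each
after filters are applied, the hash table needs approximately:
hash_table_size = (32 bytes + 64 bytes) * 100,000 = 9,600,000 bytes (about 9.2 MB)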
You can use the following formula to estimate the amount of space that the
hash-join partitions require if the hash table overflows to disk, where
rows_per_page is the number of rows of the smaller table that fit on one page (if
the value of rows_per_page is less than one, increase the page_size value to the
smallest multiple of the base_page_size that can hold a row):
size = (num_rows_smalltab / rows_per_page) * page_size
For more information, see “Hash join” on page 10-3 and “Configuring memory for
queries with hash joins, aggregates, and other memory-intensive elements” on
page 13-34.
Related reference:
DS_NONPDQ_QUERY_MEM configuration parameter (Administrator's
Reference)
When the value of PDQ priority is 0 and PSORT_NPROCS is greater than 1, the
database server uses parallel sorts. The management of PDQ does not limit these
sorts. In other words, although the sort is executed in parallel, the database server
does not regard sorting as a PDQ activity. When PDQ priority is 0, the database
server does not control sorting by any of the PDQ configuration parameters.
When PDQ priority is greater than 0 and PSORT_NPROCS is greater than 1, the
query benefits both from parallel sorts and from PDQ features such as parallel
scans and additional memory. Users can use the PDQPRIORITY environment
variable to request a specific proportion of PDQ resources for a query. You can use
the MAX_PDQPRIORITY parameter to limit the number of such user requests. For
more information about MAX_PDQPRIORITY, see “Limiting PDQ resources in
queries” on page 3-11.
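For example, a user might request PDQ resources and parallel sorting for a
session as follows, while the administrator caps such requests at run time (the
values shown are illustrative):
export PDQPRIORITY=50   # request 50 percent of PDQ resources
export PSORT_NPROCS=4   # allow up to 4 parallel sort threads
onmode -D 50            # administrator: set MAX_PDQPRIORITY to 50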
The database server allocates a relatively small amount of memory for sorting, and
that memory is divided among the PSORT_NPROCS sort threads. Sort processes
use temporary space on disk when not enough memory is allocated. For more
information about memory allocated for sorting, see “Estimating memory needed
for sorting” on page 7-19.
For more information about sorts during index builds, see “Improving
performance for index builds” on page 7-18.
Important: In the case of a database with logging, you must include the WITH NO
LOG clause in the SELECT... INTO TEMP statement to place the temporary smart
large objects in the sbspaces listed in the SBSPACETEMP configuration parameter.
Otherwise, the database server stores the temporary smart large objects in the
sbspace listed in the SBSPACENAME configuration parameter.
Related tasks:
Creating a temporary sbspace (Administrator's Guide)
Related reference:
onspaces -c -S: Create an sbspace (Administrator's Reference)
A blobspace is a logical storage unit composed of one or more chunks that store
only simple large objects (TEXT or BYTE data). For information about sbspaces,
which store smart large objects (such as BLOB, CLOB, or multirepresentational
data), see “Factors that affect I/O for smart large objects” on page 5-20.
If you use a blobspace, you can store simple large objects on a separate disk from
the table with which the data is associated. You can store simple large objects
associated with different tables in the same blobspace.
You can create a blobspace with the onspaces utility or with an SQL administration
API command that uses the create blobspace argument with the admin() or task()
function.
You assign simple large objects to a blobspace when you create the tables with
which simple large objects are associated, using the CREATE TABLE statement.
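For example, the following sketch (names, device path, and sizes are
hypothetical) creates a blobspace with blobpages of four disk pages and assigns a
BYTE column to it:
onspaces -c -b photo_bs -g 4 -p /dev/rphoto_bs -o 0 -s 100000
CREATE TABLE catalog_photos
    (
    photo_id INTEGER,
    photo BYTE IN photo_bs
    );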
Simple large objects are not logged and do not pass through the buffer pool.
However, frequency of checkpoints can affect applications that access TEXT or
BYTE data. For more information, see “LOGSIZE and LOGFILES and their effect
on checkpoints” on page 5-32.
Related reference:
For more information, see “Storing simple large objects in the tblspace or a
separate blobspace” on page 6-8.
The optimal blobpage size for your configuration depends on the following factors:
v The size distribution of the simple large objects
v The trade-off between retrieval speed for your largest simple large object and the
amount of disk space that is wasted by storing simple large objects in large
blobpages
To retrieve simple large objects as quickly as possible, use the size of your largest
simple large object rounded up to the nearest disk-page-sized increment. This
scheme guarantees that the database server can retrieve even the largest simple
large object in a single I/O request. Although this scheme guarantees the fastest
retrieval, it has the potential to waste disk space. Because simple large objects are
stored in their own blobpage (or set of blobpages), the database server reserves the
same amount of disk space for every blobpage even if the simple large object takes
up a fraction of that page. Using a smaller blobpage allows you to make better use
of your disk, especially when large differences exist in the sizes of your simple
large objects.
To achieve the greatest theoretical utilization of space on your disk, you can make
your blobpage the same size as a standard disk page. Then many, if not most,
simple large objects would require several blobpages. Because the database server
acquires a lock and issues a separate I/O request for each blobpage, this scheme
performs poorly.
In practice, a balanced scheme for sizing uses the most frequently occurring
simple-large-object size as the size of a blobpage.
Tip: If a table has more than one simple-large-object column and the data values are not
close in size, store the data in different blobspaces, each with an appropriately sized
blobpage.
Blobpage fullness refers to the amount of data within each blobpage. TEXT and
BYTE data stored in a blobspace cannot share blobpages. Therefore, if a single
simple large object requires only 20 percent of a blobpage, the remaining 80
percent of the page is unavailable for use.
However, avoid making the blobpages too small. When several blobpages are
needed to store each simple large object, you increase the overhead cost of storage.
For example, more locks are required for updates, because a lock must be acquired
for each blobpage.
The oncheck -pB command lists the following statistics for each table (or database):
v The number of blobpages used by the table (or database) in each blobspace
v The average fullness of the blobpages used by each simple large object stored as
part of the table (or database)
If you find that the statistics for a significant number of simple large objects show
a low percentage of fullness, the database server might benefit from changing the
size of the blobpage in the blobspace.
Both the oncheck -pB and onstat -d update commands display information about
the number of free blobpages. The onstat -d update command displays the same
information as onstat -d, plus an accurate number of free blobpages for each
blobspace chunk.
Execute oncheck -pB with either a database name or a table name as a parameter.
The following example retrieves storage information for all simple large objects
stored in the table sriram.catalog in the stores_demo database:
oncheck -pB stores_demo:sriram.catalog
BLOBSpace usage:
Space Page Percent Full
Name Number Pages 0-25% 26-50% 51-75% 76-100%
-------------------------------------------------------------
blobPIC 0x300080 1 x
blobPIC 0x300082 2 x
------
Page Size is 6144 3
bspc1 0x2000b2 2 x
bspc1 0x2000b6 2 x
------
Page Size is 2048 4
Space Name is the name of the blobspace that contains one or more simple large
objects stored as part of the table (or database).
Page Number is the starting address in the blobspace of a specific simple large
object.
Pages is the number of database server pages required to store this simple
large object.
Percent Full is a measure of the average blobpage fullness, by blobspace, for each
blobspace in this table or database.
Page Size is the size in bytes of the blobpage for this blobspace. Blobpage size is
always a multiple of the database server page size.
The summary information that appears at the top of the display, Total pages used
by table, is a simple total of the blobpages needed to store simple large objects. The
total says nothing about the size of the blobpages used, the number of simple large
objects stored, or the total number of bytes stored.
The efficiency information displayed under the Percent Full heading is imprecise,
but it can alert an administrator to trends in the storage of TEXT and BYTE data.
You can analyze the output of the oncheck -pB command to calculate average
fullness.
The first simple large object listed in “Determine blobpage fullness with oncheck
-pB output” on page 5-18 is stored in the blobspace blobPIC and requires one
6144-byte blobpage. The blobpage is 51 to 75 percent full, meaning that the size is
between 0.51 * 6144 = 3133 bytes and 0.75 * 6144 = 4608. The maximum size of this
simple large object must be less than or equal to 75 percent of 6144 bytes, or 4608
bytes.
The second object listed under blobspace blobPIC requires two 6144-byte
blobpages for storage, or a total of 12,288 bytes. The average fullness of all
allocated blobpages is 51 to 75 percent. Therefore, the minimum size of the object
must be greater than 50 percent of 12,288 bytes, or 6144 bytes. The maximum size
of the simple large object must be less than or equal to 75 percent of 12,288 bytes,
or 9216 bytes. The average fullness does not mean that each page is 51 to 75
percent full. A calculation would yield 51 to 75 percent average fullness for two
blobpages where the first blobpage is 100 percent full and the second blobpage is 2
to 50 percent full.
Now consider the two simple large objects in blobspace bspc1. These two objects
appear to be nearly the same size. Both objects require two 2048-byte blobpages,
and the average fullness for each is 76 to 100 percent. The minimum size for these
simple large objects must be greater than 75 percent of the allocated blobpages, or
3072 bytes. The maximum size for each object is slightly less than 4096 bytes
(allowing for overhead).
You can analyze the output of the oncheck -pB command to determine if there is a
more efficient storage strategy.
Looking at the efficiency information that is shown for blobspace bspc1 in
Figure 5-1 on page 5-18, a database server administrator might decide that a better
storage strategy for TEXT and BYTE data would be to double the blobpage size
from 2048 bytes to 4096 bytes. (Blobpage size is always a multiple of the database
server page size.) If the database server administrator made this change, the
measure of page fullness would remain the same, but the number of locks needed
during an update of a simple large object would be reduced by half.
The DataBlade® API and the Informix ESQL/C application programming interface
also provide functions that affect I/O operations for smart large objects.
Important: For most applications, you should use the values that the database
server calculates for the disk-storage information.
Related concepts:
Sbspaces (Administrator's Guide)
Related reference:
What is Informix ESQL/C? (ESQL/C Guide)
DataBlade API overview (DataBlade API Guide)
To create an sbspace, use the onspaces utility. You assign smart large objects to an
sbspace when you use the CREATE TABLE statement to create the tables with
which the smart large objects are associated.
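For example, the following sketch (names, device path, and size are hypothetical)
creates an sbspace and uses the PUT clause of the CREATE TABLE statement to
assign a CLOB column to it:
onspaces -c -S docs_sb -p /dev/rdocs_sb -o 0 -s 200000
CREATE TABLE documents
    (
    doc_id INTEGER,
    body CLOB
    ) PUT body IN (docs_sb);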
Related reference:
onspaces -c -S: Create an sbspace (Administrator's Reference)
CREATE TABLE statement (SQL Syntax)
By default, the database server reads smart large objects into the buffers in the
resident portion of shared memory. For more information on using lightweight I/O
buffers, see “Lightweight I/O for smart large objects” on page 5-23.
If you log smart-large-object user data, increase the size of your logical-log buffer
to prevent frequent flushing to these log files on disk.
Related reference:
SBSPACENAME configuration parameter (Administrator's Reference)
BUFFERPOOL configuration parameter (Administrator's Reference)
LOGBUFF configuration parameter (Administrator's Reference)
Sbspace extents
As you add smart large objects to a table, the database server allocates disk space
to the sbspace in units called extents. Each extent is a block of physically
contiguous pages from the sbspace.
Even when the sbspace includes more than one chunk, each extent is allocated
entirely within a single chunk so that it remains contiguous. Contiguity is
important to I/O performance.
When the pages of data are contiguous, disk-arm motion is minimized when the
database server reads the rows sequentially. The mechanism of extents is a
compromise between the following competing requirements:
v The size of some smart large objects is not known in advance.
v The number of smart large objects in different tables can grow at different times
and different rates.
v All the pages of a single smart large object should ideally be adjacent for best
performance when you retrieve the entire object.
Because you might not be able to predict the number and size of smart large
objects, you cannot specify the extent length of smart large objects. Instead, the
database server adds extents as they are needed. You can influence the extent size
in the following ways:
v The DataBlade API mi_lo_specset_estbytes() function or the Informix ESQL/C
ifx_lo_specset_estbytes() function, with which you specify the estimated final
size of the smart large object
These functions are the best way to set the extent size because they reduce the
number of extents in a smart large object. The database server tries to allocate the
entire smart large object as one extent (if an extent of that size is available in the
chunk).
v The EXTENT_SIZE flag in the -Df option of the onspaces command when you
create or alter the sbspace
Most administrators do not use the onspaces EXTENT_SIZE flag because the
database server calculates the extent size from heuristics. However, you might
consider using the onspaces EXTENT_SIZE flag in the following situations:
– Many one-page extents are scattered throughout the sbspace.
– Almost all smart large objects are the same length.
v The EXTENT SIZE keyword of the CREATE TABLE statement when you define
the CLOB or BLOB column
Most administrators do not use the EXTENT SIZE keyword when they create or
alter a table because the database server calculates the extent size from
heuristics. However, you might consider using this EXTENT SIZE keyword if
almost all smart large objects are the same length.
Important: For most applications, you should use the values that the database
server calculates for the extent size. Do not use the DataBlade API
mi_lo_specset_extsz function or the Informix ESQL/C ifx_lo_specset_extsz
function to set the extent size of the smart large object.
If you know the size of the smart large object, it is recommended that you specify
the size in the DataBlade API mi_lo_specset_estbytes() function or Informix
ESQL/C ifx_lo_specset_estbytes() function instead of in the onspaces utility or the
CREATE TABLE or the ALTER TABLE statement. These functions are the best way
to set the extent size because the database server allocates the entire smart large
object as one extent (if it has contiguous storage in the chunk).
Extent sizes over one megabyte do not provide much I/O benefit because the
database server performs read and write operations in multiples of 60 kilobytes at
a time.
For more information, see “Improving metadata I/O for smart large objects” on
page 6-12.
By default, smart large objects pass through the buffer pool in the resident portion
of shared memory. Although smart large objects have lower priority than other
data, the buffer pool can become full when an application accesses many smart
large objects. A single application can fill the buffer pool with smart large objects
and leave little room for data that other applications might need. In addition, when
the database server performs scans of many pages into the buffer pool, the
overhead and contention associated with checking individual pages in and out
might become a bottleneck.
Important: Use private buffers only when you read or write smart large objects in
read or write operations greater than 8080 bytes and you seldom access them. That
is, if you have infrequent read or write function calls that read large amounts of
data in a single function invocation, lightweight I/O can improve I/O
performance.
Related concepts:
“The BUFFERPOOL configuration parameter and memory utilization” on page
4-10
When you use lightweight I/O buffers for smart large objects, the database server
might read several pages with one I/O operation. A single I/O operation reads in
several smart-large-object pages, up to the size of an extent. For information about
when to specify extent size, see “Sbspace extents” on page 5-21.
To specify the use of lightweight I/O when creating the sbspace, use the
BUFFERING tag of the -Df option in the onspaces -c -S command.
The default value for BUFFERING is ON, which means to use the buffer pool. The
buffering mode that you specify (or the default, if you do not specify) in the
onspaces command is the default buffering mode for all smart large objects stored
within the sbspace.
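For example, the following sketch (name, device path, and size are hypothetical)
creates an sbspace whose default buffering mode is lightweight I/O:
onspaces -c -S video_sb -p /dev/rvideo_sb -o 0 -s 500000 -Df "BUFFERING=OFF"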
Important: In general, if read and write operations to the smart large objects are
less than 8080 bytes, do not specify a buffering mode when you create the sbspace.
If you are reading or writing short blocks of data, such as 2 kilobytes or 4
kilobytes, leave the default of “buffering=ON” to obtain better performance.
Programmers can override the default buffering mode when they create, open, or
alter a smart-large-object instance with DataBlade API and the Informix ESQL/C
functions. The DataBlade API and the Informix ESQL/C application programming
interface provide the LO_NOBUFFER flag to allow lightweight I/O for smart large
objects.
Important: Use the LO_NOBUFFER flag only when you read or write smart large
objects in operations greater than 8080 bytes and you seldom access them. That is,
if you have infrequent read or write function calls that read large amounts of data
in a single function invocation, lightweight I/O can improve I/O performance.
Related reference:
onspaces -c -S: Create an sbspace (Administrator's Reference)
What is Informix ESQL/C? (ESQL/C Guide)
DataBlade API overview (DataBlade API Guide)
Logging
If you decide to log all write operations on data stored in sbspaces, logical-log I/O
activity and memory utilization increases.
For more information, see “Configuration parameters that affect sbspace I/O” on
page 5-20.
The memory cache is a common storage area. The database server adds simple
large objects requested by any application to the memory cache if the cache has
space. To free space in the memory cache, the application must release the TEXT or
BYTE data that it is using.
A significant performance advantage occurs when you retrieve TEXT or BYTE data
directly into memory instead of buffering that data on disk. Therefore, proper
cache sizing is important when you use the Optical Subsystem. You specify the
total amount of space available in the memory cache with the OPCACHEMAX
configuration parameter. Applications indicate that they require access to a portion
Simple large objects that cannot fit entirely into the space that remains in the cache
are stored in the blobspace that the STAGEBLOB configuration parameter names.
This staging area acts as a secondary cache on disk for blobpages that are retrieved
from the Optical Subsystem. Simple large objects that are retrieved from the
Optical Subsystem are held in the staging area until the transactions that requested
them are complete.
The database server administrator creates the staging-area blobspace with the
onspaces utility or with ON-Monitor (UNIX only).
You can use onstat -O to monitor utilization of the memory cache and
STAGEBLOB blobspace. If contention develops for the memory cache, increase the
value listed in the configuration file for OPCACHEMAX. (The new value takes
effect the next time that the database server starts shared memory.) For a complete
description of the Optical Subsystem, see the IBM Informix Optical Subsystem Guide.
If the configuration file does not list the STAGEBLOB parameter, the database
server does not recognize the optical-storage subsystem.
The structure of the staging-area blobspace is the same as all other database server
blobspaces. When the database server administrator creates the staging area, it
consists of only one chunk, but you can add more chunks as desired. You cannot
mirror the staging-area blobspace. The optimal size for the staging-area blobspace
depends on the following factors:
v The frequency of simple-large-object storage
v The frequency of simple-large-object retrieval
v The average size of the simple large object to be stored
To calculate the size of the staging-area blobspace, you must estimate the number
of simple large objects that you expect to reside there simultaneously and multiply
that number by the average simple-large-object size.
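For example, if you expect at most 40 simple large objects, averaging 50 KB each,
to reside in the staging area simultaneously, size the staging-area blobspace at
approximately 40 * 50 KB = 2000 KB.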
Related reference:
STAGEBLOB configuration parameter (Administrator's Reference)
Until the memory cache fills, it stores simple large objects that are requested by
any application. Simple large objects that cannot fit in the cache are stored on disk
in the blobspace that the STAGEBLOB configuration parameter indicates. You can
increase the size of the cache to reduce contention among simple-large-object
requests and to improve performance for requests that involve the Optical
Subsystem.
Related reference:
OPCACHEMAX configuration parameter (UNIX) (Administrator's Reference)
If the value of the INFORMIXOPCACHE environment variable exceeds the
maximum that the OPCACHEMAX configuration parameter specifies,
OPCACHEMAX is used instead. If
INFORMIXOPCACHE is not set in the environment, the cache size is set to
OPCACHEMAX by default.
Related reference:
INFORMIXOPCACHE environment variable (SQL Reference)
Table I/O
One of the most frequent functions that the database server performs is to bring
data and index pages from disk into memory. Pages can be read individually for
brief transactions and sequentially for some queries. You can configure the number
of pages that the database server brings into memory, and you can configure the
timing of I/O requests for sequential scans.
You can also indicate how the database server is to respond when a query requests
data from a dbspace that is temporarily unavailable.
For information about I/O for smart large objects, see “Factors that affect I/O for
smart large objects” on page 5-20.
Sequential scans
When the database server performs a sequential scan of data or index pages, most
of the I/O wait time is caused by seeking the appropriate starting page. To
dramatically improve performance for sequential scans, you can bring in a number
of contiguous pages with each I/O operation.
The action of bringing additional pages along with the first page in a sequential
scan is called read ahead.
Light scans
Some sequential scans of tables can use light scans to read the data. A light scan
bypasses the buffer pool by utilizing session memory to read directly from disk.
Light scans can provide performance advantages over use of the buffer pool for
sequential scans and skip scans of large tables. These advantages include:
v Bypassing the overhead of the buffer pool when many data pages are read
v Preventing frequently accessed pages from being forced out of the buffer pool
when many sequential pages are read for a single query.
Light scans are only performed on user tables whose data rows are stored in
tblspaces. Light scans are not used to access indexes, or to access data stored in
blobspaces, sbspaces, or partition blobs. Similarly, light scans are not used
to access data in the system catalog tables, nor in the tables and pseudotables of
system databases like sysadmin, sysmaster, sysuser, and sysutils.
If you have a long-running scan, you can view output from the onstat -g scn
command to check the progress of the scan, to determine how long the scan will
take before it completes, and to see whether the scan is a light scan or a bufferpool
scan.
The following example shows some of the output from onstat -g scn for a light
scan. The word Light in the Scan Type field identifies the scan as a light scan.
SesID Thread Partnum Rowid Rows Scan'd Scan Type Lock Mode Notes
17    48     300002  207   15          Light               Forward row lookup
Related reference:
BATCHEDREAD_TABLE configuration parameter (Administrator's Reference)
onstat -g scn command: Print scan information (Administrator's Reference)
Unavailable data
Another aspect of table I/O pertains to situations in which a query requests access
to a table or fragment in a dbspace that is temporarily unavailable. When the
database server determines that a dbspace is unavailable as the result of a disk
failure, queries directed to that dbspace fail by default. The database server allows
you to specify dbspaces that, when unavailable, can be skipped by queries.
For information about specifying dbspaces that, when unavailable, can be skipped
by queries, see “How DATASKIP affects table I/O” on page 5-29.
When data skipping is enabled, the database server sets the sixth character in the
SQLWARN array to W.
Warning: The database server cannot determine whether the results of a query are
consistent when a dbspace is skipped. If the dbspace contains a table fragment, the
user who executes the query must ensure that the rows within that fragment are
not needed for an accurate query result. Turning DATASKIP on allows queries
with incomplete data to return results that can be inconsistent with the actual state
of the database. Without proper care, that data can yield incorrect or misleading
query results.
Related concepts:
SQLWARN array (SQL Tutorial)
Related reference:
DATASKIP Configuration Parameter (Administrator's Reference)
These overhead activities take time away from queries and transactions. If you do
not configure background I/O activities properly, too much overhead for these
activities can limit the transaction throughput of your application.
For the most part, tuning your background I/O activities involves striking a
balance between appropriate checkpoint intervals, logging modes and log sizes,
and page-cleaning thresholds. The thresholds and intervals that trigger background
I/O activity often interact; adjustments to one threshold might shift the
performance bottleneck to another.
The database server prints warning messages in the message log if the server
cannot meet the RTO_SERVER_RESTART policy.
Related reference:
RTO_SERVER_RESTART configuration parameter (Administrator's Reference)
You can turn off automatic checkpoint tuning by running the onmode -wf
AUTO_CKPTS=0 command or by setting the AUTO_CKPTS configuration
parameter to 0.
Because the database server does not block transactions during checkpoint
processing, LRU flushing is relaxed. If the server is not able to complete checkpoint
processing before the physical log is full (which causes transaction blocking), and if
you cannot increase the size of the physical log, you can configure the server for
more aggressive LRU flushing.
Automatic LRU tuning affects all buffer pools and adjusts lru_min_dirty and
lru_max_dirty values in the BUFFERPOOL configuration parameter.
Related concepts:
“LRU tuning” on page 5-45
Related reference:
AUTO_CKPTS configuration parameter (Administrator's Reference)
AUTO_AIOVPS configuration parameter (Administrator's Reference)
BUFFERPOOL configuration parameter (Administrator's Reference)
The database server can skip a checkpoint if all data is physically consistent when
the checkpoint interval expires.
If you set CKPTINTVL to a long interval, you can use physical-log capacity to
trigger checkpoints based on actual database activity instead of an arbitrary time
unit. However, a long checkpoint interval can increase the time needed for
recovery in the event of a failure. Depending on your throughput and
data-availability requirements, you can choose an initial checkpoint interval of 5,
10, or 15 minutes, with the understanding that checkpoints might occur more
often, depending on physical-logging activity.
The database server writes a message to the message log to note the time that it
completes a checkpoint. To read these messages, use onstat -m.
Related reference:
CKPTINTVL configuration parameter (Administrator's Reference)
If you need to free the logical-log file that contains the last checkpoint, the
database server must write a new checkpoint record to the current logical-log file.
If the frequency with which logical-log files are backed up and freed increases, the
frequency at which checkpoints occur increases. Although checkpoints block user
processing, they no longer last as long. Because other factors (such as the
physical-log size) also determine the checkpoint frequency, this effect might not be
significant.
When the dynamic log allocation feature is enabled, the size of the logical log does
not affect the thresholds for long transactions as much as it did in previous
versions of the database server. For details, see “LTXHWM and LTXEHWM and
their effect on logging” on page 5-38.
The rate at which transactions generate physical log activity can affect checkpoint
performance. To avoid transaction blocking during checkpoint processing, consider
the size of the physical log and how quickly it fills.
Similarly, you can define a smaller physical log if your application repeatedly
updates the same pages. The database server writes the before-image of only the
first update
that is made to a page for the following operations:
v Inserts, updates, and deletes for rows that contain user-defined data types
(UDTs), smart large objects, and simple large objects
v ALTER statements
v Operations that create or modify indexes (B-tree, R-tree, or user-defined indexes)
Because the physical log is recycled after each checkpoint, the physical log must be
large enough to hold before-images from changes between checkpoints. If the
database server frequently triggers checkpoints because it runs out of physical log
space, consider increasing the size of the physical log.
You can use the onparams utility to change the physical log location and size. You
can change the physical log while transactions are active and without restarting the
database server.
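For example, an onparams command like the following moves the physical log to
a dbspace named plogdbs and sets its size to 4,000,000 KB; the dbspace name and
size are illustrative, and the -y option answers yes to the confirmation prompt:
onparams -p -s 4000000 -d plogdbs -y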
Related concepts:
Strategy for estimating the size of the physical log (Administrator's Guide)
Related reference:
PHYSFILE configuration parameter (Administrator's Reference)
Change the physical-log location and size (Administrator's Guide)
To restore access to that database, you must back up all logical logs and then
perform a warm restore on the down dbspace.
The database server halts operation whenever a disabling I/O error occurs on a
nonmirrored dbspace that contains critical data, regardless of the setting for
ONDBSPACEDOWN. In such an event, you must perform a cold restore of the
database server to resume normal database operations.
When you set ONDBSPACEDOWN to 1, the database server treats all dbspaces as
though they were critical. Any nonmirrored dbspace that becomes disabled halts
normal processing and requires a cold restore. The performance impact of halting
and performing a cold restore when any dbspace goes down can be severe.
When you initialize or restart the database server, it creates the number of
logical-log files that you specify in the LOGFILES configuration parameter.
If all of your logical log files are the same size, you can calculate the total space
allocated to the logical log files.
To calculate the space allocated to these files, use the following formula:
total logical log space = LOGFILES * LOGSIZE
If you add logical-log files that are not the size specified by the LOGSIZE
configuration parameter, you cannot use the LOGFILES * LOGSIZE expression to
calculate the size of the logical log. Instead, you need to add the sizes for each
individual log file on disk.
The size of the logical log space (LOGFILES * LOGSIZE) is determined by these
policies:
Recovery time objective (RTO)
This is the length of time you can afford to be without your systems. If
your only objective is failure recovery, the total log space only needs to be
large enough to contain all the transactions for two checkpoint cycles.
When the RTO_SERVER_RESTART configuration parameter is enabled and
the server has a combined buffer pool size of less than four gigabytes, you
can configure the total log space to 110% of the combined buffer pool sizes.
Too much log space does not impact performance; however, too little log
space can cause more frequent checkpoints and transaction blocking.
Recovery point objective (RPO)
This describes the age of the data you want to restore in the event of a
disaster. If the objective is to make sure transactional work is protected, the
optimum LOGSIZE should be a multiple of how much work gets done per
RPO unit. Because the database server supports partial log backup, an
optimal log size is not critical and a non-optimal log size simply means
more frequent log file changes. RPO is measured in units of time. If the
business rule is that the system cannot lose more than ten minutes of
transactional data if a complete site disaster occurs, then a log backup
should occur every ten minutes.
You can use the Scheduler, which manages and executes scheduled
administrative tasks, to set up automatic log backup.
Long transactions
If you have long transactions that require a large amount of log space, you
should allocate that space for the logs. Inadequate log space impacts
transaction performance.
Choose a log size based on how much logging activity occurs and the amount of
risk in case of catastrophic failure. If you cannot afford to lose more than an hour's
worth of data, create many small log files that each hold an hour's worth of
transactions. Turn on continuous-log backup. Small logical-log files fill sooner,
which means more frequent logical-log backups.
The backup process can hinder transaction processing that involves data located on
the same disk as the logical-log files. If enough logical-log disk space is available,
however, you can wait for periods of low user activity before you back up the
logical-log files.
Related concepts:
The Scheduler (Administrator's Guide)
Related reference:
LOGSIZE configuration parameter (Administrator's Reference)
Use the following formula to obtain an initial estimate for LOGSIZE in kilobytes:
LOGSIZE = ((connections * maxrows * rowsize) / 1024) / LOGFILES
In this formula:
v connections is the maximum number of connections for all network types
specified in the sqlhosts information by one or more NETTYPE parameters. If
you configured more than one connection by setting multiple NETTYPE
configuration parameters in your configuration file, sum the users fields for each
NETTYPE parameter, and substitute this total for connections in the preceding
formula.
v maxrows is the largest number of rows to be updated in a single transaction.
v rowsize is the average size of a row in bytes. You can calculate rowsize by adding
up the length (from the syscolumns system catalog table) of the columns in a
row.
v 1024 is a necessary divisor because you specify LOGSIZE in kilobytes.
To obtain a better estimate during peak activity periods, execute the onstat -u
command. The last line of the onstat -u output contains the maximum number of
concurrent connections.
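For example, applying the preceding formula with illustrative values of 200
connections, a maximum of 1000 rows updated in a single transaction, an average
row size of 200 bytes, and 10 log files yields an initial estimate of approximately
3907 kilobytes:
LOGSIZE = ((200 * 1000 * 200) / 1024) / 10
        = 39,062.5 / 10
        = approximately 3907 kilobytes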
You need to adjust the size of the logical log when your transactions include
simple large objects or smart large objects, as the following sections describe.
You also can increase the amount of space devoted to the logical log by adding
another logical-log file.
Related tasks:
Adding logical-log files manually (Administrator's Guide)
To obtain better overall performance for applications that perform frequent updates
of TEXT or BYTE data in blobspaces, reduce the size of the logical log.
When you use volatile blobpages in blobspaces, smaller logs can improve access to
simple large objects that must be reused. Simple large objects cannot be reused
until the log in which they are allocated is flushed to disk. In this case, you can
justify the cost in performance because those smaller log files are backed up more
frequently.
If you plan to log smart-large-object user data, you must ensure that the log size is
considerably larger than the amount of data being written. Smart-large-object
metadata is always logged even if the smart large objects are not logged.
Use the following guidelines when you log smart large objects:
v If you are appending data to a smart large object, the increased logging activity
is roughly equal to the amount of data written to the smart large object.
v If you are updating a smart large object (overwriting data), the increased logging
activity is roughly twice the amount of data written to the smart large object.
The database server logs both the before-image and after-image of a smart large
object for update transactions. When updating the smart large objects, the
database server logs only the updated parts of the before and after image.
v Metadata updates affect logging less. Even though metadata is always logged,
the number of bytes logged is usually much smaller than the smart large objects.
When you use the default value of 2 for DYNAMIC_LOGS, the database server
determines the location and size of the new logical log for you:
v The database server uses the following criteria to determine on which disk to
allocate the new log file:
– Favor mirrored dbspaces
– Avoid root dbspace until no other critical dbspace is available
– Least favored space is unmirrored and noncritical dbspaces
v The database server uses the average size of the largest log file and the smallest
log file for the size of the new logical log file. If not enough contiguous disk
space is available for this average size, the database server searches for space for
the next smallest average size. The database server allocates a minimum of 200
kilobytes for the new log file.
If you want to control the location and size of the additional log file, set
DYNAMIC_LOGS to 1. When the database server switches log files, it still checks
if the next active log contains an open transaction. If it finds an open
transaction in the next log to be active, it takes the following actions:
v Issues alarm event 27 (log required)
v Writes a warning message to the online log
v Pauses to wait for the administrator to manually add a log with the onparams -a
-i command-line option
You can write a script that executes when alarm event 27 occurs and runs
onparams -a -i with the location that you want to use for the new log. Your script
can also run the onstat -d command to check for adequate space and then run the
onparams -a -i command with a location that has enough space. You must use
the -i option to add the new log right after the current log file.
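The following shell script is a minimal sketch of such a handler; the dbspace
name logdbs and the log size are illustrative, and the script assumes the standard
ALARMPROGRAM calling convention, in which the second argument is the event
class ID:
#!/bin/sh
# $1=severity $2=class ID $3=class message $4=specific message $5=see-also file
LOGDBS=logdbs          # dbspace reserved for logical logs (assumed)
LOGSIZE_KB=10000       # size of the new log file, in KB (assumed)
if [ "$2" -eq 27 ]; then
    # Event class 27: log required. Add a log file immediately
    # after the current log file (-i).
    onparams -a -d $LOGDBS -s $LOGSIZE_KB -i
fi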
If you set DYNAMIC_LOGS to 0, the database server still checks whether the next
active log contains an open transaction when it switches log files. If it finds an
open transaction in the next log to be active, it issues the following warning:
WARNING: The oldest logical log file (%d) contains records
from an open transaction (0x%p), but the Dynamic Log
Files feature is turned off.
Related concepts:
Fast recovery (Administrator's Guide)
Related reference:
DYNAMIC_LOGS configuration parameter (Administrator's Reference)
The LTXHWM parameter still indicates how full the logical log is when the
database server starts to check for a possible long transaction and to roll it back.
LTXEHWM still indicates the point at which the database server suspends new
transaction activity so that the long-transaction rollback has exclusive access to
the log.
Under normal operations, use the default values for LTXHWM and LTXEHWM.
However, you might want to change these default values for one of the following
reasons:
v To allow other transactions to continue update activity (which requires access to
the log) during the rollback of a long transaction
In this case, you increase the value of LTXEHWM to raise the point at which the
long transaction rollback has exclusive access to the log.
v To perform scheduled transactions of unknown length, such as large loads that
are logged
In this case, you increase the value of LTXHWM so that the transaction has a
chance to complete before reaching the high watermark.
Related reference:
LTXEHWM configuration parameter (Administrator's Reference)
LTXHWM configuration parameter (Administrator's Reference)
If the sqlexec thread cannot find the available pages that it needs, the thread
initiates a foreground write and waits for pages to be freed. Foreground writes
impair performance, so you should avoid them. To reduce the frequency of
foreground writes, increase the number of page cleaners or decrease the threshold
for triggering a page cleaning.
If you increase the number of LRU queues, you must increase the number of
page-cleaner threads proportionally.
Related reference:
CLEANERS configuration parameter (Administrator's Reference)
For a single-processor system, set the lrus field of the BUFFERPOOL configuration
parameter to a minimum of 4. For multiprocessor systems, set the lrus field to a
minimum of 4 or to the number of CPU VPs, whichever is greater.
The lrus, lru_max_dirty, and lru_min_dirty values control how often pages are
flushed to disk between checkpoints. Automatic LRU tuning, as set by the
AUTO_LRU configuration parameter, affects all buffer pools and adjusts the
lru_min_dirty and lru_max_dirty values in the BUFFERPOOL configuration
parameter.
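For example, a BUFFERPOOL setting like the following (all values illustrative)
configures eight LRU queues and the default flushing thresholds for the
2-kilobyte buffer pool:
BUFFERPOOL size=2K,buffers=100000,lrus=8,lru_min_dirty=50.00,lru_max_dirty=60.00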
When the buffer pool is very large and transaction blocking is occurring during
checkpoint processing, look in the message log to determine which resource is
triggering transaction blocking. If the physical or logical log is critically low and
triggers transaction blocking, increase the size of the resource that is causing the
transaction blocking. If you cannot increase the size of the resource, consider
making LRU flushing more aggressive by decreasing the lru_min_dirty and
lru_max_dirty settings so that the server has fewer pages to flush to disk during
checkpoint processing.
To monitor the percentage of dirty pages in LRU queues, use the onstat -R
command. When the number of dirty pages consistently exceeds the lru_max_dirty
limit, you have too few LRU queues or too few page cleaners. First, use the
BUFFERPOOL configuration parameter to increase the number of LRU queues. If
the percentage of dirty pages still exceeds the lru_max_dirty limit, update the
CLEANERS configuration parameter to increase the number of page cleaners.
Related concepts:
“The BUFFERPOOL configuration parameter and memory utilization” on page
4-10
Number of LRU queues to configure (Administrator's Guide)
Related reference:
BUFFERPOOL configuration parameter (Administrator's Reference)
The following configuration parameters affect backup and restore on all operating
systems:
v BAR_MAX_BACKUP
v BAR_NB_XPORT_COUNT
v BAR_PROGRESS_FREQ
v BAR_XFER_BUF_SIZE
TAPEDEV specifies the tape device. TAPESIZE specifies the tape size for these
backups.
Related reference:
ON-Bar and ontape configuration parameters and environment variable
(Backup and Restore Guide)
The database server removes the plog_extend.servernum file when the first
checkpoint is performed during a fast recovery.
Related reference:
PLOG_OVERFLOW_PATH configuration parameter (Administrator's Reference)
The DRTIMEOUT configuration parameter specifies the interval for which either
database server waits for a transfer acknowledgment from the other. If the primary
database server does not receive the expected acknowledgment, it adds the
transaction information to the file named in the DRLOSTFOUND configuration
parameter. If the secondary database server receives no acknowledgment, it
changes the data-replication mode as the DRAUTO configuration parameter
specifies.
Related concepts:
Replication of primary-server data to secondary servers (Administrator's
Guide)
Fully synchronous mode for HDR replication (Administrator's Guide)
Nearly synchronous mode for HDR replication (Administrator's Guide)
Asynchronous mode for HDR replication (Administrator's Guide)
Related reference:
DRINTERVAL configuration parameter (Administrator's Reference)
DRTIMEOUT configuration parameter (Administrator's Reference)
DRLOSTFOUND configuration parameter (Administrator's Reference)
DRAUTO configuration parameter (Administrator's Reference)
HDR_TXN_SCOPE configuration parameter (Administrator's Reference)
onstat -g dri command: Print high-availability data replication information
(Administrator's Reference)
LRU tuning
The LRU settings for flushing each buffer pool between checkpoints are not critical
to checkpoint performance. The LRU settings are necessary only for maintaining
enough clean pages for page replacement.
The default settings for LRU flushing are 50 percent for lru_min_dirty and 60
percent for lru_max_dirty.
If your database server has been configured for more aggressive LRU flushing
because of checkpoint performance, you can decrease the LRU flushing at least to
the default values.
When the database server starts, LRU flushing is reset to the values contained in
the ONCONFIG file.
Issues include:
v Table placement on disk to increase throughput and reduce contention
v Space estimates for tables, blobpages, sbspaces, and extents
v Changes to tables that add or delete historical data
v Denormalization of the database to reduce overhead
Tables that the database server supports reside on one or more portions of a disk
or disks. You control the placement of a table on disk when you create it by
assigning it to a dbspace. A dbspace consists of one or more chunks. Each chunk
corresponds to all or part of a disk partition. When you assign chunks to dbspaces,
you make the disk space in those chunks available for storing tables or table
fragments.
When you configure chunks and allocate them to dbspaces, you must relate the
size of the dbspaces to the tables or fragments that each dbspace is to contain. To
estimate the size of a table, follow the instructions in “Estimating table size” on
page 6-5.
The database administrator (DBA) who is responsible for creating a table assigns
that table to a dbspace in one of the following ways:
v By using the IN DBSPACE clause of the CREATE TABLE statement
v By using the dbspace of the current database
The most recent DATABASE or CONNECT statement that the DBA issues before
issuing the CREATE TABLE statement sets the current database.
The DBA can fragment a table across multiple dbspaces, as described in “Planning
a fragmentation strategy” on page 9-1, or use the ALTER FRAGMENT statement to
move a table to another dbspace. The ALTER FRAGMENT statement provides the
simplest method for altering the placement of a table. However, the table is
unavailable while the database server processes the alteration. Schedule the
movement of a table or fragment at a time that affects the fewest users.
Moving tables between databases with LOAD and UNLOAD, onload and
onunload, or HPL involves periods in which data from the table is copied to tape
Depending on the size, fragmentation strategy, and indexes that are associated
with a table, it can be faster to unload a table and reload it than to alter
fragmentation. For other tables, it can be faster to alter fragmentation. You can
experiment to determine which method is faster for a table that you want to move
or re-partition.
Related concepts:
The onunload and onload utilities (Migration Guide)
Related tasks:
Moving data with external tables (Administrator's Guide)
Related reference:
ALTER FRAGMENT statement (SQL Syntax)
LOAD statement (SQL Syntax)
UNLOAD statement (SQL Syntax)
CREATE EXTERNAL TABLE Statement (SQL Syntax)
When disk drives have different performance levels, you can put the tables with
the highest use on the fastest drives. Placing two high-use tables on separate disk
devices reduces competition for disk access when the two tables experience
frequent, simultaneous I/O from multiple applications or when joins are formed
between them.
To isolate a high-use table on its own disk device, assign the device to a chunk,
assign that chunk to a dbspace, and then place the table in the dbspace that you
created. Figure 6-1 shows three high-use tables, each in a separate dbspace, placed
on three disks.
The following figure shows the placement of the most frequently accessed data on
partitions close to the middle band of the disk.
Figure 6-2. Disk platter with a high-use table located on the middle partitions
To place high-use tables on the middle partition of the disk, create a raw device
composed of cylinders that reside midway between the spindle and the outer edge
of the disk. (For instructions on how to create a raw device, see the IBM Informix
Administrator's Guide for your operating system.) Allocate a chunk, associating it
with this raw device, as your IBM Informix Administrator's Reference describes. Then
create a dbspace with this same chunk as the initial and only chunk. When you
create a high-use table, place the table in this dbspace.
A dbspace can include multiple chunks, and each chunk can represent a different
disk. The maximum size for a chunk is 4 terabytes. This arrangement allows you
to distribute data in a dbspace over multiple disks. Figure 6-3 shows a dbspace
distributed over three disks.
Figure 6-3. Dbspace three_arms distributed over three disks
Keep your logical logs and the physical log on separate devices to improve
performance by decreasing I/O contention on a single device. The logical and
physical logs are created in the root dbspace when the database server is
initialized. After initialization, you can move them to other dbspaces.
To define several dbspaces for temporary tables and sort files, use onspaces -t.
When you place these dbspaces on different disks and list them in the
DBSPACETEMP configuration parameter, you spread the I/O associated with
temporary tables and sort files across multiple disks, as Figure 6-4 illustrates. You
can list dbspaces that contain regular tables in DBSPACETEMP.
Users can specify their own lists of dbspaces for temporary tables and sort files
with the DBSPACETEMP environment variable. For details, see “Configure
dbspaces for temporary tables and sort files” on page 5-8.
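For example, a command like the following creates one temporary dbspace (the
device path and size are illustrative); listing several such dbspaces in the
DBSPACETEMP configuration parameter spreads temporary-table and sort I/O
across them:
onspaces -c -d tmpdbs1 -t -p /dev/rdsk/tmpdbs1 -o 0 -s 512000
DBSPACETEMP tmpdbs1,tmpdbs2,tmpdbs3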
For a description of size calculations for indexes, see “Estimating index pages” on
page 7-4.
The disk pages allocated to a table are collectively referred to as a tblspace. The
tblspace includes data pages; index pages reside in a separate tblspace. If simple
large objects (TEXT or BYTE data) that are associated with a table are not stored
in a separate blobspace, the pages that hold those simple large objects are also
included in the tblspace.
The tblspace does not correspond to any fixed region within a dbspace. The data
extents and indexes that make up a table can be scattered throughout the dbspace.
The size of a table includes all the pages within the tblspace: data pages and pages
that store simple large objects. Blobpages that are stored in a separate blobspace or
on an optical subsystem are not included in the tblspace and are not counted as
part of the table size.
The following sections describe how to estimate the page count for each type of
page within the tblspace.
Tip: If an appropriate sample table exists, or if you can build a sample table of
realistic size with simulated data, you do not need to make estimates. You can run
oncheck -pt to obtain exact numbers.
Perform the following steps to estimate the size (in pages) of a table with
fixed-length rows.
Important: Although the maximum size of a row that the database server
accepts is approximately 32 kilobytes, performance degrades when a row
exceeds the size of a page. For information about breaking up wide tables for
improved performance, see “Denormalize the data model to improve
performance” on page 6-42.
6. If the size of the row is greater than pageuse, the database server divides the
row between pages. The page that contains the initial portion of a row is called
the home page. Pages that contain subsequent portions of a row are called
remainder pages. If a row spans more than two pages, some of the remainder
pages are completely filled with data from that row. When the trailing portion
of a row uses less than a page, it can be combined with the trailing portions of
other rows to fill out the partial remainder page. The number of data pages is
the sum of the home pages, the full remainder pages, and the partial remainder
pages.
a. Calculate the number of home pages.
The number of home pages is the same as the number of rows:
homepages = rows
b. Calculate the number of full remainder pages.
First calculate the size of the row remainder with the following formula:
remsize = rowsize - (pageuse + 8)
If remsize is less than pageuse - 4, you have no full remainder pages.
If remsize is greater than pageuse - 4, use remsize in the following formula to
obtain the number of full remainder pages:
fullrempages = rows * trunc(remsize/(pageuse - 8))
c. Calculate the number of partial remainder pages.
First calculate the size of a partial row remainder left after you have
accounted for the home and full remainder pages for an individual row. In
the following formula, the remainder() function notation indicates that you
are to take the remainder after division:
When a table contains one or more VARCHAR or NVARCHAR columns, its rows
can have varying lengths. These varying lengths introduce uncertainty into the
calculations. You must form an estimate of the typical size of each VARCHAR
column, based on your understanding of the data, and use that value when you
make the estimates.
Important: When the database server allocates space to rows of varying size, it
considers a page to be full when no room exists for an additional row of the
maximum size.
To estimate the size of a table with variable-length rows, you must make the
following estimates and choose a value between them, based on your
understanding of the data:
v The maximum size of the table, which you calculate based on the maximum
width allowed for all VARCHAR or NVARCHAR columns
v The projected size of the table, which you calculate based on a typical width for
each VARCHAR or NVARCHAR column
Based on your knowledge of the data, choose a value within that range that seems
most reasonable to you. The less familiar you are with the data, the more
conservative (higher) your estimate should be.
The blobpages can reside in either the dbspace where the table resides or in a
blobspace. For more information about when to use a blobspace, see “Storing
simple large objects in the tblspace or a separate blobspace.”
Alternatively, you can base your estimate on the median size of simple large
objects (TEXT or BYTE data); that is, the simple-large-object data size that occurs
most frequently. This method is less precise, but it is easier to calculate.
To estimate the number of blobpages based on the median size of simple large
objects:
1. Calculate the number of pages required for simple large objects of median size,
as follows:
mpages = ceiling(mblobsize/bpuse)
2. Multiply this amount by the total number of simple large objects, as follows:
blobpages = blobcount * mpages
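For example, suppose the median simple-large-object size is 10,000 bytes, each
blobpage holds 6000 usable bytes, and the table contains 5000 simple large objects
(all values illustrative):
mpages    = ceiling(10000/6000) = 2
blobpages = 5000 * 2 = 10,000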
You can also store simple large objects on optical media, but this discussion does
not apply to simple large objects stored in this way.
In the following example, a TEXT value is stored in the tblspace, and a BYTE value
is stored in a blobspace named rasters:
CREATE TABLE examptab
(
pic_id SERIAL,
pic_desc TEXT IN TABLE,
pic_raster BYTE IN rasters
)
A TEXT or BYTE value is always stored apart from the rows of the table; only a
56-byte descriptor is stored with the row. However, a simple large object occupies
at least one disk page. The simple large object to which the descriptor points can
reside in the same set of extents on disk as the table rows (in the same tblspace) or
in a separate blobspace.
When simple large objects are stored in the tblspace, the pages of their data are
interspersed among the pages that contain rows, which can greatly increase the
size of the table. When the database server reads only the rows and not the simple
large objects, the disk arm must move farther than when the blobpages are stored
apart. The database server scans only the row pages in the following situations:
v When it performs any SELECT operation that does not retrieve a
simple-large-object column
v When it uses a filter expression to test rows
Another consideration is that disk I/O to and from a dbspace is buffered in shared
memory of the database server. Pages are stored in case they are needed again
soon, and when pages are written, the requesting program can continue before the
actual disk write takes place. However, because blobspace data is expected to be
voluminous, disk I/O to and from blobspaces is not buffered, and the requesting
program is not allowed to proceed until all output has been written to the
blobspace.
Managing the size of first and next extents for the tblspace tblspace
The tblspace tblspace is a collection of pages that describe the location and
structure of all tblspaces in a dbspace. Each dbspace has one tblspace tblspace.
When you create a dbspace, you can use the TBLTBLFIRST and TBLTBLNEXT
configuration parameters to specify the first and next extent sizes for the tblspace
tblspace in a root dbspace.
You can use the onspaces utility to specify the initial and next extent sizes for the
tblspace tblspace in non-root dbspaces.
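For example, the following onspaces command creates a dbspace with a
2000-kilobyte first extent and a 1000-kilobyte next extent for the tblspace tblspace;
the device path and sizes are illustrative:
onspaces -c -d dbs1 -p /dev/rdsk/dbs1 -o 0 -s 1024000 -ef 2000 -en 1000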
Specify the initial and next extent sizes if you want to reduce the number of
tblspace tblspace extents and reduce the frequency of situations when you need to
place the tblspace tblspace extents in non-primary chunks.
The ability to specify a first extent size that is larger than the default provides
flexibility for managing space. By reserving space for the tblspace tblspace when
you create the dbspace, you decrease the risk that additional extents must later
be created in chunks that are not initial chunks.
You can only specify the first and next extent sizes when you create a dbspace. You
cannot alter the specification of the first and next extents sizes after the creation of
the dbspace. In addition, you cannot specify extent sizes for temporary dbspaces,
sbspaces, blobspaces, or external spaces.
If you do not specify first and next extent sizes for the tblspace tblspace, Informix
uses the existing default extent sizes.
Related tasks:
Specifying the first and next extent sizes for the tblspace tblspace
(Administrator's Guide)
Related reference:
TBLTBLFIRST configuration parameter (Administrator's Reference)
TBLTBLNEXT configuration parameter (Administrator's Reference)
Managing sbspaces
An sbspace is a logical storage unit composed of one or more chunks that store
smart large objects. You can estimate the amount of storage needed for smart large
objects, improve metadata I/O, monitor sbspaces, and change storage
characteristics.
If you add a chunk to the sbspace after the initial allocation, you can take one of
the following actions for metadata space:
v Allocate another metadata area on the new chunk by default.
This action provides the following advantages:
– It is easier, because the database server automatically calculates and allocates
a new metadata area on the added chunk, based on the average
smart-large-object size.
– It distributes I/O operations on the metadata area across multiple disks.
v Use the existing metadata area
If you specify the onspaces -U option, the database server does not allocate
metadata space in the new chunk. Instead it must use a metadata area in one of
the other chunks.
In addition, the database server reserves 40 percent of the user area to be used in
case the metadata area runs out of space. Therefore, if the allocated metadata
becomes full, the database server starts using this reserved space in the user area
for additional control information.
You can let the database server calculate the size of the metadata area for you on
the initial chunk and on each added chunk. However, you might want to specify
the size of the metadata area explicitly, to ensure that the sbspace does not run
out of metadata space and use up the 40 percent reserve area. You can use one of the
following methods to explicitly specify the amount of metadata space to allocate:
v Specify the AVG_LO_SIZE tag on the onspaces -Df option.
The database server uses this value to calculate the size of the metadata area to
allocate when the -Ms option is not specified. If you do not specify
AVG_LO_SIZE, the database server uses the default value of 8 kilobytes to
calculate the size of the metadata area.
v Specify the metadata area size in the -Ms option of the onspaces utility.
Use the procedure that “Sizing the metadata area manually for a new chunk”
describes to estimate a value to specify in the onspaces -Ms option.
The following procedure assumes that you know the sbspace size and need to
allocate more metadata space.
This topic contains an example showing how to estimate the metadata size
required for two sbspaces chunks.
Suppose the Metadata size field in the onstat -d option shows that the current
metadata area is 1000 pages. If the system page size is 2048 bytes, the size of this
metadata area is 2000 kilobytes, as the following calculation shows:
current metadata = (metadata_size * pagesize) / 1024
= (1000 * 2048) / 1024
= 2000 kilobytes
Suppose you expect 31,000 smart large objects in the two sbspace chunks. The
following formula calculates the total size of metadata area required for both
chunks, rounding up fractions:
Total metadata = (LOcount*570)/1024 + (numchunks*800) + 100
= (31,000 * 570)/1024 + (2*800) + 100
= 17256 + 1600 + 100
= 18956 kilobytes
You can distribute I/O to the metadata pages in one of the following ways:
v Mirror the chunks that contain metadata.
Important: For highest data availability, mirror all sbspace chunks that contain
metadata.
Monitoring sbspaces
You can monitor the effectiveness of I/O operations on smart large objects. For
better I/O performance, all smart large objects should be allocated in one extent to
be contiguous.
For more information about sizing extents, see “Sbspace extents” on page 5-21.
You can use the following command-line utilities to monitor the effectiveness of
I/O operations on smart large objects:
v oncheck -cS, -pe and -pS
v onstat -g smb s option
Figure 6-5 shows an example of the output from the -cS option for s9_sbspc.
The values in the Sbs#, Chk#, and Seq# columns correspond to the Space Chunk
Page value in the -pS output. The Bytes and Pages columns display the size of
each smart large object in bytes and pages.
To calculate the average size of smart large objects, you can total the numbers in
the Size (Bytes) column and then divide by the number of smart large objects. In
Figure 6-5, the average number of bytes allocated is approximately 2690, as the following
calculation shows:
Average size in bytes = (15736 + 98 + 97 + 62 + 87 + 56) / 6
= 16136 / 6
= 2689.3
For information about how to specify smart large object sizes to influence extent
sizes, see “Sbspace extents” on page 5-21.
Large Objects

   ID              Ref  Size       Allocced       Creat
 Sbs# Chk# Seq#    Cnt  (Bytes)    Pages   Extns  Flags  Last Modified
 ---- ---- -----  ----  ---------- ------- -----  -----  ------------------------
    2    2     1     1       15736       8     1  N-N-H  Thu Jun 21 16:59:12 2007
    2    2     2     1          98       1     1  N-K-H  Thu Jun 21 16:59:12 2007
    2    2     3     1          97       1     1  N-K-H  Thu Jun 21 16:59:12 2007
    2    2     4     1          62       1     1  N-K-H  Thu Jun 21 16:59:12 2007
    2    2     5     1          87       1     1  N-K-H  Thu Jun 21 16:59:12 2007
    2    2     6     1          56       1     1  N-K-H  Thu Jun 21 16:59:12 2007

Figure 6-5. oncheck -cS output for s9_sbspc
The Extns field shows the minimum extent size, in number of pages, allocated to
each smart large object.
Execute oncheck -pe to display the following information, which helps you
determine whether the smart large objects occupy contiguous space within an
sbspace:
v The term SBLOBSpace LO, which identifies each smart large object
The three values in brackets following SBLOBSpace LO correspond to the Sbs#,
Chk#, and Seq# columns in the -cS output.
v The offset of each smart large object
v The number of disk pages (not sbpages) that each smart large object uses
Figure 6-6 shows sample output. In this example, the size field shows that the first
smart large object occupies eight pages. Because the offset field shows that the first
smart large object starts at page 53 and the second smart large object starts at page
61, the first smart large object occupies contiguous pages.
Figure 6-6. oncheck -pe output that shows contiguous space use
Figure 6-7 on page 6-16 shows an example of the -pS output for s9_sbspc.
To display information about smart large objects, execute the following command:
oncheck -pS spacename
The oncheck -pS output displays the following information for each smart large
object in the sbspace:
v Space chunk page
v Size in bytes of each smart large object
v Object ID that DataBlade API and Informix ESQL/C functions use
v Storage characteristics of each smart large object
When you use onspaces -c -S to create an sbspace, you can use the -Df option to
specify various storage characteristics for the smart large objects. You can use
onspaces -ch to change attributes after the sbspace is created. The Create Flags
field in the oncheck -pS output displays these storage characteristics and other
attributes of each smart large object. In Figure 6-7 on page 6-16, the Create Flags
field shows LO_LOG because the LOGGING tag was set to ON in the -Df option.
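For example, the following commands sketch how you might create an sbspace
with an average smart-large-object size of 32 kilobytes and logging turned on, and
later turn logging off; the device path and sizes are illustrative:
onspaces -c -S s9_sbspc -p /dev/rdsk/s9_sbspc -o 0 -s 200000
   -Df "AVG_LO_SIZE=32,LOGGING=ON"
onspaces -ch s9_sbspc -Df "LOGGING=OFF"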
Use the onstat -g smb s option to display the following characteristics that affect
the I/O performance of each sbspace:
v Logging status
If applications are updating temporary smart large objects, logging is not
required. You can turn off logging to reduce the amount of I/O activity to the
logical log, CPU utilization, and memory resources.
v Average smart-large-object size
Average size and extent size should be similar to reduce the number of I/O
operations required to read in an entire smart large object. The avg s/kb output
field shows the average smart-large-object size in kilobytes. In Figure 6-8 on
page 6-17, the avg s/kb output field shows the value 30 kilobytes.
Specify the final size of the smart large object in either of the following functions
to allocate the object as a single extent:
– The DataBlade API mi_lo_specset_estbytes function
– The Informix ESQL/C ifx_lo_specset_estbytes function
For more information about the functions to open a smart large object and to set
the estimated number of bytes, see the IBM Informix ESQL/C Programmer's
Manual and IBM Informix DataBlade API Programmer's Guide.
v First extent size, next extent size, and minimum extent size
The 1st sz/p, nxt sz/p, and min sz/p output fields show these extent sizes if you
set the extent tags in the -Df option of onspaces. In Figure 6-8 on page 6-17,
these output fields show values of 0 and -1 because these tags are not set in
onspaces.
Table 6-1 summarizes the ways to alter the storage characteristics for a smart large
object.
Table 6-1. Altering storage characteristics and other attributes of an sbspace

Storage          System        Specified by -Df  Specified by PUT      Specified by a    Specified by an
characteristic   default       option in the     clause of CREATE      DataBlade API     ESQL/C
or attribute     value         onspaces utility  TABLE or ALTER TABLE  function          function
---------------  ------------  ----------------  --------------------  ----------------  ----------------
Last-access      OFF           ACCESSTIME        KEEP ACCESS TIME,     Yes               Yes
time                                             NO KEEP ACCESS TIME
Lock mode        BLOB          LOCK_MODE         No                    Yes               Yes
Logging status   OFF           LOGGING           LOG, NO LOG           Yes               Yes
Data integrity   HIGH INTEG    No                HIGH INTEG,           Yes               No
                                                 MODERATE INTEG
Size of extent   None          EXTENT_SIZE       EXTENT SIZE           Yes               Yes
Size of next     None          NEXT_SIZE         No                    No                No
extent
Minimum extent   2 kilobytes   MIN_EXT_SIZE      No                    No                No
size             on Windows,
                 4 kilobytes
                 on UNIX
Size of smart    8 kilobytes   AVG_LO_SIZE       No                    Estimated or      Estimated or
large object                   (average size                           maximum size of   maximum size of
                               of all smart                            a particular      a particular
                               large objects                           smart large       smart large
                               in the sbspace)                         object            object
Buffer pool      ON            BUFFERING         No                    LO_BUFFER and     LO_BUFFER and
usage                                                                  LO_NOBUFFER       LO_NOBUFFER
                                                                       flags             flags
Name of          SBSPACENAME   Not in the -Df    PUT ... IN clause,    Yes               Yes
sbspace                        option; name      naming an existing
                               specified in the  sbspace in which a
                               onspaces -S       smart large object
                               option            resides
Fragmentation    None          No                Round-robin           Round-robin or    Round-robin or
across multiple                                  distribution          expression-based  expression-based
sbspaces                                         scheme:               distribution      distribution
                                                 PUT ... IN clause     scheme            scheme
Later, you can use the PUT clause of the ALTER TABLE statement to change the
optional storage characteristics of these columns. Table 6-1 on page 6-18 shows
which characteristics and attributes you can change.
You can use the PUT clause of the ALTER TABLE statement to perform the
following actions:
v Specify the smart-large-object characteristics and storage location when you add
a new column to a table.
The smart large objects in the new columns can have different characteristics
than those in the existing columns.
v Change the smart-large-object characteristics of an existing column.
The new characteristics of the column apply only to new smart large objects
created for that column. The characteristics of existing smart large objects remain
the same.
For example, the BLOB data in the catalog table in the superstores_demo database
is stored in s9_sbspc with logging turned off and has an extent size of 100
kilobytes. You can use the PUT clause of the ALTER TABLE statement to turn on
logging and store new smart large objects in a different sbspace.
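The following statement sketches this change; the column name cat_photo and
the sbspace name s10_sbspc are hypothetical:
ALTER TABLE catalog
   PUT cat_photo IN (s10_sbspc) (EXTENT SIZE 100, LOG);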
For information about changing sbspace extents with the CREATE TABLE
statement, see “Extent sizes for smart large objects in sbspaces” on page 6-22.
Related reference:
Sbspace logging (Administrator's Guide)
CREATE TABLE statement (SQL Syntax)
Managing extents
As you add rows to a table, the database server allocates disk space in units called
extents. Each extent is a block of physically contiguous pages from the dbspace.
Even when the dbspace includes more than one chunk, each extent is allocated
entirely within a single chunk, so that it remains contiguous.
If you have a table that needs more extents and the database server runs out of
space on the partition header page, the database server automatically allocates
extended secondary partition header pages to accommodate new extent entries.
The database server can allocate up to 32767 extents for any partition, unless the
size of a table dictates a limit to the number of extents.
Because table sizes are not known, the database server cannot preallocate table
space. Therefore, the database server adds extents only as they are needed, but all
the pages in any one extent are contiguous for better performance. In addition,
when the database server creates an extent that is next to the previous one, it treats
both as a single extent.
A frequently updated table can become fragmented over time, which degrades
performance every time the server accesses the table. Defragmenting a table
brings data rows closer together and avoids partition header page overflow
problems.
The following sample SQL statement creates a table with a 512-kilobyte initial
extent and 100-kilobyte added extents:
CREATE TABLE big_one (...column specifications...)
IN big_space
EXTENT SIZE 512
NEXT SIZE 100
The default value for the extent size and the next-extent size is eight times the disk
page size on your system. For example, if you have a 2-kilobyte page, the default
length is 16 kilobytes.
You can use the ALTER TABLE statement with the MODIFY EXTENT SIZE clause
to change the size of the first extent of a table in a dbspace.
You might want to change the size of the first extent of a table in a dbspace in
either of these situations:
v If a table was created with a small first extent size and many next extents must
be added, the table becomes fragmented across multiple extents and the data is
scattered.
v If a table was created with a first extent that is much larger than the amount of
data that is stored, space is wasted.
The following example changes the size of the first extent of a table in a dbspace to
50 kilobytes:
ALTER TABLE customer MODIFY EXTENT SIZE 50;
Changes to the first extent size are recorded into the system catalog table and on
the partition page on the disk. However, changes to the first extent size do not take
effect immediately. Instead, whenever a change that rebuilds the table occurs, the
server uses the new first extent size.
For example, if a table has a first extent size of 8 kilobytes and you use the ALTER
TABLE statement to change this to 16 kilobytes, the server does not drop the
current first extent and recreate it with the new size. Instead, the new first extent
size of 16 kilobytes takes effect only when the server rebuilds the table after actions
such as creating a cluster index on the table or detaching a fragment from the
table.
Use the MODIFY NEXT SIZE clause to change the size of the next extent to be
added. This change does not affect next extents that already exist.
The following example changes the size of the next extent of a table to 50
kilobytes:
ALTER TABLE big_one MODIFY NEXT SIZE 50;
The next extent sizes of the following kinds of tables do not affect performance
significantly:
v A small table is defined as a table that has only one extent. If such a table is
heavily used, large parts of it remain buffered in memory.
v An infrequently used table is not important to performance no matter what size
it is.
v A table that resides in a dedicated dbspace always receives new extents that are
adjacent to its old extents. The size of these extents is not important because,
being adjacent, they perform as one large extent.
When you assign an extent size to these kinds of tables, the only consideration is
to avoid creating large numbers of extents. A large number of extents causes the
database server to spend extra time finding the data. In addition, an upper limit
exists on the number of extents allowed for a partition.
No upper limit exists on extent sizes except the size of the chunk. The maximum
size for a chunk is 4 terabytes. When you know the final size of a table (or can
confidently predict it within 25 percent), allocate all its space in the initial extent.
When tables grow steadily to unknown size, assign them next-extent sizes that let
them share the dbspace with a small number of extents each.
As the dbspace fills up, you might not have enough contiguous space to create an
extent of the specified size. In this case, the database server allocates the largest
contiguous extent that it can.
Related reference:
TBLTBLFIRST configuration parameter (Administrator's Reference)
TBLTBLNEXT configuration parameter (Administrator's Reference)
MODIFY EXTENT SIZE (SQL Syntax)
If the unfragmented table was defined with a large next-extent size, the database
server uses that same size for the next-extent on each fragment, which results in
over-allocation of disk space. Each fragment requires only a proportion of the
space for the entire table.
For example, if you fragment the preceding big_one sample table across five disks,
you can alter the next-extent size to one-fifth of the original size, as the following
example shows:
ALTER TABLE big_one MODIFY NEXT SIZE 20;
Related reference:
MODIFY NEXT SIZE clause (SQL Syntax)
For more information, see “Sbspace extents” on page 5-21 and “Monitoring
sbspaces” on page 6-13.
Output from the onstat -t option includes the tblspace number and the following
four fields.
Field    Description
npages   Pages allocated to the tblspace
nused    Pages used from this allocated pool
nextns   Number of extents used
npdata   Number of data pages used
If a specific operation needs more pages than are available (npages minus nused),
a new extent is required. If enough space is available in this chunk, the database
server allocates the extent here; if not, the database server looks for space in other
available chunks. If none of the chunks contains adequate contiguous space, the
database server uses the largest block of contiguous space that it can find in the
dbspace. Figure 6-9 shows an example of the output from this option.
Tblspaces
 n    address flgs ucnt tblnum physaddr npages nused npdata nrows nextns
 0    422528  1    1    100001 10000e   150    124   0      0     3
 1    422640  1    1    200001 200004   50     36    0      0     1
 54   426038  1    6    100035 1008ac   3650   3631  3158   60000 3
 62   4268f8  1    6    100034 1008ab   8      6     4      60    1
 63   426a10  3    6    100036 1008ad   368    365   19     612   3
 64   426b28  1    6    100033 1008aa   8      3     1      6     1
 193  42f840  1    6    10001b 100028   8      5     2      30    1
 7 active, 200 total, 64 hash buckets
To help ensure that the limit is not exceeded, the database server performs the
following actions:
v The database server checks the number of extents each time that it creates an
extent. If the number of the extent being created is a multiple of 16, the database
server automatically doubles the next-extent size for the table. Therefore, at
every 16th creation, the database server doubles the next-extent size.
v When the database server creates an extent next to the previous extent, it treats
both extents as a single extent.
Interleaving creates gaps between the extents of a table. Figure 6-10 shows gaps
between table extents.
Figure 6-10. Gaps between the extents of three tables
Try to optimize the table-extent sizes to allocate contiguous disk space, which
limits head movement. Also consider placing the tables in separate dbspaces.
This output is useful for determining the degree of extent interleaving. If the
database server cannot allocate an extent in a chunk despite an adequate number
of free pages, the chunk might be badly interleaved.
You can rebuild a dbspace to eliminate interleaved extents so that the extents for
each table are contiguous.
The order of the reorganized tables within the dbspace is not important, but the
pages of each reorganized table should be contiguous so that no lengthy seeks are
required to read the table sequentially. When the disk arm reads a table
nonsequentially, it ranges only over the space that table occupies.
Figure 6-11. Reorganized tables with contiguous extents
The LOAD statement re-creates the tables with the same properties they had
before, including the same extent sizes.
You can also unload a table with the onunload utility and reload the table with the
companion onload utility.
Related concepts:
The onunload and onload utilities (Migration Guide)
Related reference:
LOAD statement (SQL Syntax)
UNLOAD statement (SQL Syntax)
The TO CLUSTER clause reorders rows in the physical table to match the order in
the index. For more information, see “Clustering” on page 7-11.
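For example, the following statement clusters the table on an existing index; the
index name is hypothetical:
ALTER INDEX cust_num_ix TO CLUSTER;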
To display the location and size of the blocks of free space, execute the oncheck
-pe command.
You do not need to drop an index before you cluster it. However, the ALTER
INDEX process is faster than CREATE INDEX because the database server reads
the data rows in cluster order using the index. In addition, the resulting indexes
are more compact.
To prevent the problem from recurring, consider increasing the size of the tblspace
extents.
If you use the ALTER TABLE statement to add or drop a column or to change the
data type of a column, the database server copies and reconstructs the table. When
the database server reconstructs the entire table, it rewrites the table to other areas
of the dbspace. However, if other tables are in the dbspace, no guarantee exists
that the new extents will be adjacent to each other.
Important: For certain types of operations that you specify in the ADD, DROP,
and MODIFY clauses, the database server does not copy and reconstruct the table
during the ALTER TABLE operation. In these cases, the database server uses an
in-place alter algorithm to modify each row when it is updated (rather than during
the ALTER TABLE operation). For more information about the conditions for this
in-place alter algorithm, see “In-place alter” on page 6-35.
Important: When you delete rows in a table, the database server reuses that space
to insert new rows into the same table. This section describes the procedures for
reclaiming unused space for use by other tables.
You might want to resize a table that does not require the entire amount of space
that was originally allocated to it. You can reallocate a smaller dbspace and release
the unneeded space for other tables to use.
When you run the ALTER INDEX statement with the TO CLUSTER clause, all of
the extents associated with the previous version of the table are released. Also, the
newly built version of the table has no empty extents.
Related concepts:
Clustering (Performance Guide)
Related reference:
ALTER INDEX statement (SQL Syntax)
For more information about the syntax of the ALTER FRAGMENT statement, see
the IBM Informix Guide to SQL: Syntax.
For more information about using TRUNCATE, see the IBM Informix Guide to SQL:
Syntax.
A frequently updated table can become fragmented over time, which degrades
performance every time the server accesses the table. Defragmenting a table
brings data rows closer together and avoids partition header page overflow
problems.
Defragmenting an index brings the entries closer together which improves the
speed at which the table information is accessed.
You cannot stop a defragment request after the request has been submitted.
Additionally, certain objects cannot be defragmented, and you cannot defragment
a partition while another operation that conflicts with the defragment request is
running.
You can use the onstat -g defragment command to display information about the
active defragment requests.
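For example, you can submit a defragment request through the SQL
administration API. The following sketch assumes a table named mytable, owned
by user informix, in the stores_demo database; substitute your own qualified
table name:
DATABASE sysadmin;
EXECUTE FUNCTION task("defragment", "stores_demo:informix.mytable");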
Related tasks:
Scheduling data optimization (Administrator's Guide)
Related reference:
onstat -g defragment command: Print defragment partition extents
(Administrator's Reference)
oncheck -pt and -pT: Display tblspaces for a Table or Fragment
(Administrator's Reference)
defragment argument: Dynamically defragment partition extents (SQL
administration API) (Administrator's Reference)
You can also use this feature to improve query performance, compared with
storing each fragment in a different dbspace, when the single dbspace is located
on a faster device.
For more information, see information about managing partitions in the IBM
Informix Administrator's Guide.
For an example of onstat -g opn output and an explanation of output fields, see
the IBM Informix Administrator's Reference.
You can use one or more of the following methods to load large tables quickly:
v External tables
v Nonlogging tables
The database server provides support to:
– Create nonlogging or logging tables in a logging database.
– Alter a table from nonlogging to logging and vice versa.
The two table types are STANDARD (logging tables) and RAW (nonlogging
tables). You can use any loading utility such as dbimport or HPL to load raw
tables.
v High-Performance Loader (HPL)
You can use HPL in express mode to load tables quickly.
OLTP applications usually use standard tables. OLTP applications typically have
the following characteristics:
v Real-time insert, update, and delete transactions
Logging and recovery of these transactions is critical to preserve the data.
Locking is critical to allow concurrent access and to ensure the consistency of the
data selected.
v Update, insert, or delete one row or a few rows at a time
Indexes speed access to these rows. An index requires only a few I/O operations
to access the pertinent row, but scanning a table to find the pertinent row might
require many I/O operations.
You can change a large, existing standard table into a nonlogging table and then
load the table.
You quickly create a new nonlogging table and load the table.
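For example, the following sketch (the table name is illustrative) changes an
existing standard table to a raw table, loads data, and then restores logging.
After you return the table to STANDARD type, perform a level-0 backup before
you use the table in transactions:
ALTER TABLE big_sales TYPE (RAW);
-- load the data with dbload, HPL, or an external table
ALTER TABLE big_sales TYPE (STANDARD);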
For more information about when to drop indexes, see “Nonunique indexes” on
page 7-12.
If you cannot guarantee that the loaded data satisfies all unique constraints, you
must create unique indexes before you load the rows. You save time if the rows
are presented in the correct sequence for at least one of the indexes. If you have a
choice, make it the index with the largest key. This strategy minimizes the number
of leaf pages that must be read and written.
When you use the ALTER TABLE ADD CONSTRAINT or ALTER TABLE MODIFY
statement to define a foreign-key constraint on an existing table, you might be able
to reduce the time required to validate the new foreign-key constraint if the
referenced table already has a unique index or a primary-key constraint on the
columns that correspond to the key of the foreign-key constraint. When it creates a
foreign-key constraint on a table that already contains data, the database server
can validate the constraint by scanning only the index key values, rather than
every row of the table.
For large tables, scanning only the index values can provide substantial
performance improvement, unless one of the following requirements is not
satisfied:
v The ALTER TABLE statement is creating only one foreign-key constraint.
v The ALTER TABLE statement is not also creating or enabling a CHECK
constraint.
v The ALTER TABLE statement is not also changing the data type of any existing
column in the table.
v The foreign-key columns do not include user-defined data types (UDTs) or
built-in opaque data types.
v The new mode of the foreign-key constraint is not DISABLED.
v The table is not associated with an active violation table.
Except in the case of one or more violating rows, the ALTER TABLE ADD
CONSTRAINT or ALTER TABLE MODIFY statement can create and validate a
foreign-key constraint when some of these requirements are not satisfied, but the
database server will not consider using the index-key algorithm to validate the
foreign-key constraint. The additional validation costs to scan the entire table tend
to be proportional to the size of the table.
Unless the table has one or more violating rows, the SET CONSTRAINTS
statement can enable and validate a foreign-key constraint when some of these
requirements are not satisfied, but the database server will not consider using the
index-key algorithm to validate the foreign-key constraint. The additional
validation costs for a full table scan can be substantial for very large tables.
In both the ALTER TABLE and SET CONSTRAINTS operations described above,
the goal was to use a more efficient algorithm for validating the referential
constraint. Greater efficiencies can be achieved, at least temporarily, by postponing
or avoiding the validation of ENABLED or FILTERING foreign-key constraints that
are being created by ALTER TABLE ADD CONSTRAINT statements, or while a
DISABLED foreign-key constraint is being reset to an ENABLED or FILTERING
mode.
Three alternative mechanisms are available for bypassing the validation of enabled
or filtering foreign-key constraints while they are being created, or while they are
being exported, or while their mode is being changed from DISABLED:
v You can include the NOVALIDATE keyword in the constraint mode specification
– of the ALTER TABLE ADD CONSTRAINT statement,
– or of the SET CONSTRAINTS ENABLED statement,
– or of the SET CONSTRAINTS FILTERING WITH ERROR statement,
– or of the SET CONSTRAINTS FILTERING WITHOUT ERROR statements.
v If you plan to run multiple ALTER TABLE ADD CONSTRAINT or SET
CONSTRAINTS statements, run the SET ENVIRONMENT NOVALIDATE ON
statement to disable the validation of foreign-key constraints during the current
session.
Setting this session environment option makes NOVALIDATE the default mode
for enabled or filtering referential constraints while the DDL statement is
running.
v If you are migrating data, include the -nv option in the dbimport command.
The effect of the -nv option is that the constraint modes of any ALTER TABLE
ADD CONSTRAINT or SET CONSTRAINTS statements that create or enable
foreign-key constraints are processed so that the ENABLED, or FILTERING
WITH ERROR, or FILTERING WITHOUT ERROR constraint mode specifications
are instead implemented (respectively) as the ENABLED NOVALIDATE, or
FILTERING WITH ERROR NOVALIDATE, or FILTERING WITHOUT ERROR
NOVALIDATE modes.
In each case, no constraint validation of existing rows occurs during the DDL
statement.
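For example, you can set the session default and then run the DDL statement, or
you can embed the NOVALIDATE keyword directly in the constraint mode
specification (table, column, and constraint names are illustrative):
SET ENVIRONMENT NOVALIDATE ON;

ALTER TABLE child ADD CONSTRAINT
(FOREIGN KEY (cust_num) REFERENCES customer
CONSTRAINT fk_cust ENABLED NOVALIDATE);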
The effect of the NOVALIDATE keyword or of the -nv command-line flag does not
persist outside the operation that created or changed the mode of the foreign-key
constraint. The same constraint enforces referential integrity during subsequent
DELETE, INSERT, MERGE, and UPDATE operations.
If the NOVALIDATE feature is used on a table that might already contain rows
that violate the foreign-key constraint, it is the responsibility of the user to verify
that no violating rows exist in the data.
Slow alter
When the database server uses the slow alter algorithm to process an ALTER
TABLE statement, the table can be unavailable to other users for a long period of
time.
Because the database server makes a copy of the table to convert the table to the
new definition, a slow alter operation requires space at least twice the size of the
original table plus log space.
The database server uses the slow alter algorithm when the ALTER TABLE
statement makes column changes that it cannot perform in place:
v Adding or dropping a column created with the ROWIDS keyword
v Adding or dropping a column created with the REPLCHECK keyword
v Dropping a column of the TEXT or BYTE data type
v Modifying a SMALLINT column to SERIAL, SERIAL8, or BIGSERIAL
v Converting an INT column to SERIAL, SERIAL8, or BIGSERIAL
v Modifying the data type of a column so that some possible values of the old
data type cannot be converted to the new data type (For example, if you modify
a column of data type INTEGER to CHAR(n), the database server uses the slow
alter algorithm if the value of n is less than 11. An INTEGER requires 10
characters plus one for the minus sign for the lowest possible negative values.)
v Modifying the data type of a fragmentation column in a way that value
conversion might cause rows to move to another fragment
v Adding, dropping or modifying any column when the table contains
user-defined data types, smart large objects, or LVARCHAR, SET, MULTISET,
ROW, or COLLECTION data types
v Modifying the original size or reserve specifications of VARCHAR or
NVARCHAR columns
v Adding ERKEY shadow columns
In-place alter
The in-place alter algorithm provides numerous performance advantages over the
slow alter algorithm.
If the check_for_ipa Scheduler task is enabled, each table that has one or more
outstanding in-place alter operations is listed in the ph_alert table in the sysadmin
database. The alert text is: Table database:owner.table_name has outstanding in
place alters. The alert type is informative.
Related reference:
The ph_alert Table (Administrator's Reference)
The database server can use the in-place alter algorithm to process only certain
ADD, DROP, or MODIFY operations of the ALTER TABLE statement, and only if
the table schema or the ALTER TABLE statement does not require a slow alter
algorithm.
The database server can use the in-place alter algorithm in the following ALTER
TABLE operations:
v Add columns of built-in data types, except the data types that are listed in
“Conditions that prevent in-place alter operations” on page 6-38.
v Drop a column of built-in data types, except a column that contains TEXT or
BYTE data types, or a column that was created with the ROWIDS keyword.
v In Enterprise Replication, add or drop a column that is created with the
CRCOLS keyword.
The following table shows the conditions under which the ALTER TABLE MODIFY
statement uses the in-place alter algorithm to convert columns of supported data
types.
Key:
All = The database server uses the in-place alter algorithm for all cases of the
specific column operation.
nf = The database server uses the in-place alter algorithm when the modified
column is not part of the table fragmentation expression.
Table 6-2. MODIFY operations and conditions that use the in-place alter algorithm

Operation on column                                  Condition
Convert a SMALLINT column to an INTEGER column       All
Convert a SMALLINT column to a BIGINT column         All
Convert a SMALLINT column to an INT8 column          All
Convert a SMALLINT column to a DEC(p2,s2) column     p2-s2 >= 5
Convert a SMALLINT column to a DEC(p2) column        p2 >= 5 OR nf
Convert a SMALLINT column to a SMALLFLOAT column     All
Convert a SMALLINT column to a FLOAT column          All
Convert a SMALLINT column to a CHAR(n) column        n >= 6 AND nf
Convert an INT column to an INT8 column              All
Convert an INT column to a DEC(p2,s2) column         p2-s2 >= 10
Convert an INT column to a DEC(p2) column            p2 >= 10 OR nf
Convert an INT column to a SMALLFLOAT column         nf
Convert an INT column to a FLOAT column              All
Convert an INT column to a CHAR(n) column            n >= 11 AND nf
Convert a SERIAL column to an INT8 column            All
Convert a SERIAL column to a DEC(p2,s2) column       p2-s2 >= 10
Convert a SERIAL column to a DEC(p2) column          p2 >= 10 OR nf
Convert a SERIAL column to a SMALLFLOAT column       nf
Convert a SERIAL column to a FLOAT column            All
Convert a SERIAL column to a CHAR(n) column          n >= 11 AND nf
Convert a SERIAL column to a BIGSERIAL column        All
Convert a SERIAL column to a SERIAL8 column          All
Convert a SERIAL8 column to a BIGSERIAL column       All
Convert a BIGSERIAL column to a SERIAL8 column       All
Convert a DEC(p1,s1) column to a SMALLINT column     p1-s1 < 5 AND (s1 == 0 OR nf)
Convert a DEC(p1,s1) column to an INTEGER column     p1-s1 < 10 AND (s1 == 0 OR nf)
When the table contains an opaque data type, a user-defined data type, an
LVARCHAR data type, a BOOLEAN data type, or a smart large object (BLOB or
CLOB), the database server does not use the in-place alter algorithm, even when
the column that is being altered is of a data type that can support in-place alter
operations.
The in-place alter algorithm is not used if the ALTER TABLE DROP statement
specifies BYTE or TEXT columns, or the ROWIDS keyword, or if the ALTER
TABLE ADD statement includes the ROWIDS keyword.
For example, the database server does not use the in-place alter algorithm in the
following situations:
v When more than one algorithm is needed
For example, assume that an ALTER TABLE MODIFY statement converts a
SMALLINT column to a DEC(8,2) column and converts an INTEGER column to
a CHAR(8) column. The conversion of the first column is an in-place alter
operation, but the conversion of the second column is a slow alter operation.
The database server uses the slow alter algorithm to execute this statement.
v When the ALTER TABLE operation moves data records to another fragment
For example, suppose you have a table with two integer columns and the
following fragment expression:
col1 < col2 IN dbspace1, REMAINDER IN dbspace2
If you issue an ALTER TABLE MODIFY statement to convert the integer values
to character values, the database server stores the row (4, 30) in dbspace1 before
the alter operation but stores it in dbspace2 afterward, because the fragmentation
expression no longer compares the integers 4 < 30 (true) but the character strings
'4' < '30' (false, because '30' sorts before '4').
v When the database server cannot convert all possible values of the old data type
to the new data type.
For example, you cannot convert a BIGSERIAL column to a SERIAL column,
because the modified column cannot store BIGSERIAL values that are beyond
the range of SERIAL values. (However, you can change a column from SERIAL
to BIGSERIAL with an in-place alter operation, if other columns in the table do
not conflict with any of the other restrictions on in-place alter operations.)
Related concepts:
IBM Informix data types (Database Design Guide)
Related reference:
DECIMAL (SQL Reference)
Each time you execute an ALTER TABLE statement that uses the in-place alter
algorithm, the database server creates a new version of the table structure. The
database server keeps track of all versions of table definitions. The database server
does not reset the version status, or any of the version structures and alter
structures, until the entire table is converted to the final format, or until a slow
alter is performed.
If the database server detects any down-level version page during the execution of
DML statements (INSERT, UPDATE, DELETE, and SELECT statements, and
MERGE statements that specify Insert, Update, or Delete clauses), it performs the
following actions:
v For UPDATE statements, the database server converts the entire data page or
pages to the final format.
Outstanding in-place alter operations can affect the performance of subsequent
data definition language (DDL) statements. Monitor tables with outstanding
in-place alters, because many outstanding alters affect subsequent ALTER TABLE
statements.
The oncheck -pT tablename option displays data-page versions for outstanding
in-place alter operations. An in-place alter is outstanding when data pages still exist
with the old definition.
Figure 6-12 displays a portion of the output that the following oncheck command
produces after four in-place alter operations are run on the customer
demonstration table:
...
Home Data Page Version Summary
Version Count
0 (oldest) 2
1 0
2 0
3 0
4 (current) 0
...
Figure 6-12. Sample oncheck -pT output for the customer table
The Count field in Figure 6-12 displays the number of pages that currently use that
version of the table definition. This oncheck output shows that four versions are
outstanding:
v A value of 2 in the Count field for the oldest version indicates that two pages
use the oldest version.
v A value of 0 in the Count fields for the next four versions indicates that no
pages have been converted to the newer table definitions.
If your goal is to save runtime CPU, plan to keep as few outstanding alter
operations on a table as possible (generally no more than 3 or 4). If your goal is to
save disk space and your alter operations add or grow columns, leaving in-place
alters outstanding helps reduce disk space.
You can convert data pages to the latest definition with a dummy UPDATE
statement. For example, the following statement, which sets a column value to the
existing value, causes the database server to convert data pages to the latest
definition:
UPDATE tab1 SET col1 = col1;
This statement does not change any data values, but it converts the format of the
data pages to the latest definition.
After an update is executed on all pages of the table, the oncheck -pT command
displays the total number of data pages in the Count field for the current version
of the table.
Related concepts:
Run dummy UPDATE statements (Migration Guide)
If the altered column is part of an index, the table is still altered in place, but in
this case the database server rebuilds the index or indexes implicitly. If you do not
need to rebuild the index, you should drop or disable it before you perform the
alter operation. Taking these steps improves performance.
However, if the column that you modify is a primary key or foreign key and you
want to keep this constraint, you must specify those keywords again in the ALTER
TABLE statement, and the database server rebuilds the index.
For example, suppose you create tables and alter the parent table with the
following SQL statements:
CREATE TABLE parent
(si SMALLINT PRIMARY KEY CONSTRAINT pkey);
CREATE TABLE child
(si SMALLINT REFERENCES parent ON DELETE CASCADE
CONSTRAINT ckey);
INSERT INTO parent (si) VALUES (1);
INSERT INTO parent (si) VALUES (2);
INSERT INTO child (si) VALUES (1);
INSERT INTO child (si) VALUES (2);
ALTER TABLE parent
MODIFY (si INT PRIMARY KEY CONSTRAINT pkey);
Even though the ALTER TABLE operation on a primary key or foreign key column
rebuilds the index, the database server still takes advantage of the in-place alter
algorithm. The in-place alter algorithm can provide performance benefits, including
the following:
v It does not make a copy of the table in order to convert the table to the new
definition.
v It does not convert the data rows during the alter operation.
v It does not rebuild all indexes on the table.
Warning: If you alter a table that is part of a view, you must re-create the view to
obtain the latest definition of the table.
Fast alter
The database server uses the fast alter algorithm when the ALTER TABLE
statement changes attributes of the table but does not affect the data.
The database server uses the fast alter algorithm when you use the ALTER TABLE
statement to:
v Change the next-extent size.
v Add or drop a constraint.
v Change the lock mode of the table.
v Change the unique index attribute without modifying the column type.
v Add shadow columns for row versioning with the ADD VERCOLS keywords.
With the fast alter algorithm, the database server holds the lock on the table for
just a short time. In some cases, the database server locks the system catalog tables
only to change the attribute. In either case, the table is unavailable for queries for
only a short time.
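For example, each of the following statements (the table name is illustrative) is
processed with the fast alter algorithm:
ALTER TABLE orders MODIFY NEXT SIZE 64;
ALTER TABLE orders LOCK MODE (ROW);
ALTER TABLE orders ADD VERCOLS;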
The entity-relationship data model, which the IBM Informix Guide to SQL: Tutorial
describes, produces tables that contain no redundant or derived data. According to
the tenets of relational database theory, these tables are well structured.
Sometimes, to meet extraordinary demands for high performance, you might need
to denormalize the data model by modifying it in ways that are undesirable from a
theoretical standpoint. This section describes some modifications and their
associated costs.
Shortening rows
Usually, tables with shorter rows yield better performance than those with longer
rows because disk I/O is performed in pages, not in rows. The shorter the rows of
a table, the more rows occur on a page. The more rows per page, the fewer I/O
operations the database server performs to read a given set of rows.
The entity-relationship data model puts all the attributes of one entity into a single
table for that entity. For some entities, this strategy can produce rows of awkward
lengths.
To shorten the rows, you can break columns into separate tables that are associated
by duplicate key values in each table. As the rows get shorter, query performance
should improve.
For information about other character data types, see the IBM Informix GLS User's
Guide.
The column within the row page is only 56 bytes long, which allows more rows on
a page than when you include a long string. However, the TEXT data type is not
automatically compatible with existing programs. The application code needed to
fetch a TEXT value is a bit more complicated than the code for fetching a CHAR
value into a program.
If you split a table into two tables, the primary table and a companion table, repeat
the primary key in each table.
For example, the customer.city column contains city names. Some city names are
repeated in the column, and most rows have some trailing blanks in the field.
Using the VARCHAR data type eliminates the blanks but not the duplication.
You can create a table named cities, as the following example shows:
CREATE TABLE cities (
city_num SERIAL PRIMARY KEY,
city_name VARCHAR(40) UNIQUE
)
You can change the definition of the customer table so that its city column
becomes a foreign key that references the city_num column in the cities table.
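The following sketch shows that change, assuming that the city column has
already been converted to hold integer codes that match cities.city_num:
ALTER TABLE customer MODIFY (city INT);
ALTER TABLE customer ADD CONSTRAINT
(FOREIGN KEY (city) REFERENCES cities (city_num));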
To insert the city of the new customer into cities, you must change any program
that inserts a new row into customer. The database server return code in the
SQLCODE field of the SQL Communications Area (SQLCA) can indicate that the
insert failed because of a duplicate key. It is not a logical error; it simply means
that an existing customer is located in that city. For more information about the
SQLCA, see the IBM Informix Guide to SQL: Tutorial.
Besides changing programs that insert data, you must also change all programs
and stored queries that retrieve the city name. The programs and stored queries
must use a join to the new cities table in order to obtain their data. The extra
complexity in programs that insert rows and the extra complexity in some queries
is the result of giving up theoretical correctness in the data model. Before you
make the change, be sure that it returns a reasonable savings in disk space or
execution time.
The shorter rows allow you to query or update each table quickly.
Division by bulk
One principle on which you can divide an entity table is bulk. Move the bulky
attributes, which are usually character strings, to the companion table. Keep the
numeric and other small attributes in the primary table. In the demonstration
database, you can split the ship_instruct column from the orders table. You can
call the companion table orders_ship. It has two columns, a primary key that is a
copy of orders.order_num and the original ship_instruct column.
Updates take longer than queries, and updating programs lock index pages and
rows of data during the update process, preventing querying programs from
accessing the tables. If you can separate one table into two companion tables, one
with the most-updated entities and the other with the most-queried entities, you
can often improve overall response time.
Splitting a table uses extra disk space and adds complexity. Two copies of the
primary key occur for each row, one copy in each table. Two primary-key indexes
also exist. You can use the methods described in earlier sections to estimate the
number of added pages.
You must modify existing programs, reports, and forms that use SELECT * because
fewer columns are returned. Programs, reports, and forms that use attributes from
both tables must perform a join to bring the tables together.
In this case, when you insert or delete a row, two tables are altered instead of one.
If you do not coordinate the alteration of the two tables (by making them within a
single transaction, for example), you lose semantic integrity.
Redundant data
Normalized tables contain no redundant data. Every attribute appears in only one
table.
Normalized tables also contain no derived data. Instead, data that can be
computed from existing attributes is selected as an expression based on those
attributes.
Normalizing tables minimizes the amount of disk space used and makes updating
the tables as easy as possible. However, normalized tables can force you to use
joins and aggregate functions often, and those processes can be time consuming.
As an alternative, you can introduce new columns that contain redundant data,
provided you understand the trade-offs involved.
The contents of manufact are primarily a supplement to the stock table. Suppose
that a time-critical application frequently refers to the delivery lead time of a
particular product but to no other column of manufact. For each such reference,
the database server must read two or three pages of data to perform the lookup.
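In that case, you might carry the lead time redundantly in the stock table. The
following sketch uses the demonstration database column names; the added
column mirrors manufact.lead_time:
ALTER TABLE stock ADD (lead_time INTERVAL DAY(3) TO DAY);
UPDATE stock SET lead_time =
(SELECT lead_time FROM manufact
WHERE manufact.manu_code = stock.manu_code);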
Like derived data, redundant data takes space and poses an integrity risk. In the
example described in the previous paragraph, many extra copies of the lead time
for each manufacturer can exist. (Each manufacturer can appear in stock many
times.) The programs that insert or update a row of manufact must also update
multiple rows of stock.
The integrity risk is simply that the redundant copies of the data might not be
accurate. If a lead time is changed in manufact, the stock column is outdated until
it is also updated. As you do with derived data, define the conditions under which
redundant data might be wrong.
For more information about database design, see the IBM Informix Database Design
and Implementation Guide.
Potential advantages of allowing more variable length rows per page are:
v Reducing the disk space required to store data
v Enabling the server to use the buffer pool more efficiently
v Reducing table scan times
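These advantages assume that the feature is enabled. As a minimal sketch,
assuming the MAX_FILL_DATA_PAGES configuration parameter that the
Administrator's Reference describes, you set the parameter in the onconfig file and
restart the database server:
MAX_FILL_DATA_PAGES 1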
Compressing data, consolidating data, and returning free space have the following
benefits:
v Significant savings in disk storage space
v Reduced disk usage for compressed fragments
v Significant saving of logical log usage, which saves additional space and can
prevent bottlenecks for high-throughput OLTP after the compression operation is
completed.
v Fewer page reads, because more rows can fit on a page
v Smaller buffer pools, because more data fits in the same size pool
v Reduced I/O activity, because:
– More compressed rows than uncompressed rows fit on a page
– Log records for insert, update, and delete operations of compressed rows are
smaller
v Ability to compress older fragments of time-fragmented data that are not often
accessed, while leaving more recent data that is frequently accessed in
uncompressed form
v Ability to free space no longer needed for a table
v Faster backup and restore
Since compressed data covers fewer pages and has more rows per page than
uncompressed data, the query optimizer might choose different plans after
compression.
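For example, you can compress, repack, and shrink a table through the SQL
administration API. This sketch uses illustrative table, database, and owner
names:
DATABASE sysadmin;
EXECUTE FUNCTION task("table compress repack shrink",
"customer", "stores_demo", "informix");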
For more information, see the IBM Informix Administrator's Guide and the IBM
Informix Administrator's Reference.
Types of indexes
Informix uses B-tree indexes, R-tree indexes, functional indexes, and indexes that
DataBlade modules provide for user-defined data. The server also uses forest of
trees (FOT) indexes, which are alternatives to B-tree indexes.
Related concepts:
“What is a functional index?” on page 7-26
B-tree indexes
Informix uses a B-tree index for columns that contain built-in data types (referred
to as a traditional B-tree index), columns that contain one-dimensional user-defined
data types (referred to as a generic B-tree index), and values that a user-defined data
type returns.
Built-in data types include character, datetime, integer, float, and so forth. For more
information about built-in data types, see IBM Informix Guide to SQL: Reference.
User-defined data types include opaque and distinct data types. For more
information about user-defined data types, see IBM Informix User-Defined Routines
and Data Types Developer's Guide.
For information about how to estimate B-tree index size, see “Estimating index
pages” on page 7-4.
The following figure shows the B-tree structure of an index. The topmost level of
the hierarchy contains a single root page. Intermediate levels, when needed, contain
branch pages. Each branch page contains entries that refer to a subset of pages in
the next level of the index. The bottom level of the index contains a set of leaf pages.
Each leaf page contains a list of index entries that refer to rows in the table.
The number of levels needed to hold an index depends on the number of unique
keys in the index and the number of index entries that each page can hold. The
number of entries per page depends, in turn, on the size of the columns being
indexed.
If the index page for a given table can hold 100 keys, a table of up to 100 rows
requires a single index level: the root page. When this table grows beyond 100
rows, to a size between 101 and 10,000 rows, it requires a two-level index: a root
page and between 2 and 100 leaf pages. When the table grows beyond 10,000 rows,
to a size between 10,001 and 1,000,000 rows, it requires a three-level index: the root
page, a set of 100 branch pages, and a set of up to 10,000 leaf pages.
Index entries contained within leaf pages are sorted in key-value order. An index
entry consists of a key and one or more row pointers. The key is a copy of the
indexed columns from one row of data. A row pointer provides an address used to
locate a row that contains the key. A unique index contains one index entry for
every row in the table.
For information about special indexes for Informix, see “Indexes on user-defined
data types” on page 7-21.
Related concepts:
“Forest of trees indexes”
You can create a forest of trees index as an alternative to a B-Tree index, but not as
an alternative to an R-Tree index or other types of indexes.
Unlike a traditional B-tree index, which contains one root node, a forest of trees
index is a large B-Tree index that is divided into smaller subtrees (which you can
think of as buckets). These subtrees contain multiple root nodes and leaves. The
following figure shows the structure of a forest of trees index.
[Figure: structure of a forest of trees index, showing multiple root nodes and their leaves]
Forest of trees indexes are detached indexes. The server does not support forest of
trees attached indexes.
You create a forest of trees index with the CREATE INDEX statement of SQL and
the HASH ON clause.
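For example, the following statement (index, table, and column names are
illustrative) creates a forest of trees index that hashes on one column and divides
the index into 100 subtrees:
CREATE INDEX fot_idx ON tab1 (c1) HASH ON (c1) WITH 100 BUCKETS;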
You enable or disable forest of trees indexes with the SET INDEXES statement of
SQL.
You can identify a forest of trees index by the FOT indicator in the Index Name field
in SET EXPLAIN output.
You can look up the number of hashed columns and subtrees in a forest of trees
index by viewing information in the sysindices table for the database containing
tables that have forest of trees indexes.
The server treats a forest of trees index the same way it treats a B-tree index.
Therefore, in a logged database, you can control how the B-tree scanner threads
remove deletions from both forest of trees and B-tree indexes.
R-tree indexes
Informix uses an R-tree index for spatial data (such as two-dimensional or
three-dimensional data).
For information about sizing an R-tree index, see the IBM Informix R-Tree Index
User's Guide.
For example, the Excalibur Text Search DataBlade provides an index to search text
data. For more information, see the IBM Informix Excalibur Text Search DataBlade.
For more information about the types of data and functions that each DataBlade
module provides, see the user guide of each DataBlade module. For information
about how to determine the types of indexes available in your database, see
“Identifying the available access methods” on page 7-24.
By default, the database server creates the index in the same dbspace as the table,
but in a separate tblspace from the table. To place the index in a separate dbspace,
specify the IN keyword in the CREATE INDEX statement.
Although you cannot explicitly specify the extent size of an index, you can
estimate the number of pages that an index might occupy to determine if your
dbspace or dbspaces have enough space allocated.
The following formula shows how the database server uses the ratio of the index
key size to the row size:
Index extent size = (index_key_size / table_row_size) * table_extent_size
In this formula:
v index_key_size is the total width of the indexed column or columns, plus 5
bytes for a key descriptor.
v table_row_size is the sum of the widths of all the columns in the row.
v table_extent_size is the value that you specify in the EXTENT SIZE keyword of
the CREATE TABLE statement.
If the index is not unique, then the extent size is reduced by 20 percent.
The database server also uses this same ratio for the next-extent size for the index:
Index next extent size = (index_key_size / table_row_size) * table_next_extent_size
The following formula shows how the database server uses the ratio of the index
key size plus some overhead bytes to the row size:
Detached index extent size = ((index_key_size + 9) / table_row_size) * table_extent_size
Important: For a non-unique index, the formula calculates an extent size that is
reduced by 20 percent.
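For example, suppose a unique attached index has an index_key_size of 13 bytes
(an 8-byte key plus 5 bytes for the key descriptor), the table_row_size is 130 bytes,
and the table EXTENT SIZE is 1000 kilobytes (illustrative numbers):
Index extent size = (13 / 130) * 1000 = 100 kilobytes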
The preceding estimate is a guideline only. As rows are deleted and new ones are
inserted, the number of index entries can vary within a page. This method for
estimating index pages yields a conservative (high) estimate for most indexes. For
a more precise value, build a large test index with real data and check its size with
the oncheck utility.
Tip: A forest of trees index can be larger than a B-tree index. When you estimate
the size of a forest of trees index, the estimates apply to each subtree in the index.
You must then aggregate the estimates for all subtrees (buckets) to calculate the
total.
When you consider space costs, also consider whether increasing the page size of a
standard or temporary dbspace is beneficial in your environment. If you want a
longer key length than is available for the default page size, you can increase the
page size. If you increase the page size, the size must be an integral multiple of the
default page size, not greater than 16K bytes.
You might not want to increase the page size if your application contains small
sized rows. Increasing the page size for an application that randomly accesses
small rows might decrease performance. In addition, a page lock on a larger page
will lock more rows, reducing concurrency in some situations.
Related concepts:
B-tree index compression (Administrator's Guide)
The following descriptions assume that approximately two pages must be read to
locate an index entry. That is the case when the index consists of a root page, one
level of branch pages, and a set of leaf pages. The root page is assumed to be in a
buffer already. The index for a very large table has at least two intermediate levels,
so about three pages are read when the database server references such an index.
Presumably, one index is used to locate a row being altered. The pages for that
index might be found in page buffers in shared memory for the database server.
However, the pages for any other indexes that need altering must be read from
disk.
Insertions and deletions change the number of entries on a leaf page. Although
roughly one in every pagents operations requires some additional work to deal
with a leaf page that has either filled or been emptied, if pagents is greater than
100, this additional work occurs less than 1 percent of the time. You can often
disregard it when you estimate the I/O impact.
In short, when a row is inserted or deleted at random, allow three to four added
page I/O operations per index. When a row is updated, allow six to eight page
I/O operations for each index that applies to an altered column. If a transaction is
rolled back, all this work must be undone. For this reason, rolling back a
transaction can take a long time.
Because the alteration of the row itself requires only two page I/O operations,
index maintenance is clearly the most time-consuming part of data modification.
For information about one way to reduce this cost, see “Clustering” on page 7-11.
You can invoke the B-tree scanner from the command line.
Suppose you have a table that contains a large mailing list. If you find that a
postal-code column is often used to filter a subset of rows, consider putting an
index on that column.
This strategy yields a net savings of time only when the selectivity of the column
is high; that is, when only a small fraction of rows holds any one indexed value.
Nonsequential access through an index takes several more disk I/O operations
than sequential access does, so if a filter expression on the column passes more
than a fourth of the rows, the database server might as well read the table
sequentially.
When a large quantity of rows must be ordered or grouped, the database server
must put the rows in order. One way that the database server performs this task is
to select all the rows into a temporary table and sort the table. But, as explained in
Chapter 10, “Queries and the query optimizer,” on page 10-1, if the ordering
columns are indexed, the optimizer sometimes reads the rows in sorted order
through the index, thus avoiding a final sort.
Because the keys in an index are in sorted sequence, the index really represents the
result of sorting the table. By placing an index on the ordering column or columns,
you can replace many sorts during queries with a single sort when the index is
created.
Placing an index on a column that has low selectivity (that is, a small number of
distinct values relative to the number of rows) can reduce performance. In such
cases, the database server must not only search the entire set of rows that match
the key value, but it must also lock all the affected data and index pages. This
process can impede the performance of other update requests as well.
To correct this problem, replace the index on the low-selectivity column with a
composite index that has a higher selectivity. Use the low-selectivity column as the
leading column and a high-selectivity column as your second column in the index.
The composite index limits the number of rows that the database server must
search to locate and apply an update.
You can use any second column to disperse the key values as long as its value
does not change, or changes at the same time as the real key. The shorter the
second column the better, because its values are copied into the index and expand
its size.
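For example, if an order_status column has only a few distinct values while
order_num is highly selective, a composite index like the following (names are
illustrative) limits the rows that the server must search and lock:
CREATE INDEX ix_status ON orders (order_status, order_num);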
Clustering
Clustering is a method for arranging the rows of a table so that their physical
order on disk closely corresponds to the sequence of entries in the index.
(Do not confuse the clustered index with an optical cluster, which is a method for
storing logically related TEXT or BYTE data together on an optical volume.)
When you know that a table is ordered by a certain index, you can avoid sorting.
You can also be sure that when the table is searched on that column, it is read
effectively in sequential order, instead of nonsequentially. These points are covered
in Chapter 10, “Queries and the query optimizer,” on page 10-1.
In the stores_demo database, the customer table has an index, zip_ix, on the
postal-code column. The following statement causes the database server to put the
rows of the customer table in order by postal code:
ALTER INDEX zip_ix TO CLUSTER
To reorder a table, the database server must copy the table. In the preceding
example, the database server reads all the rows in the table and constructs an
index. Then it reads the index entries in sequence. For each entry, it reads the
matching row of the table and copies it to a new table. The rows of the new table
are in the desired sequence. This new table replaces the old table.
Clustering is not preserved when you alter a table. When you insert new rows,
they are stored physically at the end of the table, regardless of their contents.
Clustering can be restored after the order of rows is disturbed by ongoing updates.
The following statement reorders the table to restore data rows to the index
sequence:
ALTER INDEX o_date_ix TO CLUSTER
Reclustering is usually quicker than the original clustering because reading out the
rows of a nearly clustered table is similar in I/O impact to a sequential scan.
Clustering and reclustering take a lot of space and time. To avoid some clustering,
build the table in the desired order initially.
The clust field in the sysindexes or the sysindices table represents the degree of
clustering of the index. The values of several configuration parameters affect the
clust field.
Each of these configuration parameters affects the amount of buffer space available
for a single user session. Additional buffers can result in better clustering (a
smaller clust value in the sysindexes or sysindices tables).
You can create more buffers by performing one or both of the following tasks:
v Increasing the size of the buffer pool by updating the value of the
BUFFERPOOL configuration parameter
v Decreasing the value of the DS_MAX_QUERIES configuration parameter
Related reference:
BUFFERPOOL configuration parameter (Administrator's Reference)
DS_MAX_QUERIES configuration parameter (Administrator's Reference)
Nonunique indexes
In some applications, most table updates can be confined to a single time period.
You might be able to set up your system so that all updates are applied overnight
or on specified dates. Additionally, when updates are performed as a batch, you
can drop all nonunique indexes while you make updates and then create new
indexes afterward. This strategy can improve performance.
The presence of indexes also slows down the population of tables when you use
the LOAD statement or the dbload utility. Loading a table that has no indexes is a
quick process (little more than a disk-to-disk sequential copy), but updating
indexes adds a great deal of overhead.
If you cannot guarantee that the loaded data satisfies all unique constraints, you
must create unique indexes before you load the rows. You save time if the rows are
presented in the correct sequence for at least one of the indexes. If you have a
choice, make it the index with the largest key. This strategy minimizes the number
of leaf pages that must be read and written.
A forest of trees index differs from a B-tree index in that it has multiple root nodes
and fewer levels. Multiple root nodes can alleviate root node contention, because
more concurrent users can access the index.
If you know that a particular table has a deep tree, you can improve performance
by creating a forest of trees index with fewer levels in the tree. For example,
suppose you create an index where one of the columns is a 100-byte column that
contains character data. If you have a large number of rows in that table, the tree
might contain six or seven levels. If you create a forest of trees index instead of a
B-tree index, you can create more than one tree with four levels, so that every
index traversal goes only four levels deep rather than seven levels deep.
Related concepts:
“Forest of trees indexes” on page 7-2
To detect root node contention and determine whether you need a forest of trees
index:
1. Run the onstat -g spi | sort -nr command to display information about spin
locks with long spins.
The output of the onstat -g spi command shows spin locks with waits, which
occur when threads are reading from or writing to an index concurrently and a
particular thread did not succeed in acquiring the lock on the first try.
Output similar to the following identifies the indexes on which contention occurs:
tabname daily_market_idx (expression) 0x02d00008
tabname trade_history_idx (expression) 0x01300003
tabname trade_request_idx2 (expression) 0x0020008E
Related concepts:
“Forest of trees indexes” on page 7-2
Related reference:
onstat -g spi command: Print spin locks with long spins (Administrator's
Reference)
You can monitor onstat -g spi command output to verify that root node contention
no longer occurs. If you identify performance bottlenecks that are caused by highly
contended spin locks, you can rebuild the forest of trees index with more buckets.
Related concepts:
“Forest of trees indexes” on page 7-2
HASH ON clause (SQL Syntax)
Related reference:
CREATE INDEX statement (SQL Syntax)
You can re-enable a disabled forest of trees index, for example, by specifying:
SET INDEXES fotidx ENABLED;
Related concepts:
“Forest of trees indexes” on page 7-2
The CREATE INDEX ONLINE statement enables you to create an index without
having an exclusive lock placed over the table during the duration of the index
build. You can use the CREATE INDEX ONLINE statement even when reads or
updates are occurring on the table. This means index creation can begin
immediately.
When you create an index online, you can use the ONLIDX_MAXMEM
configuration parameter to limit the amount of memory that is allocated to the
preimage log pool and to the updator log pool in shared memory. You might want to
do this if you plan to complete other operations on a table column while executing
the CREATE INDEX ONLINE statement on the column. For more information
about this parameter, see “Limiting memory allocation while creating indexes
online” on page 7-18.
The DROP INDEX ONLINE statement enables you to drop indexes even when
Dirty Read is the transaction isolation level.
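For example (index, table, and column names are illustrative):
CREATE INDEX ix_cust_zip ON customer (zipcode) ONLINE;
DROP INDEX ix_cust_zip ONLINE;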
The advantages of creating indexes using the CREATE INDEX ONLINE statement
are:
v If a new index is needed to improve the performance of queries on a table, you
can immediately create the index without a lock placed over the table.
v The database server can create an index while a table is being updated.
v The table is available for the duration of the index build.
v The query optimizer can establish better query plans, since the optimizer can
update statistics in unlocked tables.
The advantages of dropping indexes using the DROP INDEX ONLINE statement
are:
v You can drop an inefficient index without disturbing ongoing queries that are
using that index.
v After the index is flagged, the query optimizer will not use the index for new
SELECT operations on tables.
If you initiate a DROP INDEX ONLINE statement for a table that is being updated,
the operation does not occur until after the table update is completed. After you
issue the DROP INDEX ONLINE statement, no one can reference the index, but
concurrent operations can use the index until the operations terminate. The
database server waits to drop the index until all users have finished accessing the
index.
For more information about the CREATE INDEX ONLINE and DROP INDEX
ONLINE statements, see the IBM Informix Guide to SQL: Syntax.
The index creation takes an exclusive lock on the table and waits for all other
concurrent processes scanning the table to quit using the index partitions before
creating the attached index. If the table is being read or updated, the CREATE
INDEX ONLINE statement waits for the exclusive lock for the duration of the lock
mode setting.
You can set the ONLIDX_MAXMEM configuration parameter before starting the
database server, or you can change it dynamically through the onmode -wf and
onmode -wm commands.
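For example, the following command (the value is in kilobytes and is illustrative)
changes the value in shared memory; the change does not persist after the
database server restarts:
onmode -wm ONLIDX_MAXMEM=5120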
Whenever possible, the database server uses parallel processing to improve the
response time of index builds. The number of parallel processes is based on the
number of fragments in the index and the value of the PSORT_NPROCS
environment variable. The database server builds the index with parallel
processing even when the value of PDQ priority is 0.
You can often improve the performance of an index build by taking the following
steps:
1. Set PDQ priority to a value greater than 0 to obtain more memory than the
default 128 kilobytes.
For example, if you estimate that 30 sorts could occur concurrently, the average
row size is 200 bytes, and the average number of rows in a table is 400, you can
estimate the amount of shared memory that the database server needs for sorting
as follows:
30 sorts * 200 bytes * 400 rows = 2,400,000 bytes
Important: You can only use this parameter if the PDQ priority is set to zero. Its
setting has no effect if the PDQ priority is greater than zero.
If the PDQ priority is greater than 0, the maximum amount of shared memory that
the database server allocates for a sort is controlled by the memory grant manager
(MGM). The MGM uses the settings of PDQ priority and the following
configuration parameters to determine how much memory to grant for the sort:
v DS_TOTAL_MEMORY
v DS_MAX_QUERIES
v MAX_PDQPRIORITY
For more information about allocating memory for parallel processing, see “The
allocation of resources for parallel database queries” on page 12-7.
To estimate the amount of temporary space needed for an index build, perform the
following steps:
1. Add the total widths of the indexed columns or returned values from
user-defined functions. This value is referred to as colsize.
2. Estimate the size of a typical item to sort with one of the following formulas,
depending on whether the index is attached or not:
a. For a nonfragmented table and a fragmented table with an index created
without an explicit fragmentation strategy, use the following formula:
sizeof_sort_item = keysize + 4
b. For fragmented tables with the index explicitly fragmented, use the
following formula:
sizeof_sort_item = keysize + 8
3. Estimate the number of bytes needed to sort with the following formula:
temp_bytes = 2 * (rows * sizeof_sort_item)
This formula uses the factor 2 because everything is stored twice when
intermediate sort runs use temporary space. Intermediate sort runs occur when
not enough memory exists to perform the entire sort in memory.
The value for rows is the total number of rows that you expect to be in the
table.
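For example, for an attached index with a keysize of 12 bytes on a table of
1,000,000 rows (illustrative numbers):
sizeof_sort_item = 12 + 4 = 16 bytes
temp_bytes = 2 * (1,000,000 * 16) = 32,000,000 bytes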
You can also use this feature to improve query performance over storing each
fragment in a different dbspace when a dbspace is located on a faster device.
For more information, see information about managing partitions in the IBM
Informix Administrator's Guide.
For detailed information about oncheck locking, see the IBM Informix
Administrator's Reference.
You can query the systables system catalog table to see the current lock level of
the table, as the following sample SQL statement shows:
SELECT locklevel FROM systables
WHERE tabname = "customer"
If you do not see a value of R (for row) in the locklevel column, you can modify
the lock level, as the following sample SQL statement shows:
ALTER TABLE tab1 LOCK MODE (ROW);
Row locking might add other side effects, such as an overall increase in lock usage.
For more information about locking levels, see Chapter 8, “Locking,” on page 8-1.
DataBlade modules also provide extended data types and functions to the database
server.
You can define indexes on the following kinds of user-defined data types:
v Opaque data types
An opaque data type is a fundamental data type that you can use to define
columns in the same way you use built-in types. An opaque data type stores a
single value and cannot be divided into components by the database server. For
information about creating opaque data types, see the CREATE OPAQUE TYPE
statement in the IBM Informix Guide to SQL: Syntax and IBM Informix
User-Defined Routines and Data Types Developer's Guide. For more information
about the data types and functions that each DataBlade module provides, see the
user guide of each DataBlade module.
v Distinct data types
A distinct data type has the same representation as an existing opaque or built-in
data type but is different from these types. For information about distinct data
types, see the CREATE DISTINCT TYPE statement in the IBM Informix Guide to
SQL: Syntax.
For more information about data types, see the IBM Informix Guide to SQL:
Reference.
The response time for a query might improve when Informix uses an index for:
v Columns used to join two tables
v Columns that are filters for a query
v Columns in an ORDER BY or GROUP BY clause
v Results of functions that are filters for a query
For more information about when the query performance can improve with an
index on a built-in data type, see “Improve performance by adding or removing
indexes” on page 13-20.
To create an index on a user-defined data type, you can use any of the following
secondary-access methods:
v Generic B-tree index
A B-tree index is good for a query that retrieves a range of data values. For
more information, see “B-tree secondary-access method.”
v R-tree index
An R-tree index is good for searches on multidimensional data. For more
information, see the IBM Informix R-Tree Index User's Guide.
v Secondary-access methods that a DataBlade module provides for a new data
type
A DataBlade module that supports a certain type of data can also provide a new
index for that new data type. For more information, see “Using an index that a
DataBlade module provides” on page 7-27.
You can create a functional index on the resulting values of a user-defined function
on one or more columns. For more information, see “Using a functional index” on
page 7-25.
After you choose the desired index type, you might also need to extend an
operator class for the secondary-access method. For more information about how
to extend operator classes, see the IBM Informix User-Defined Routines and Data
Types Developer's Guide.
Tip: For more information about the structure of a B-tree index and how to
estimate the size of a B-tree index, see “Estimating index pages” on page 7-4.
Informix uses the generic B-tree as the built-in secondary-access method. This
built-in secondary-access method is registered in the sysams system catalog table
with the name btree. When you use the CREATE INDEX statement (without the
USING clause) to create an index, the database server creates a generic B-tree
index. For more information, see the CREATE INDEX statement in the IBM
Informix Guide to SQL: Syntax.
Tip: Informix also defines another secondary-access method, the R-tree index. For
more information about how to use an R-tree index, see the IBM Informix R-Tree
Index User's Guide.
A B-tree index is good for a query that retrieves a range of data values. If the data
to be indexed has a logical sequence to which the concepts of less than, greater than,
and equal apply, the generic B-tree index is a useful way to index your data.
Initially, the generic B-tree index supports the relational operators (<,<=,=,>=,>) on
all built-in data types and orders the data in lexicographical sequence.
The optimizer considers whether to use the B-tree index to execute a query if you
define a generic B-tree index on:
v Columns used to join two tables
v Columns that are filters for a query
v Columns in an ORDER BY or GROUP BY clause
v Results of functions that are filters for a query
Initially, the generic B-tree can index data that is one of the built-in data types, and
it orders the data in lexicographical sequence. However, you can extend a generic
B-tree for some other data types.
You can extend a generic B-tree to support columns and functions on the following
data types:
v User-defined data types (opaque and distinct data types) that you want the B-tree
index to support
In this case, you must extend the default operator class of the generic B-tree
index.
v Built-in data types that you want to order in a different sequence from the
lexicographical sequence that the generic B-tree index uses
In this case, you must define a different operator class from the default generic
B-tree index.
An operator class is the set of functions (operators) that are associated with a
nontraditional B-tree index. For more details on operator classes, see “Choosing
operator classes for indexes” on page 7-27.
To identify the secondary-access methods that are available for your database,
query the sysams system catalog table with the following SELECT statement:
SELECT am_id, am_owner, am_name, am_type FROM sysams
WHERE am_type = 'S';
Important: The sysams system catalog table does not contain a row for the built-in
primary access method. This primary access method is internal to Informix and
does not require a definition in sysams. However, the built-in primary access
method is always available for use.
If you find additional rows in the sysams system catalog table (rows with am_id
values greater than 2), the database supports additional user-defined access
methods. Check the value in the am_type column to determine whether a
user-defined access method is a primary- or secondary-access method.
For more information about the columns of the sysams system catalog table, see
the IBM Informix Guide to SQL: Reference. For information about how to determine
the operator classes that are available in your database, see “Identifying the
available operator classes” on page 7-30.
The secondary-access method that you specify in the USING clause of CREATE
INDEX must already be defined in the sysams system catalog. If the
secondary-access method has not yet been defined, the CREATE INDEX statement
fails.
When you omit the USING clause from the CREATE INDEX statement, the
database server uses B-tree indexes as the secondary-access method. For more
information, see the CREATE INDEX statement in the IBM Informix Guide to SQL:
Syntax.
R-tree indexes:
Informix supports the R-tree index for columns that contain spatial data such as
maps and diagrams. An R-tree index uses a tree structure whose nodes store
pointers to lower-level nodes.
At the leaves of the R-tree are a collection of data pages that store n-dimensional
shapes. For more information about the structure of an R-tree index and how to
estimate the size of an R-tree index, see the IBM Informix R-Tree Index User's Guide.
When you create a functional index, the database server computes the values of
the user-defined function and stores them as key values in the index. When a
change in the table data causes a change in one of the values of an index key, the
database server automatically updates the functional index.
You can use a functional index for functions that return values of both
user-defined data types (opaque and distinct) and built-in data types. However,
you cannot define a functional index if the function returns a simple-large-object
data type (TEXT or BYTE).
For more information about the types of indexes, see “Defining indexes for
user-defined data types” on page 7-22. For information about space requirements
for functional indexes, see “Estimating index pages” on page 7-4.
Related concepts:
“Types of indexes” on page 7-1
The optimizer considers whether to use a functional index to access the results of
functions that are in a SELECT clause or are in the filters in the WHERE clause.
If you create a functional index on the darkness() function, the optimizer can
consider that index when you specify darkness() as a filter in the query:
SELECT count(*) FROM photos WHERE
darkness(picture) > 0.5
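For illustration, a functional index that enables this query plan might be
created with a statement like the following (the index name dark_ix is
hypothetical, and this sketch assumes that darkness() is a registered,
nonvariant user-defined function):
CREATE INDEX dark_ix ON photos (darkness(picture));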
You can also create a composite index with user-defined functions. For more
information, see “Use composite indexes” on page 13-20.
For example, the Excalibur Text Search DataBlade module provides an index to
search text data. For more information, see the Excalibur Text Search DataBlade
Module User's Guide.
For more information about the types of data and functions that each DataBlade
module provides, see the user guide for the DataBlade module. For information
about how to determine the types of indexes available in your database, see
“Identifying the available access methods” on page 7-24.
For more information about how to extend an operator class, see IBM Informix
User-Defined Routines and Data Types Developer's Guide.
The query optimizer for the database server uses an operator class to determine if
an index can process the query with the least cost. An operator class indicates two
things to the query optimizer:
v Which functions that appear in an SQL statement can be evaluated with a given
index
These functions are called the strategy functions for the operator class.
v Which functions the index uses to evaluate the strategy functions
These functions are called the support functions for the operator class.
With the information that the operator class provides, the query optimizer can
determine whether a given index is applicable to the query. The query optimizer
can consider whether to use the index for the given query when the following
conditions are true:
v An index exists on the particular column or columns in the query.
v For the index that exists, the operation on the column or columns in the query
matches one of the strategy functions in the operator class associated with the
index.
The query optimizer reviews the available indexes for the table or tables and
matches the index keys with the column specified in the query filter. If the column
in the filter matches an index key, and the function in the filter is one of the
strategy functions of the operator class, the optimizer includes the index when it
determines which query plan has the lowest execution cost. In this manner, the
optimizer can determine which index can process the query with the least cost.
Informix uses the strategy functions of a secondary-access method to help the query
optimizer determine whether a specific index is applicable to a specific operation
on a data type.
If an index exists and the operator in the filter matches one of the strategy
functions in the operator class, the optimizer considers whether to use the index
for the query.
Each secondary-access method has a default operator class associated with it. By
default, the CREATE INDEX statement associates the default operator class with an
index.
For example, the following CREATE INDEX statement creates a B-tree index on the
postalcode column and automatically associates the default B-tree operator class
with this column:
CREATE INDEX postal_ix ON customer(postalcode)
For more information about how to specify a new default operator class for an
index, see “User-defined operator classes” on page 7-31.
By default, the CREATE INDEX statement associates the btree_ops operator class
with it when you create a B-tree index. For example, the following CREATE
INDEX statement creates a generic B-tree index on the order_date column of the
orders table and associates with this index the default operator class for the B-tree
secondary-access method:
CREATE INDEX orddate_ix ON orders (order_date)
The btree_ops operator class defines the names of strategy functions for the btree
access method.
The strategy functions that the btree_ops operator class defines are:
v lessthan (<)
v lessthanorequal (<=)
v equal (=)
v greaterthanorequal (>=)
v greaterthan (>)
These strategy functions are all operator functions. That is, each function is
associated with an operator symbol; in this case, with a relational-operator symbol.
For more information about relational-operator functions, see the IBM Informix
User-Defined Routines and Data Types Developer's Guide.
When the query optimizer examines a query that contains a column, it checks to
see if this column has a B-tree index defined on it. If such an index exists and if the
query contains one of the relational operators that the btree_ops operator class
supports, the optimizer can choose a B-tree index to execute the query.
The btree_ops operator class has one support function, a comparison function
called compare().
The B-tree secondary-access method uses the compare() function to traverse the
nodes of the generic B-tree index. To search for data values in a generic B-tree
index, the secondary-access method uses the compare() function to compare the
key value in the query to the key value in an index node. The result of the
comparison determines if the secondary-access method needs to search the
next-lower level of the index or if the key resides in the current node.
The generic B-tree access method also uses the compare() function to perform the
following tasks for generic B-tree indexes:
v Sort the keys before the index is built
v Determine the linear order of keys in a generic B-tree index
v Evaluate the relational operators
v Search for data values in an index
The database server uses the compare() function to evaluate comparisons in the
SELECT statement. To provide support for these comparisons for opaque data
types, you must write the compare() function. For more information, see the IBM
Informix User-Defined Routines and Data Types Developer's Guide.
The database server also uses the compare() function when it uses a B-tree index to
process an ORDER BY clause in a SELECT statement. However, the optimizer does
not use the index to perform an ORDER BY operation if the index does not use the
btree_ops operator class.
The database server provides the default operator class for the built-in
secondary-access method, the generic B-tree index. In addition, your environment
might have installed DataBlade modules that implement other operator classes. All
operator classes are defined in the sysopclasses system catalog table.
To identify the operator classes that are available for your database, query the
sysopclasses system catalog table with the following SELECT statement:
SELECT opclassid, opclassname, amid, am_name
FROM sysopclasses, sysams
WHERE sysopclasses.amid = sysams.am_id
If you find additional rows in the sysopclasses system catalog table (rows with
opclassid values greater than 2), your database supports user-defined operator
classes. Check the value in the amid column to determine the secondary-access
methods to which the operator class belongs.
The am_defopclass column in the sysams system catalog table stores the
operator-class identifier for the default operator class of a secondary-access
method. To determine the default operator class for a given secondary-access
method, you can run the following query:
SELECT am_id, am_name, am_defopclass, opclassname
FROM sysams, sysopclasses
WHERE sysams.am_defopclass = sysopclasses.opclassid
The database server provides the btree_ops operator class as the default for the
generic B-tree secondary-access method and the rtree_ops operator class as the
default for the R-tree secondary-access method.
For more information about the columns of the sysopclasses and sysams system
catalog tables, see the IBM Informix Guide to SQL: Reference. For information about
how to determine the access methods that are available in your database, see
“Identifying the available access methods” on page 7-24.
Each part of a composite index can specify a different operator class. You choose
the operator classes when you create the index. In the CREATE INDEX statement,
you specify the name of the operator class to use after each column or function
name in the index-key specification. Each name must be listed in the opclassname
column of the sysopclasses system catalog table and must be associated with the
secondary-access method that the index uses.
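For example, a composite index might pair the default B-tree operator class on
one column with a user-defined operator class on another. In the following
sketch, the table and column names are illustrative, and my_udt_ops is a
hypothetical operator class that is assumed to be registered for the B-tree
secondary-access method:
CREATE INDEX loc_ix ON locations (city btree_ops, region my_udt_ops);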
The operator class that you specify in the CREATE INDEX statement must already
be defined in the sysopclasses system catalog with the CREATE OPCLASS
statement. If the operator class has not yet been defined, the CREATE INDEX
statement fails. For information about how to create an operator class, see IBM
Informix User-Defined Routines and Data Types Developer's Guide.
Locks
A lock is a software mechanism that you can set to prevent others from using a
resource. You can place a lock on a single row or key, a page of data or index keys,
a whole table, or an entire database.
Additional types of locks are available for smart large objects. For more
information, see “Locks for smart large objects” on page 8-16.
Locking granularity
The level and type of information that the lock protects is called locking granularity.
Locking granularity affects performance.
When a user cannot access a row or key, the user can wait for another user to
unlock the row or key. If a user locks an entire page, a higher probability exists
that more users will wait for a row in the page.
The ability of more than one user to access a set of rows is called concurrency. The
goal of the database administrator is to increase concurrency to increase total
performance without sacrificing performance for an individual user.
For an operation that changes a large number of rows, consider “Page locks” on
page 8-2.
The default locking mode is page-locking. If you want row or key locks, you must
create the table with row locking on or alter the table.
The following example shows how to create a table with row locking on:
CREATE TABLE customer(customer_num serial, lname char(20)...)
LOCK MODE ROW;
The ALTER TABLE statement can also change the lock mode.
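For example, a statement of the following form changes an existing table to
row-level locking:
ALTER TABLE customer LOCK MODE (ROW);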
When the lock mode is ROW and you insert or update a row, the database server
creates a row lock. In some cases, you place a row lock by simply reading the row
with a SELECT statement.
Key-value locks
When a user deletes a row within a transaction, the row cannot be locked because
it does not exist. However, the database server must somehow record that a row
existed until the end of the transaction. The database server uses key-value locking
to lock the deleted row.
When the database server deletes a row, key values in the indexes for the table are
not removed immediately. Instead, each key value is marked as deleted, and a lock
is placed on the key value.
Other users might encounter key values that are marked as deleted. The database
server must determine whether a lock exists. If a lock exists, the delete has not
been committed, and the database server sends a lock error back to the application
(or it waits for the lock to be released if the user executed SET LOCK MODE TO
WAIT).
One of the most important uses for key-value locking is to assure that a unique
key remains unique through the end of the transaction that deleted it. Without this
protection mechanism, user A might delete a unique key within a transaction, and
user B might insert a row with the same key before the transaction commits. This
scenario makes rollback by user A impossible. Key-value locking prevents user B
from inserting the row until the end of user A's transaction.
Page locks
Page locking is the default mode when you create a table without the LOCK
MODE clause. With page locking, instead of locking only the row, the database
server locks the entire page that contains the row. If you update several rows on
the same page, the database server uses only one lock for the page.
When you insert or update a row, the database server creates a page lock on the
data page. In some cases, the database server creates a page lock when you simply
read the row with a SELECT statement.
When you insert, update, or delete a key (performed automatically when you
insert, update, or delete a row), the database server creates a lock on the page that
contains the key in the index.
Page locks are useful for tables in which the normal user changes a large number
of rows at one time. For example, an orders table that holds orders that are
commonly inserted and queried individually is not a good candidate for page
locking. But a table that holds old orders and is updated nightly with all of the
orders placed during the day might be a good candidate. In this case, the type of
isolation level that you use to access the table is important. For more information,
see “Isolation level” on page 8-5.
Another important distinction between these two types of table locks is the actual
number of locks placed:
v In shared mode, the database server places one shared lock on the table, which
informs other users that no updates can be performed. In addition, the database
server adds locks for every row updated, deleted, or inserted.
v In exclusive mode, the database server places only one exclusive lock on the
table, no matter how many rows it updates. If you update most of the rows in
the table, place an exclusive lock on the table.
Important: A table lock on a table can decrease update concurrency radically. Only
one update transaction can access that table at any given time, and that update
transaction locks out all other transactions. However, multiple read-only
transactions can simultaneously access the table. This behavior is useful in a data
warehouse environment where the data is loaded and then queried by multiple
users.
You can switch a table back and forth between table-level locking and the other
levels of locking. This ability to switch locking levels is useful when you use a
table in a data warehouse mode during certain time periods but not in others.
A transaction tells the database server to use table-level locking for a table with the
LOCK TABLE statement. The following example places an exclusive lock on the
table:
LOCK TABLE tab1 IN EXCLUSIVE MODE;
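A shared table lock uses the same statement form; in a database with
transactions, the database server releases the lock when the transaction ends:
LOCK TABLE tab1 IN SHARE MODE;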
In some cases, the database server places its own table locks. For example, if the
isolation level is Repeatable Read, and the database server must read a large
portion of the table, it places a table lock automatically instead of setting row or
page locks. The database server places a table lock on a table when it creates or
drops an index.
Database locks
You can place a lock on the entire database when you open the database with the
DATABASE statement. A database lock prevents read or update access by anyone
but the current user.
If you know that most of your applications might benefit from a lock mode of row,
you can take one of the following actions:
v Use the LOCK MODE ROW clause in each CREATE TABLE statement or ALTER
TABLE statement.
v Set the IFX_DEF_TABLE_LOCKMODE environment variable to ROW so that all
tables you subsequently create within a session use ROW without the need to
specify the lock mode in the CREATE TABLE statement or ALTER TABLE
statement.
v Set the DEF_TABLE_LOCKMODE configuration parameter to ROW so that all
tables subsequently created within the database server use ROW without the
need to specify the lock mode in the CREATE TABLE statement or ALTER
TABLE statement.
In addition, if you previously changed the lock mode of a table to ROW, and
subsequently execute an ALTER TABLE statement to alter some other characteristic
of the table (such as add a column or change the extent size), you do not need to
specify the lock mode. The lock mode remains at ROW and is not set to the default
PAGE mode.
You can still override the lock mode of individual tables by specifying the LOCK
MODE clause in the CREATE TABLE statement or ALTER TABLE statement.
The following list shows the order of precedence for the lock mode on a table:
v The system default is page locks. The database server uses this system default if
you do not set the configuration parameter, do not set the environment variable,
or do not specify the LOCK MODE clause in the SQL statements.
v If you set the DEF_TABLE_LOCKMODE configuration parameter, the database
server uses this value when you do not set the environment variable, or do not
specify the LOCK MODE clause in the SQL statements.
v If you set the IFX_DEF_TABLE_LOCKMODE environment variable, this value
overrides the DEF_TABLE_LOCKMODE configuration parameter and system
default. The database server uses this value when you do not specify the LOCK
MODE clause in the SQL statements.
v If you specify the LOCK MODE clause in the CREATE TABLE statement or
ALTER TABLE statement, this value overrides the IFX_DEF_TABLE_LOCKMODE
environment variable, the DEF_TABLE_LOCKMODE configuration parameter, and
the system default.
To suspend the current process until the lock releases, run the following SQL
statement:
SET LOCK MODE TO WAIT;
You can also specify the maximum number of seconds that a process waits for a
lock to be released before issuing an error. In the following example, the database
server waits for 20 seconds before issuing an error:
SET LOCK MODE TO WAIT 20;
To return to the default behavior (no waiting for locks), execute the following
statement:
SET LOCK MODE TO NOT WAIT;
Isolation level
The number and duration of locks placed on data during a SELECT statement
depend on the level of isolation that the user sets. The type of isolation can affect
overall performance because it affects concurrency.
Before you execute a SELECT statement, you can set the isolation level with the
SET ISOLATION statement, which is an Informix extension to the ANSI SQL-92
standard, or with the ANSI/ISO-compliant SET TRANSACTION statement. The main
differences between the two statements are that SET ISOLATION has an additional
isolation level, Cursor Stability, and that SET TRANSACTION cannot be executed
more than once in a transaction, as SET ISOLATION can. The SET ISOLATION
statement can change the enduring isolation level for the session.
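For example, either of the following statements selects a comparable isolation
level through the two interfaces:
SET ISOLATION TO COMMITTED READ;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;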
Use Dirty Read isolation with care if update activity occurs at the same time. With
Dirty Read, the reader can read a row that has not been committed to the database
and might be eliminated or changed during a rollback. For example, consider the
following scenario:
User 1 starts a transaction.
User 1 inserts row A.
User 2 reads row A.
User 1 rolls back row A.
Because the database server does not check or place any locks for queries, Dirty
Read isolation offers the best performance of all isolation levels. However, because
of potential problems with uncommitted data that is rolled back, use Dirty Read
isolation with care.
Because problems with uncommitted data that is rolled back are an issue only with
transactions, databases that do not have transaction logging (and hence do not
allow transactions) use Dirty Read as the default isolation level. In fact, Dirty
Read is the only isolation level allowed for databases that do not have
transaction logging.
The database server does not actually place any locks for rows read during
Committed Read. It simply checks for any existing rows in the internal lock table.
Committed Read is the default isolation level for databases with logging if the log
mode is not ANSI-compliant. For databases created with a logging mode that is
not ANSI-compliant, Committed Read is an appropriate isolation level for most
activities. For ANSI-compliant databases, Repeatable Read is the default isolation
level.
In the Committed Read isolation level, locks held by other sessions can cause SQL
operations to fail if the current session cannot acquire a lock or if the database
server detects a deadlock. (A deadlock occurs when two users hold locks, and each
user wants to acquire a lock that the other user owns.) The LAST COMMITTED
keyword option to the SET ISOLATION COMMITTED READ statement of SQL
reduces the risk of locking conflicts.
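For example, the following statement requests the last committed version of a
row instead of waiting for a conflicting lock or returning an error:
SET ISOLATION TO COMMITTED READ LAST COMMITTED;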
For databases created with transaction logging, you can set the
USELASTCOMMITTED configuration parameter to specify whether the database
server uses the last committed version of the data, rather than wait for the lock to
be released, when sessions using the Dirty Read or Committed Read isolation level
(or the ANSI/ISO level of Read Uncommitted or Read Committed) attempt to read
a row on which a concurrent session holds a shared lock. The last committed
version of the data is the version of the data that existed before any updates
occurred.
In the example for a cursor in Figure 8-1, at fetch a row the database server releases
the lock on the previous row and places a lock on the row being fetched. At close
the cursor, the server releases the lock on the last row.
If you do not use a cursor to fetch data, Cursor Stability isolation behaves in the
same way as Committed Read. No locks are actually placed.
The example in Figure 8-2 on page 8-8 shows when the database server places and
releases locks for a repeatable read. At fetch a row, the server places a lock on the
row being fetched and on every row it examines in order to retrieve this row. At
close the cursor, the server releases the lock on the last row.
Repeatable Read is useful during any processing in which multiple rows are
examined, but none must change during the transaction. For example, suppose an
application must check the account balance of three accounts that belong to one
person. The application gets the balance of the first account and then the second.
But, at the same time, another application begins a transaction that debits the third
account and credits the first account. By the time that the original application
obtains the account balance of the third account, it has been debited. However, the
original application did not record the debit of the first account.
When you use Committed Read or Cursor Stability, the previous scenario can
occur. However, it cannot occur with Repeatable Read. The original application
holds a read lock on each account that it examines until the end of the transaction,
so the attempt by the second application to change the first account fails (or waits,
depending upon SET LOCK MODE).
Because even examined rows are locked, if the database server reads the table
sequentially, a large number of rows unrelated to the query result can be locked.
For this reason, use Repeatable Read isolation for tables when the database server
can use an index to access a table. If an index exists and the optimizer chooses a
sequential scan instead, you can use directives to force use of the index. However,
forcing a change in the query path might negatively affect query performance.
Use one of the following methods to prevent concurrency problems when other
users are modifying a nonlogging table:
v Lock the table in exclusive mode for the whole transaction.
v Use Repeatable Read isolation level for the whole transaction.
Important: Nonlogging raw tables are intended for fast loading of data. You
should change the table to STANDARD before you use it in a transaction or
modify the data within it.
Update cursors
An update cursor is a special kind of cursor that applications can use when the
row might potentially be updated. Update cursors use promotable locks in which the
database server places an update lock on the row when the application fetches the
row. The lock is changed to an exclusive lock when the application uses an update
cursor and UPDATE...WHERE CURRENT OF to update the row.
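A minimal sketch of this sequence follows; the table and cursor names are
illustrative, and in ESQL/C each statement would be embedded with EXEC SQL:
DECLARE upd_curs CURSOR FOR
SELECT order_num, ship_date FROM orders FOR UPDATE;
-- after the application fetches a row through the cursor:
UPDATE orders SET ship_date = TODAY
WHERE CURRENT OF upd_curs;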
In some cases, the database server might place locks on rows that the database
server has examined but not actually fetched. Whether this behavior occurs
depends on how the database server executes the SQL statement.
The advantage of an update cursor is that you can view the row with the
confidence that other users cannot change it, or view it with another update
cursor, while you are viewing it and before you update it.
If you do not update the row, the default behavior of the database server is to
release the update lock when you execute the next FETCH statement or close the
cursor. However, if you execute the SET ISOLATION statement with the RETAIN
UPDATE LOCKS clause, the database server does not release any currently
existing or subsequently placed update locks until the end of the transaction.
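For example, the following statement keeps update locks until the end of the
transaction:
SET ISOLATION TO COMMITTED READ RETAIN UPDATE LOCKS;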
The code in Figure 8-3 shows when the database server places and releases update
locks with a cursor. At fetch row 1, the database server places an update lock on
row 1. At fetch row 2, the server releases the update lock on row 1 and places an
update lock on row 2. However, after the database server executes the SET
ISOLATION statement with the RETAIN UPDATE LOCKS clause, it does not
release any update locks until the end of the transaction. At fetch row 3, it places an
update lock on row 3. At fetch row 4, it places an update lock on row 4. At commit
work, the server releases the update locks for rows 2, 3, and 4.
The code in Figure 8-4 on page 8-10 shows the database server promoting an
update lock to an exclusive lock. At fetch the row, the server places an update lock
on the row being fetched. At update the row, the server promotes the lock to
exclusive. At commit work, it releases the lock.
In addition, no other users can view the row unless they are using the Dirty Read
isolation level.
When the database server removes the exclusive lock depends on whether the
database supports transaction logging:
v If the database supports logging, the database server removes all exclusive locks
when the transaction completes (commits or rolls back).
v If the database does not support logging, the database server removes all
exclusive locks immediately after the INSERT, MERGE, UPDATE, or DELETE
statement completes, except when the lock is on the row that is currently being
fetched into an update cursor.
In this situation, the lock is retained during the fetch operation on the row, but
only until the server fetches the next row, or until the server updates the current
row by promoting the lock to an exclusive lock.
In a nonlogging database, the promotable update lock on a row fetched for update
can be released by a DDL operation on the database while the INSERT, MERGE,
UPDATE, or DELETE statement that originally created the lock is still running. To
reduce the risk of data corruption if a concurrent session modifies the unlocked
row, restrict operations that use promotable update locks to databases that support
transaction logging.
The following table shows the types of locks that the lock table can contain.
Lock type Description
S Shared lock
U Update lock
X Exclusive lock
B Byte lock
In addition, the lock table might store intent locks, with the same lock type as
previously shown. In some cases, a user might need to register his or her possible
intent to lock an item, so that other users cannot place a lock on the item.
Depending on the type of operation and the isolation level, the database server
might continue to read the row and place its own lock on the row, or it might wait
for the lock to be released (if the user executed SET LOCK MODE TO WAIT). The
following table shows the locks that a user can place if another user holds a certain
type of lock. For example, if one user holds an exclusive lock on an item, another
user requesting any kind of lock (exclusive, update, or shared) receives an error.
Monitoring locks
You can analyze information about locks and monitor locks by viewing
information in the internal lock table that contains stored locks.
View the lock table with onstat -k. Figure 8-5 shows sample output for onstat -k.
Locks
address wtlist owner lklist type tblsnum rowid key#/bsiz
300b77d0 0 40074140 0 HDR+S 10002 106 0
300b7828 0 40074140 300b77d0 HDR+S 10197 123 0
300b7854 0 40074140 300b7828 HDR+IX 101e4 0 0
300b78d8 0 40074140 300b7854 HDR+X 101e4 102 0
4 active, 5000 total, 8192 hash buckets
In this example, a user is inserting one row in a table. The user holds the following
locks (described in the order shown):
v A shared lock on the database
v A shared lock on a row in the systables system catalog table
v An intent-exclusive lock on the table
v An exclusive lock on the row
To determine the table to which the lock applies, execute the following SQL
statement. For tblsnum, substitute the value shown in the tblsnum field in the
onstat -k output.
SELECT *
FROM SYSTABLES
WHERE HEX(PARTNUM) = "tblsnum";
You can also query the syslocks table in the sysmaster database to obtain
information about each active lock. The syslocks table contains the following
columns.
Column Description
dbsname Database on which the lock is held
tabname Name of the table on which the lock is held
rowidlk ID of the row on which the lock is held (0
indicates a table lock.)
keynum The key number for the row
type Type of lock
owner Session ID of the lock owner
waiter Session ID of the first waiter on the lock
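For example, a query of the following form (with the sysmaster: qualifier, or
run while connected to the sysmaster database) lists the current locks:
SELECT dbsname, tabname, rowidlk, keynum, type, owner, waiter
FROM sysmaster:syslocks;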
For information about how to determine an initial value for the LOCKS
configuration parameter, see “The LOCKS configuration parameter and memory
utilization” on page 4-15.
If the number of locks needed by sessions exceeds the value set in the LOCKS
configuration parameter, the database server attempts to increase the lock table by
doubling its size. Each time that the lock table overflows (when the number of
locks needed is greater than the current size of the lock table), the database server
increases the size of the lock table, up to 99 times. Each time that the database
server increases the size of the lock table, the server attempts to double its size.
However, the server will limit each actual increase to no more than the maximum
number of added locks shown in Table 8-1. After the 99th time that the database
server increases the lock table, the server no longer increases the size of the lock
table, and an application needing a lock receives an error.
Every time the database server increases the size of the lock table, the server places
a message in the message log file. You should monitor the message log file
periodically and increase the size of the LOCKS configuration parameter if you see
that the database server has increased the size of the lock table.
To monitor the number of times that applications receive the out-of-locks error,
view the ovlock field in the output of onstat -p. You can also see similar
information from the sysprofile table in the sysmaster database. The following
rows contain the relevant statistics.
Row Description
ovlock Number of times that sessions attempted to
exceed the maximum number of locks
lockreqs Number of times that sessions requested a
lock
lockwts Number of times that sessions waited for a
lock
If the database server is using an unusually large number of locks, you can
examine how individual applications are using locks, as follows:
1. Monitor sessions with onstat -u to see if a particular user is using an especially
high number of locks (a high value in the locks column).
2. If a particular user uses a large number of locks, examine the SQL statements in
the application to determine whether you should lock the table or use
individual row or page locks.
A table lock is more efficient than individual row locks, but it reduces concurrency.
One way to reduce the number of locks placed on a table is to alter a table to use
page locks instead of row locks. However, page locks reduce overall concurrency
for the table, which can affect performance.
You can also reduce the number of locks placed on a table by locking the table in
exclusive mode.
Related concepts:
“The LOCKS configuration parameter and memory utilization” on page 4-15
If the application executes SET LOCK MODE TO WAIT, the database server waits
for a lock to be released instead of returning an error. An unusually long wait for a
lock can give users the impression that the application is hanging.
In Figure 8-6 on page 8-14, the onstat -u output shows that session ID 84 is waiting
for a lock (L in the first column of the Flags field). The following onstat -u and
onstat -k output shows this scenario:
onstat -u
Userthreads
address flags sessid user tty wait tout locks nreads nwrites
40072010 ---P--D 7 informix - 0 0 0 35 75
400723c0 ---P--- 0 informix - 0 0 0 0 0
40072770 ---P--- 1 informix - 0 0 0 0 0
40072b20 ---P--- 2 informix - 0 0 0 0 0
40072ed0 ---P--F 0 informix - 0 0 0 0 0
40073280 ---P--B 8 informix - 0 0 0 0 0
40073630 ---P--- 9 informix - 0 0 0 0 0
400739e0 ---P--D 0 informix - 0 0 0 0 0
40073d90 ---P--- 0 informix - 0 0 0 0 0
40074140 Y-BP--- 81 lsuto 4 50205788 0 4 106 221
400744f0 --BP--- 83 jsmit - 0 0 4 0 0
400753b0 ---P--- 86 worth - 0 0 2 0 0
40075760 L--PR-- 84 jones 3 300b78d8 -1 2 0 0
13 active, 128 total, 16 maximum concurrent
onstat -k
Locks
address wtlist owner lklist type tblsnum rowid key#/bsiz
300b77d0 0 40074140 0 HDR+S 10002 106 0
300b7828 0 40074140 300b77d0 HDR+S 10197 122 0
300b7854 0 40074140 300b7828 HDR+IX 101e4 0 0
300b78d8 40075760 40074140 300b7854 HDR+X 101e4 100 0
300b7904 0 40075760 0 S 10002 106 0
300b7930 0 40075760 300b7904 S 10197 122 0
6 active, 5000 total, 8192 hash buckets
To find out the owner of the lock for which session ID 84 is waiting:
1. Obtain the address of the lock in the wait field (300b78d8) of the onstat -u
output.
2. Find this address (300b78d8) in the Locks address field of the onstat -k output.
The owner field of this row in the onstat -k output contains the address of the
user thread (40074140).
3. Find this address (40074140) in the Userthreads field of the onstat -u output.
The sessid field of this row in the onstat -u output contains the session ID (81)
that owns the lock.
To eliminate the contention problem, you can have the user exit the application
gracefully. If this solution is not possible, you can stop the application process or
remove the session with onmode -z.
For example, user pradeep holds a lock on row 10. User jane holds a lock on row
20. Suppose that jane wants to place a lock on row 10, and pradeep wants to place
a lock on row 20. If both users execute SET LOCK MODE TO WAIT, they
potentially might wait for each other forever.
Informix uses the lock table to detect deadlocks automatically and stop them
before they occur. Before a lock is granted, the database server examines the lock
list for each user. If a user holds a lock on the resource that the requestor wants to
lock, the database server traverses the lock wait list for the user to see if the user is
waiting for any locks that the requestor holds. If so, the requestor receives a
deadlock error.
Deadlock errors can be unavoidable when applications update the same rows
frequently. However, certain applications might always be in contention with each
other. Examine applications that are producing a large number of deadlocks and
try to run them at different times.
To monitor the number of deadlocks, use the deadlks field in the output of onstat
-p.
In a distributed transaction, the database server does not examine lock tables from
other database server systems, so deadlocks cannot be detected before they occur.
Instead, you can set the DEADLOCK_TIMEOUT configuration parameter.
DEADLOCK_TIMEOUT specifies the number of seconds that the database server
waits for a remote database server response before it returns an error. Although
reasons other than a distributed deadlock might cause the delay, this mechanism
keeps a transaction from hanging indefinitely.
To monitor the number of distributed deadlock timeouts, use the dltouts field in
the onstat -p output.
The following table summarizes the values in the IsoLvl column in onstat -g ses
and onstat -g sql output.
Value Description
DR Dirty Read
CR Committed Read
CS Cursor Stability
CRU Committed Read with RETAIN UPDATE LOCKS
CSU Cursor Stability with RETAIN UPDATE LOCKS
DRU Dirty Read with RETAIN UPDATE LOCKS
LC Committed Read, Last Committed
RR Repeatable Read
The database server uses one of the following granularity levels for locking smart
large objects:
v The sbspace chunk header partition
v The smart large object
v A byte range of the smart large object
The default locking granularity is at the level of the smart large object. In other
words, when you update a smart large object, by default the database server locks
the smart large object that is being updated.
Locks on the sbspace chunk header partition only occur when the database server
promotes locks on smart large objects. For more information, see “Lock promotion”
on page 8-19.
Byte-range locking
Rather than locking the entire smart large object, you can lock only a specific byte
range of a smart large object.
Figure 8-7 on page 8-17 shows two locks placed on a single smart large object. The
first lock is on bytes 2, 3, and 4. The second lock is on byte 6 alone.
If you place a second lock on a byte range adjacent to a byte range that is currently
locked, the database server consolidates the two locks into one lock on the entire
range.
If a user holds the locks that Figure 8-7 shows, and the user requests a lock on
byte five, the database server consolidates the locks placed on bytes two through
six into one lock.
Likewise, if a user unlocks only a portion of the bytes included within a byte-range
lock, the database server might split the lock into multiple byte-range locks. In
Figure 8-7, the user could unlock byte three, which causes the database server to
change the one lock on bytes two through four to one lock on byte two and one
lock on byte four.
To use byte-range locks, you must perform one of the following actions:
v To set byte-range locking for the sbspace that stores the smart large object, use
the onspaces utility. The following example sets byte-range locking for the new
sbspace:
onspaces -c -S slo -g 2 -p /ix/9.2/liz/slo -o 0 -s 1000
-Df LOCK_MODE=RANGE
When you set the default locking mode for the sbspace to byte-range locking,
the database server locks only the necessary bytes when it updates any smart
large objects stored in the sbspace.
v To set byte-range locking for the smart large object when you open it, use one of
the following methods:
– In DB-Access: Set the MI_LO_LOCKRANGE flag in the mi_lo_open()
DataBlade API function.
– In ESQL/C: Set the LO_LOCKRANGE flag in the ifx_lo_open() Informix
ESQL/C function. When you set byte-range locking for the individual smart
large object, the database server implicitly locks only the necessary bytes
when it selects or updates the smart large object.
Byte-Range Locks
rowid/LOid tblsnum address status owner offset size type
104 200004 a020e90 HDR
[2, 2, 3] a020ee4 HOLD a1b46d0 50 10 S
202 200004 a021034 HDR
[2, 2, 5] a021088 HOLD a1b51e0 40 5 S
102 200004 a035608 HDR
[2, 2, 1] a0358fc HOLD a1b4148 0 500 S
a035758 HOLD a1b3638 300 100 S
21 active, 2000 total, 2048 hash buckets
Column Description
rowid The rowid of the row that contains the
locked smart large object
LOid The three values: sbspace number, chunk
number, and sequence number (a value that
represents the position in the chunk)
tblsnum The number of the tblspace that holds the
smart large object
address The address of the lock
status The status of the lock
Be sure to monitor the number of locks used with onstat -k, so you can determine
if you need to increase the value of the LOCKS configuration parameter.
Lock promotion
The database server uses lock promotion to decrease the total number of locks held
on smart large objects. Too many locks can result in poorer performance because
the database server frequently searches the lock table to determine if a lock exists
on an object.
If the number of locks that a user holds on a smart large object (not on byte ranges
of a smart large object) equals or exceeds 10 percent of the current capacity of the
lock table, the database server attempts to promote all of the smart-large-object
locks to one lock on the smart-large-object header partition. This kind of lock
promotion improves performance for applications that are updating, loading, or
deleting a large number of smart large objects. For example, a transaction that
deletes millions of smart large objects would consume the entire lock table if the
database server did not use lock promotion. The lock promotion algorithm has
deadlock avoidance built in.
Even if the database server attempts to promote a lock, it might not be able to do
so. For example, the database server might not be able to promote byte-range locks
to one smart-large-object lock because other users have byte-range locks on the
same smart large object. If the database server cannot promote a byte-range lock, it
does not change the lock, and processing continues as normal.
For information about how Dirty Reads affects consistency, see “Dirty Read
isolation” on page 8-5.
Set the Dirty Read isolation level for smart large objects in one of the following
ways:
v Use the SET TRANSACTION MODE or SET ISOLATION statement.
v Use the LO_DIRTY_READ flag in one of the following functions:
– For DB-Access: mi_lo_open()
– For ESQL/C: ifx_lo_open()
If consistency for smart large objects is not important, but consistency for other
columns in the row is important, you can set the isolation level to Committed
Read, Cursor Stability, or Repeatable Read and open the smart large object with the
LO_DIRTY_READ flag.
The database server supports table fragmentation (also known as partitioning),
which allows you to store data from a single table on multiple disk devices.
enables you to define groups of rows or index keys within a table according to
some algorithm or scheme. You can store each group or fragment (also referred to
as a partition) in a separate dbspace associated with a specific physical disk.
For information about fragmentation and parallel execution, see Chapter 12,
“Parallel database query (PDQ),” on page 12-1.
For an introduction to fragmentation concepts and methods, see the IBM Informix
Database Design and Implementation Guide. For information about the SQL statements
that manage fragments, see the IBM Informix Guide to SQL: Syntax.
When you plan a fragmentation strategy, be aware of these space and page issues:
v Although a 4-terabyte chunk can be on a 2-kilobyte page, only 32 gigabytes can
be utilized in a dbspace because of a rowid format limitation.
v For a fragmented table, all fragments must use the same page size.
v For a fragmented index, all fragments must use the same page size.
v A table can be in one dbspace and the index for that table can be in another
dbspace. These dbspaces do not need to have the same page size.
Fragmentation goals
You can analyze your application and workload to identify fragmentation goals
and to determine the balance to strike among fragmentation goals.
If queries access data by performing an index read, you can improve performance
by using the same distribution scheme for the index and the table.
For more information about improving performance for queries, see “Query
expressions for fragment elimination” on page 9-15 and Chapter 13, “Improving
individual query performance,” on page 13-1.
For tables subjected to this type of load, fragment both the index keys and data
rows with a distribution scheme that allows each query to eliminate unneeded
fragments from its scan. Use an expression-based distribution scheme. For more
information, see “Distribution schemes that eliminate fragments” on page 9-14.
Your success in reducing contention depends on how much you know about the
distribution of data in the table and the scheduling of queries against the table. For
example, if the distribution of queries against the table is set up so that all rows
are accessed at roughly the same rate, try to distribute rows evenly across the
fragments. However, if certain values are accessed at a higher rate than others, you
can compensate for this difference by distributing the rows over the fragments to
balance the access rate. For more information, see “Designing an expression-based
distribution scheme” on page 9-8.
This availability has important implications for the following types of applications:
v Applications that do not require access to unavailable fragments
A query that does not require the database server to access data in an
unavailable fragment can still successfully retrieve data from fragments that are
available. For example, if the distribution expression uses a single column, the
database server can determine if a row is contained in a fragment without
accessing the fragment. If the query accesses only rows that are contained in
available fragments, a query can succeed even when some of the data in the
table is unavailable. For more information, see “Designing an expression-based
distribution scheme” on page 9-8.
For more information about backup and restore, see your IBM Informix Backup and
Restore Guide.
For details about placement issues that apply to tables, see Chapter 6, “Table
performance considerations,” on page 6-1.
Decision-support queries usually create and access large temporary files; placement
of temporary dbspaces is a critical factor for performance. For more information
about placement of temporary files, see “Spreading temporary tables and sort files
across multiple disks” on page 6-4.
The distribution scheme that you choose depends on the following factors:
v The features in Table 9-1 of which you want to take advantage
v Whether or not your queries tend to scan the entire table
v Whether or not you know the distribution of data to be added
v Whether or not your applications tend to delete many rows
Basically, the round-robin scheme provides the easiest and surest way of balancing
data. However, with round-robin distribution, you have no information about the
fragment in which a row is located, and the database server cannot eliminate
fragments.
In general, round-robin is the correct choice only when all the following conditions
apply:
v Your queries tend to scan the entire table.
v You do not know the distribution of data to be added.
v Your applications tend not to delete many rows. (If they do, load balancing can
be degraded.)
An expression-based scheme might be the best choice to fragment the data if any
of the following conditions apply:
v Your application calls for numerous decision-support queries that scan specific
portions of the table.
v You know what the data distribution is.
v You plan to cycle data through a database.
If you plan to add and delete large amounts of data periodically, based on the
value of a column such as date, you can use that column in the distribution
scheme. You can then use the ALTER FRAGMENT ATTACH and ALTER FRAGMENT DETACH
statements to cycle the data through the table.
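A sketch of this cycling pattern follows, under the assumption that the sales
table is fragmented by an expression on its order_date column; the jan_sales and
old_sales table names and the dbsp1 dbspace are illustrative:
ALTER FRAGMENT ON TABLE sales
ATTACH jan_sales AS (order_date >= '01/01/2013' AND order_date < '02/01/2013');
ALTER FRAGMENT ON TABLE sales
DETACH dbsp1 old_sales;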
The ALTER FRAGMENT ATTACH and DETACH statements provide the following
advantages over bulk loads and deletes:
v The rest of the table fragments are available for other users to access. Only the
fragment that you attach or detach is not available to other users.
v With the performance enhancements, the execution of an ALTER FRAGMENT
ATTACH or DETACH statement is much faster than a bulk load or mass delete.
For more information, see “Improve the performance of operations that attach and
detach fragments” on page 9-19.
To obtain this information, run the UPDATE STATISTICS statement for the table
and then use the dbschema utility to examine the distribution.
After you know the data distribution, you can design a fragmentation rule that
distributes data across fragments as required to meet your fragmentation goal. If
your primary goal is to improve performance, your fragment expression should
generate an even distribution of rows across fragments.
Try not to use columns that are subject to frequent updates in the distribution
expression. Such updates can cause rows to move from one fragment to another
(that is, be deleted from one and added to another), and this activity increases
CPU and I/O overhead.
The following suggestions are guidelines for fragmenting tables and indexes:
v For optimal performance in decision-support queries, fragment the table to
increase parallelism, but do not fragment the indexes. Detach the indexes, and
place them in a separate dbspace.
v For best performance in OLTP queries, use fragmented indexes to reduce
contention between sessions. You can often fragment an index by its key value,
which means the OLTP query only has to look at one fragment to find the
location of the row.
If the key value does not reduce contention, as when every user looks at the
same set of key values (for instance, a date range), consider fragmenting the
index on another value used in the WHERE clause. To cut down on fragment
administration, consider not fragmenting some indexes, especially if you cannot
find a good fragmentation expression to reduce contention.
v Use round-robin fragmentation on data when the table is read sequentially by
decision-support queries. Round-robin fragmentation is a good method for
spreading data evenly across disks when no column in the table can be used for
an expression-based fragmentation scheme. However, in most DSS queries, all
fragments are read.
v To reduce the total number of required dbspaces and decrease the time needed
for searches, you can store multiple named fragments within the same dbspace.
v If you are using expressions, create them so that I/O requests, rather than
quantities of data, are balanced across disks. For example, if the majority of your
queries access only a portion of data in the table, set up your fragmentation
expression to spread active portions of the table across disks, even if this
arrangement results in an uneven distribution of rows.
v Keep fragmentation expressions simple. Fragmentation expressions can be as
complex as you want. However, complex expressions take more time to evaluate
and might prevent fragments from being eliminated from queries.
v Arrange fragmentation expressions so that the most restrictive condition for each
dbspace is tested within the expression first. When the database server tests a
value against the criteria for a given fragment, evaluation stops as soon as the
value fails a condition.
Each index of a fragmented table occupies its own tblspace with its own extents.
You can fragment the index with either of the following strategies:
v Same fragmentation strategy as the table
v Different fragmentation strategy from the table
Attached indexes
An attached index is an index that implicitly follows the table fragmentation
strategy (distribution scheme and set of dbspaces in which the fragments are
located). The database server automatically creates an attached index when you
first fragment a table.
To create an attached index with partitions, include the partition name in your SQL
statements, as shown in this example:
CREATE TABLE tb1(a int)
FRAGMENT BY EXPRESSION
PARTITION part1 (a >=0 AND a < 5) IN dbs1,
PARTITION part2 (a >=5 AND a < 10) IN dbs1
...
;
Use ALTER FRAGMENT syntax to change fragmented indexes that do not have
partitions into indexes that have partitions. The syntax below shows how you
might convert a fragmented index into an index that contains partitions:
CREATE TABLE t1 (c1 int) FRAGMENT BY EXPRESSION
(c1=10) IN dbs1, (c1=20) IN dbs2, (c1=30) IN dbs3
CREATE INDEX ind1 ON t1 (c1) FRAGMENT BY EXPRESSION
(c1=10) IN dbs1, (c1=20) IN dbs2, (c1=30) IN dbs3
ALTER FRAGMENT ON INDEX ind1 INIT FRAGMENT BY EXPRESSION
PARTITION part_1 (c1=10) IN dbs1, PARTITION part_2 (c1=20) IN dbs1,
PARTITION part_3 (c1=30) IN dbs1;
The database server fragments the attached index according to the same
distribution scheme as the table by using the same rule for index keys as for table
data. As a result, attached indexes have the following physical characteristics:
v The number of index fragments is the same as the number of data fragments.
v Each attached index fragment resides in the same dbspace as the corresponding
table data, but in a separate tblspace.
v An attached index or an index on a nonfragmented table uses 4 bytes for the
row pointer for each index entry. For more information about how to estimate
space for an index, see “Estimating index pages” on page 7-4.
Detached indexes
A detached index is an index with a separate fragmentation strategy that you set up
explicitly with the CREATE INDEX statement.
By default, all new indexes that the CREATE INDEX statement creates are
detached and stored in separate tblspaces from the data unless the deprecated
IN TABLE syntax is specified.
To create a detached index with partitions, include the partition name in your SQL
statements, as shown in this example:
CREATE TABLE tb1 (a int)
FRAGMENT BY EXPRESSION
PARTITION part1 (a <= 10) IN dbs1,
PARTITION part2 (a <= 20) IN dbs2,
PARTITION part3 (a <= 30) IN dbs3;
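The detached index itself then declares its own fragmentation strategy. For
example (the index name and partition layout are illustrative):
CREATE INDEX idx1 ON tb1 (a)
FRAGMENT BY EXPRESSION
PARTITION part1 (a <= 10) IN dbs1,
PARTITION part2 (a <= 20) IN dbs2,
PARTITION part3 (a <= 30) IN dbs3;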
You can use the PARTITION BY EXPRESSION keywords instead of the FRAGMENT BY
EXPRESSION keywords in the CREATE TABLE, CREATE INDEX, and ALTER
FRAGMENT ON INDEX statements.
If you do not want to fragment the index, you can put the entire index in a
separate dbspace.
You can fragment the index for any table by expression. However, you cannot
explicitly create a round-robin fragmentation scheme for an index. Whenever you
fragment a table using a round-robin fragmentation scheme, convert all indexes
that accompany the table to detached indexes for the best performance.
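For example, the following statements create a round-robin table and a single
detached index that resides in its own dbspace (all names are illustrative):
CREATE TABLE hist_orders (order_num INT, order_date DATE)
FRAGMENT BY ROUND ROBIN IN dbs1, dbs2;
CREATE INDEX hist_ix ON hist_orders (order_num) IN dbs3;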
Forest of trees indexes are detached indexes. They cannot be attached indexes.
The database server stores the location of each table and index fragment, along
with other related information, in the sysfragments system catalog table. You can
view the sysfragments system catalog table to access information about
fragmented tables and indexes, including the following:
v The value in the partn column is the partition number or fragment id of the
table or index fragment. The partition number for a detached index is different
from the partition number of the corresponding table fragment.
v The value in the strategy column is the distribution scheme used in the
fragmentation strategy.
For a complete description of column values that the sysfragments system catalog
table contains, see the IBM Informix Guide to SQL: Reference. For information about
how to use sysfragments to monitor your fragments, see “Monitoring fragment
use” on page 9-28.
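For example, a query such as the following hedged sketch reports the fragments of
a given table (see the IBM Informix Guide to SQL: Reference for the full list of
sysfragments columns):
SELECT t.tabname, f.fragtype, f.strategy, f.partn, f.dbspace
FROM systables t, sysfragments f
WHERE t.tabid = f.tabid
AND t.tabname = 'tb1';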
You can create a temporary, fragmented table with the TEMP TABLE clause of the
CREATE TABLE statement. However, you cannot alter the fragmentation strategy of
a fragmented temporary table as you can for a permanent table.
You can define your own fragmentation strategy for an explicit temporary table, or
you can let the database server dynamically determine the fragmentation strategy.
For more information about explicit and implicit temporary tables, see your IBM
Informix Administrator's Guide.
Fragment elimination improves both response time for a given query and
concurrency between queries. Because the database server does not need to read in
unnecessary fragments, I/O for a query is reduced. Activity in the LRU queues is
also reduced.
If you use an appropriate distribution scheme, the database server can eliminate
fragments from the following database operations:
v The fetch portion of SELECT, INSERT, DELETE, or UPDATE statements in SQL
The database server can eliminate fragments when these SQL statements are
optimized, before the actual search.
v Nested-loop joins
When the database server obtains the key value from the outer table, it can
eliminate fragments to search on the inner table.
Whether the database server can eliminate fragments from a search depends on
two factors:
v The distribution scheme in the fragmentation strategy of the table that is being
searched
v The form of the query expression (the expression in the WHERE clause of a
SELECT, INSERT, DELETE, or UPDATE statement)
When the fragmentation strategy is defined with any of the following operators,
fragment elimination can occur for a query on the table.
IN
=
<
>
<=
>=
AND
OR
NOT
IS NULL (only when not combined with other expressions using AND or OR operators)
The database server considers two types of simple expressions for fragment
elimination, based on the operator:
v Range expressions
v Equality expressions
If the range expression contains MATCH or LIKE, the database server can also
eliminate fragments if the string does not begin with a wildcard character. The
following examples show query expressions that can take advantage of fragment
elimination:
columna MATCH "ab*"
columna LIKE "ab%" OR columnb LIKE "ab%"
The database server can also eliminate fragments when these equality expressions
are combined with the following operators:
AND, OR
These distribution schemes enable fragment elimination, but the
effectiveness of fragment elimination is determined by the WHERE clause of the
specified query.
For example, depending on the fragmentation expression, the database server can
eliminate the fragment in dbspace dbsp3 if the WHERE clause has the following
expression:
column_b = -50
Furthermore, the database server can eliminate the two fragments in dbspaces
dbsp2 and dbsp3 if the WHERE clause has the following expression:
column_a = 5 AND column_b = -50
The advantage of this type of distribution scheme is that the database server can
eliminate fragments for queries with range expressions as well as queries with
equality expressions. You should meet these conditions when you design your
fragmentation rule. Figure 9-1 gives an example of this type of fragmentation rule.
...
FRAGMENT BY EXPRESSION
a<=8 OR a IN (9,10) IN dbsp1,
10<a AND a<=20 IN dbsp2,
a IN (21,22,23) IN dbsp3,
a>23 IN dbsp4;
You can create nonoverlapping fragments using a range rule or an arbitrary rule
based on a single column. You can use relational operators, as well as AND, IN,
OR, or BETWEEN. Be careful when you use the BETWEEN operator. When the
database server parses the BETWEEN keyword, it includes the end points that you
specify in the range of values. Avoid using a REMAINDER clause in your
expression. If you use a REMAINDER clause, the database server cannot always
eliminate the remainder fragment.
The only restriction for this category of fragmentation rule is that you base the
fragmentation rule on a single column.
Figure 9-2 on page 9-18 shows an example of this type of fragmentation rule.
If you use this type of distribution scheme, the database server can eliminate
fragments on an equality search but not a range search. This distribution scheme
can still be useful because all INSERT and many UPDATE operations perform
equality searches.
...
FRAGMENT BY EXPRESSION
0<a AND a<=10 AND b IN ('E', 'F', 'G') IN dbsp1,
0<a AND a<=10 AND b IN ('H', 'I', 'J') IN dbsp2,
10<a AND a<=20 AND b IN ('E', 'F', 'G') IN dbsp3,
10<a AND a<=20 AND b IN ('H', 'I', 'J') IN dbsp4,
20<a AND a<=30 AND b IN ('E', 'F', 'G') IN dbsp5,
20<a AND a<=30 AND b IN ('H', 'I', 'J') IN dbsp6;
(Figure: the fragments are laid out by ranges of column a (0 < a <= 10,
10 < a <= 20, 20 < a <= 30) crossed with the value of column b, either
('E', 'F', 'G') or ('H', 'I', 'J').)
If you use this type of distribution scheme, the database server can eliminate
fragments on an equality search but not a range search. This capability can still be
useful because all INSERT operations and many UPDATE operations perform
equality searches. Avoid using a REMAINDER clause in the expression. If you use
a REMAINDER clause, the database server cannot always eliminate the remainder
fragment.
To take advantage of the performance optimization, you must meet all of the
following requirements:
v Formulate appropriate distribution schemes for your table and index fragments.
v Ensure that no data movement occurs between the resultant partitions due to
fragment expressions.
v Update statistics for all the participating tables.
v Make the indexes on the attached tables unique if the index on the surviving
table is unique.
You fragment an index in the same way as the table when you create an index
without specifying a fragmentation strategy.
Suppose you create a fragmented table and index with the following SQL
statements:
CREATE TABLE tb1(a int)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2;
CREATE INDEX idx1 ON tb1(a);
Suppose you then create another table that is not fragmented, and you
subsequently decide to attach it to the fragmented table:
CREATE TABLE tb2 (a int, CHECK (a >=10 AND a<15))
IN db3;
CREATE INDEX idx2 ON tb2(a)
IN db3;
ALTER FRAGMENT ON TABLE tb1
ATTACH tb2 AS (a >= 10 AND a < 15);
This attach operation can take advantage of the existing index idx2 if no data
movement occurs between the existing and the new table fragments. If no data
movement occurs:
v The database server reuses index idx2 and converts it to a fragment of index
idx1.
v The index idx1 remains as an index with the same fragmentation strategy as the
table tb1.
If the database server discovers that one or more rows in the table tb2 belong to
preexisting fragments of the table tb1, the database server:
v Drops and rebuilds the index idx1 to include the rows that were originally in
tables tb1 and tb2
v Drops the index idx2
For more information about how to ensure no data movement between the existing
and the new table fragments, see “Ensuring no data movement when you attach a
fragment” on page 9-23.
Fragmenting the index with the same distribution scheme as the table:
You fragment an index with the same distribution scheme as the table when you
create an index that uses the same fragment expressions as the table.
The database server determines if the fragment expressions are identical, based on
the equivalency of the expression tree instead of the algebraic equivalence. For
example, consider the following two expressions:
(col1 >= 5)
(col1 = 5 OR col1 > 5)
Although these two expressions are algebraically equivalent, they are not identical
expressions.
Example of Fragmenting the Index with the Same Distribution Scheme as the
Table
Suppose you create two fragmented tables and indexes with the following SQL
statements:
CREATE TABLE tb1 (a int)
FRAGMENT BY EXPRESSION
(a <= 10) IN tabdbspc1,
(a <= 20) IN tabdbspc2,
(a <= 30) IN tabdbspc3;
CREATE INDEX idx1 ON tb1 (a)
FRAGMENT BY EXPRESSION
(a <= 10) IN idxdbspc1,
(a <= 20) IN idxdbspc2,
(a <= 30) IN idxdbspc3;
CREATE TABLE tb2 (a int, CHECK (a > 30 AND a <= 40))
IN tabdbspc4;
CREATE INDEX idx2 ON tb2(a)
IN idxdbspc4;
Suppose you then attach table tb2 to table tb1 with the following sample SQL
statement:
ALTER FRAGMENT ON TABLE tb1
ATTACH tb2 AS (a <= 40);
The database server can eliminate the rebuild of index idx1 for this attach
operation for the following reasons:
v The fragmentation expression for index idx1 is identical to the fragmentation
expression for table tb1. The database server:
– Expands the fragmentation of the index idx1 to the dbspace idxdbspc4
– Converts index idx2 to a fragment of index idx1
v No rows move from one fragment to another because the CHECK constraint is
identical to the resulting fragmentation expression of the attached table.
For more information about how to ensure no data movement between the
existing and the new table fragments, see “Ensuring no data movement when
you attach a fragment” on page 9-23.
You can take advantage of the performance benefits of the ALTER FRAGMENT
ATTACH operation when you combine two unfragmented tables into one
fragmented table.
For example, suppose you create two unfragmented tables and indexes with the
following SQL statements:
CREATE TABLE tb1(a int) IN db1;
CREATE INDEX idx1 ON tb1(a) IN db1;
CREATE TABLE tb2(a int) IN db2;
CREATE INDEX idx2 ON tb2(a) IN db2;
You might want to combine these two unfragmented tables with the following
sample distribution scheme:
ALTER FRAGMENT ON TABLE tb1
ATTACH
tb1 AS (a <= 100),
tb2 AS (a > 100);
If no data migrates between the fragments of tb1 and tb2, the database server
redefines index idx1 with the following fragmentation strategy:
CREATE INDEX idx1 ON tb1(a)
FRAGMENT BY EXPRESSION
a <= 100 IN db1,
a > 100 IN db2;
Important: This behavior results in a different fragmentation strategy for the index
than in versions of the database server prior to Version 7.3 and Version 9.2. In
earlier versions, the ALTER FRAGMENT ATTACH statement creates an
unfragmented detached index in the dbspace db1.
For example, you might create a fragmented table and index with the following
SQL statements:
CREATE TABLE tb1(a int)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2;
Suppose you create another table that is not fragmented, and you subsequently
decide to attach it to the fragmented table:
CREATE TABLE tb2 (a int, CHECK (a >=10 AND a<15))
IN db3;
CREATE INDEX idx2 ON tb2(a)
IN db3;
ALTER FRAGMENT ON TABLE tb1
ATTACH tb2 AS (a >= 10 AND a < 15);
This ALTER FRAGMENT ATTACH operation takes advantage of the existing index
idx2 because the following steps were performed in the example to prevent data
movement between the existing and the new table fragment:
v The check constraint expression in the CREATE TABLE tb2 statement is identical
to the fragment expression for table tb2 in the ALTER FRAGMENT ATTACH
statement.
v The fragment expressions specified in the CREATE TABLE tb1 and the ALTER
FRAGMENT ATTACH statements are not overlapping.
Therefore, the database server preserves index idx2 in dbspace db3 and converts it
into a fragment of index idx1. The index idx1 remains as an index with the same
fragmentation strategy as the table tb1.
Informix estimates the cost to create the whole index on the resultant table. The
server then compares this cost to the cost of building the individual index
fragments for the attached tables and chooses the index build with the least cost.
When the CREATE INDEX statement runs successfully, with or without the
ONLINE keyword, Informix automatically gathers the following statistics for the
newly created index:
v Index-level statistics, equivalent to the statistics gathered in the UPDATE
STATISTICS operation in LOW mode, for all types of indexes, including B-tree,
Virtual Index Interface, and functional indexes.
v Column-distribution statistics, equivalent to the distribution generated in the
UPDATE STATISTICS operation in HIGH mode, for a non-opaque leading
indexed column of an ordinary B-tree index. The resolution of the HIGH mode
is 1.0 for a table size that is less than 1 million rows and 0.5 for higher table
sizes. Tables with more than 1 million rows have a better resolution because they
have more bins for statistics.
To ensure that cost estimates are correct, you should execute the UPDATE
STATISTICS statement on all of the participating tables before you attach the
tables. The LOW mode of the UPDATE STATISTICS statement is sufficient to
derive the appropriate statistics for the optimizer to determine cost estimates for
rebuilding indexes.
For more information about using the UPDATE STATISTICS statement, see the
IBM Informix Guide to SQL: Syntax.
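For example, before attaching tb2 to tb1 as in the examples in this section, you
might run:
UPDATE STATISTICS LOW FOR TABLE tb1;
UPDATE STATISTICS LOW FOR TABLE tb2;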
When a table does not have an index on a column that can serve as the fragment
of the resultant index, the database server estimates the cost of building the index
fragment for the column, compares this cost to rebuilding the entire index for all
fragments on the resultant table, and chooses the index build with the least cost.
Suppose you create a fragmented table and index with the following SQL
statements:
CREATE TABLE tb1(a int, b int)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2;
CREATE INDEX idx1 ON tb1(a);
Suppose you then create two more tables that are not fragmented, and you
subsequently decide to attach them to the fragmented table:
CREATE TABLE tb2 (a int, b int, CHECK (a >=10 AND a<15))
IN db3;
CREATE INDEX idx2 ON tb2(a)
IN db3;
CREATE TABLE tb3 (a int, b int, CHECK (a >=15 AND a<20))
IN db4;
CREATE INDEX idx3 ON tb3(b)
IN db4;
ALTER FRAGMENT ON TABLE tb1
ATTACH
tb2 AS (a >= 10 AND a < 15),
tb3 AS (a >= 15 AND a < 20);
In the preceding example, table tb3 does not have an index on column a that can
serve as the fragment of the resultant index idx1. The database server estimates the
cost of building the index fragment for column a on the consumed table tb3 and
compares this cost to rebuilding the entire index for all fragments on the resultant
table. The database server chooses the index build with the least cost.
Note: The only time the UPDATE STATISTICS LOW FOR TABLE statement is
required is after a CREATE INDEX statement in a situation in which the table has
other preexisting indexes, as shown in this example:
CREATE TABLE tb1(col1 int, col2 int);
CREATE INDEX idx1 ON tb1(col1);
-- equivalent to UPDATE STATISTICS LOW FOR TABLE tb1
LOAD FROM "tb1.unl" INSERT INTO tb1;
-- load some data
CREATE INDEX idx2 ON tb1(col2);
When the index on a table is not usable, the database server estimates the cost of
building the index fragment, compares this cost to rebuilding the entire index for
all fragments on the resultant table, and chooses the index build with the least
cost.
Suppose you create tables and indexes as in the previous section, but the index on
the third table specifies a dbspace that the first table also uses. The following SQL
statements show this scenario:
CREATE TABLE tb1(a int, b int)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2;
CREATE INDEX idx1 ON tb1(a);
CREATE TABLE tb2 (a int, b int, CHECK (a >=10 AND a<15))
IN db3;
CREATE INDEX idx2 ON tb2(a)
IN db3;
CREATE TABLE tb3 (a int, b int, CHECK (a >=15 AND a<20))
IN db4;
CREATE INDEX idx3 ON tb3(a)
IN db2;
The index idx3 resides in dbspace db2, which table tb1 also uses, so the database
server cannot reuse idx3 as a fragment of index idx1. Suppose you then attach both
tables:
ALTER FRAGMENT ON TABLE tb1
ATTACH
tb2 AS (a >= 10 AND a < 15),
tb3 AS (a >= 15 AND a < 20);
Again, the database server estimates the cost of building the index fragment for
column a on the consumed table tb3 and compares this cost to rebuilding the
entire index idx1 for all fragments on the resultant table. Then the database server
chooses the index build with the least cost.
For example, suppose you create a fragmented table and index with the following
SQL statements:
CREATE TABLE tb1(a int)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2,
(a >=10 AND a <15) IN db3;
CREATE INDEX idx1 ON tb1(a);
The database server fragments the index keys into dbspaces db1, db2, and db3
with the same column a value ranges as the table because the CREATE INDEX
statement does not specify a fragmentation strategy.
Suppose you then decide to detach the data in the third fragment with the
following SQL statement:
ALTER FRAGMENT ON TABLE tb1
DETACH db3 tb3;
Because the fragmentation strategy of the index is the same as the table, the
ALTER FRAGMENT DETACH statement does not rebuild the index after the
detach operation. The database server drops the fragment of the index in dbspace
db3, updates the system catalog tables, and eliminates the index build.
For example, suppose you create a fragmented table and index with the following
SQL statements:
CREATE TABLE tb1(a int, b int)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2,
(a >=10 AND a <15) IN db3;
CREATE INDEX idx1 ON tb1(a)
FRAGMENT BY EXPRESSION
(a >=0 AND a < 5) IN db1,
(a >=5 AND a <10) IN db2,
(a >=10 AND a <15) IN db3;
Suppose you then decide to detach the data in the third fragment with the
following SQL statement:
ALTER FRAGMENT ON TABLE tb1
DETACH db3 tb3;
Because the distribution scheme of the index is the same as the table, the ALTER
FRAGMENT DETACH statement does not rebuild the index after the detach
operation. The database server drops the fragment of the index in dbspace db3,
updates the system catalog tables, and eliminates the index build.
Be aware that if you enable the server to force out transactions, the server rolls
back other users' transactions. The server also closes hold cursors during the
rollback that is performed on behalf of the session that runs the ALTER
FRAGMENT ON TABLE operation.
Prerequisites:
v You must be user informix or have DBA privileges on the database.
v The table must be in a logging database.
The onstat -g ppf output includes the number of read-and-write requests sent to
each fragment that is currently open. Because a request can trigger multiple I/O
operations, these requests do not indicate how many individual disk I/O
operations occur, but you can get a good idea of the I/O activity from the
displayed columns.
The bfrd column in the output displays the number of buffer reads in pages. (Each
buffer can contain one page.) This information is useful if you need to monitor the
time a query takes to execute, because query execution time typically correlates
strongly with the number of buffer reads.
The onstat -g ppf output by itself does not identify the table in which a fragment
is located. To determine the table for the fragment, join the partnum column in the
output to the partnum column in the sysfragments system catalog table. The
sysfragments table displays the associated table id. You can also find the table
name for the fragment by joining the tabid column in sysfragments to the tabid
column in systables.
The SET EXPLAIN output identifies the fragments with a fragment number. The
fragment numbers are the same as those contained in the partn column in the
sysfragments system catalog table.
The following example of partial SET EXPLAIN output shows a query that takes
advantage of fragment elimination and scans two fragments in table t1:
QUERY:
------
SELECT * FROM t1 WHERE c1 > 12
Estimated Cost: 3
Estimated # of Rows Returned: 2
1) informix.t1: SEQUENTIAL SCAN (Serial, fragments: 1, 2)
Filters: informix.t1.c1 > 12
If the optimizer must scan all fragments (that is, if it is unable to eliminate any
fragment from consideration), the SET EXPLAIN output displays fragments: ALL. In
addition, if the optimizer eliminates all the fragments from consideration (that is,
none of the fragments contain the queried information), the SET EXPLAIN output
displays fragments: NONE.
For information about how the database server eliminates a fragment from
consideration, see “Distribution schemes that eliminate fragments” on page 9-14.
For more information about the SET EXPLAIN ON statement, see “Report that
shows the query plan chosen by the optimizer” on page 10-9.
The parallel database query (PDQ) features in the database server provide the
largest potential performance improvements for a query. Chapter 12, “Parallel
database query (PDQ),” on page 12-1 describes PDQ and the Memory Grant
Manager (MGM) and explains how to control resource use by queries.
PDQ provides the most substantial performance gains if you fragment your tables
as described in Chapter 9, “Fragmentation guidelines,” on page 9-1.
Chapter 13, “Improving individual query performance,” on page 13-1 explains how
to improve the performance of specific queries.
For example, when evaluating the different ways in which a query might be
performed, the optimizer must determine whether indexes should be used. If the
query includes a join, the optimizer must determine the join plan (hash or nested
loop) and the order in which tables are evaluated or joined.
The following topics describe the components of a query plan and show examples
of query plans.
The optimizer can also choose to access the table by an index. If the column in the
index is the same as a column in a filter of the query, the optimizer can use the
index to retrieve only the rows that the query requires. The optimizer can use a
key-only index scan if the columns requested are within one index on the table. The
database server retrieves the needed data from the index and does not access the
associated table.
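For example, assuming a hypothetical composite index on customer
(customer_num, lname), the following query can be satisfied by a key-only index
scan, because both selected columns are keys in that index:
SELECT customer_num, lname
FROM customer
WHERE customer_num > 110;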
The optimizer compares the cost of each plan to determine the best one. The
database server derives cost from estimates of the number of I/O operations
required, calculations to produce the results, rows accessed, sorting, and so forth.
In the following query, the customer and orders table are joined by the
customer.customer_num = orders.customer_num filter:
SELECT * from customer, orders
WHERE customer.customer_num = orders.customer_num
AND customer.lname = "Higgins";
Because of the nature of hash joins, an application with isolation level set to
Repeatable Read might temporarily lock all the records in tables that are involved
in the join, including records that fail to qualify the join. This situation leads to
decreased concurrency among connections. Conversely, nested-loop joins lock
fewer records but provide reduced performance when a large number of rows are
accessed. Thus, each join method has advantages and disadvantages.
Nested-loop join
In a nested-loop join, the database server scans the first, or outer table, and then
joins each of the rows that pass table filters to the rows found in the second, or
inner table.
Figure 10-1 on page 10-3 shows the tables and rows, and the order in which they
are read, for the following query:
SELECT * FROM customer, orders
WHERE customer.customer_num=orders.customer_num
AND order_date>"01/01/2007";
The database server accesses an outer table by an index or by a table scan. The
database server applies any table filters first. For each row that satisfies the filters
on the outer table, the database server reads the inner table to find a match.
The database server reads the inner table once for every row in the outer table that
fulfills the table filters. Because of the potentially large number of times that the
inner table can be read, the database server usually accesses the inner table by an
index.
If the inner table does not have an index, the database server might construct an
autoindex at the time of query execution. The optimizer might determine that the
cost to construct an autoindex at the time of query execution is less than the cost to
scan the inner table for each qualifying row in the outer table.
Hash join
The optimizer usually uses a hash join when at least one of the two join tables
does not have an index on the join column or when the database server must read
a large number of rows from both tables. No index and no sorting is required
when the database server performs a hash join.
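For illustration, a join of two large tables on unindexed columns, as in the
following hedged sketch (the table and column names are hypothetical), is a typical
hash-join candidate:
SELECT h.acct_id, h.balance, d.tx_total
FROM acct_hist h, tx_detail d
WHERE h.acct_id = d.acct_id;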
A hash join consists of two activities: first building the hash table (build phase) and
then probing the hash table (probe phase). Figure 10-2 on page 10-4 shows the hash
join in detail.
In the build phase, the database server reads one table and, after it applies any
filters, creates a hash table. Think of a hash table conceptually as a series of buckets,
each with an address that is derived from the key value by applying a hash
function. The database server does not sort keys in a particular hash bucket.
Smaller hash tables can fit in the virtual portion of database server shared memory.
The database server stores larger hash files on disk in the dbspace specified by the
DBSPACETEMP configuration parameter or the DBSPACETEMP environment
variable.
In the probe phase, the database server reads the other table in the join and applies
any filters. For each row that satisfies the filters on the table, the database server
applies the hash function on the key and probes the hash table to find a match.
(Figure 10-2 depicts the hash table as a series of buckets, each holding the rows
whose hashed key values map to that bucket.)
Join order
The order that tables are joined in a query is extremely important. A poor join
order can cause query performance to decline noticeably.
For an example of how the database server executes a plan according to a specific
join order, see “Example of query-plan execution.”
Assume also that no indexes are on any of the three tables. Suppose that the
optimizer chooses the customer-orders-items path and the nested-loop join for
both joins (in reality, the optimizer usually chooses a hash join for two tables
without indexes on the join columns). Figure 10-3 on page 10-5 shows the query
plan, expressed in pseudocode. For information about interpreting query plan
information, see “Report that shows the query plan chosen by the optimizer” on
page 10-9.
This example does not describe the only possible query plan. Another plan merely
reverses the roles of customer and orders: for each row of orders, it reads all rows
of customer, looking for a matching customer_num. It reads the same number of
rows in a different order and produces the same set of rows in a different order. In
this example, no difference exists in the amount of work that the two possible
query plans need to do.
The expression O.paid_date IS NULL filters out some rows, reducing the number
of rows that are used from the orders table. Consider a plan that starts by reading
from orders. Figure 10-4 on page 10-6 displays this sample plan in pseudocode.
Let pdnull represent the number of rows in orders that pass the filter. It is the value
of COUNT(*) that results from the following query:
SELECT COUNT(*) FROM orders WHERE paid_date IS NULL
If one customer exists for every order, the plan in Figure 10-4 reads the following
rows:
v All rows of the orders table once
v All rows of the customer table, pdnull times
v All rows of the items table, pdnull times
Figure 10-5 shows an alternative execution plan that reads from the customer table
first.
Because the filter is not applied in the first step that Figure 10-5 shows, this plan
reads the following rows:
v All rows of the customer table once
v All rows of the orders table once for every row of customer
v All rows of the items table, pdnull times
The query plans in Figure 10-4 and Figure 10-5 produce the same output in a
different sequence. They differ in that one reads a table pdnull times, and the other
reads a table as many times as there are rows in the customer table. By choosing
the appropriate plan, the optimizer can save thousands of disk accesses in a real
application.
This topic shows the outline of a query plan that differs from the query shown in
“Example of a join with column filters” on page 10-5, because it is constructed
using indexes.
The keys in an index are sorted so that when the database server finds the first
matching entry, it can read any other rows with identical keys without further
searching, because they are located in physically adjacent positions. This query
plan reads only the following rows:
v All rows of the customer table once
v All rows of the orders table once (because each order is associated with only one
customer)
v Only rows in the items table that match pdnull rows from the customer-orders
pairs
This query plan achieves a great reduction in cost compared with plans that do not
use indexes. An inverse plan, reading orders first and looking up rows in the
customer table by its index, is also feasible by the same reasoning.
The physical order of rows in a table also affects the cost of index use. To the
degree that a table is ordered relative to an index, the overhead of accessing
multiple table rows in index order is reduced. For example, if the orders table
rows are physically ordered according to the customer number, multiple retrievals
of orders for a given customer would proceed more rapidly than if the table were
ordered randomly.
In some cases, using an index might incur additional costs. For more information,
see “Index lookup costs” on page 10-26.
The union of small index scans results in an access path that uses only subsets of
the full range of a composite index. The table is logically joined to itself, and the
more selective non-leading index keys are applied as index-bound filters to each
unique combination of the leading key values.
The query in Figure 10-7 shows the SET EXPLAIN output for a query plan that
includes an index self-join path.
QUERY:
------
SELECT a.c1,a.c2,a.c3 FROM tab1 a WHERE (a.c3 >= 100103) AND
(a.c3 <= 100104) AND (a.c1 >= 'PICKED ') AND
(a.c1 <= 'RGA2 ') AND (a.c2 >= 1) AND (a.c2 <= 7)
ORDER BY 1, 2, 3
Figure 10-7. SET EXPLAIN output for a query with an index self-join path
In Figure 10-7, an index exists on columns c1, c2, c3, c4, and c5. The optimizer
chooses c1 and c2 as lead keys, which implies that columns c1 and c2 have many
duplicates. Column c3 has few duplicates and thus the predicates on column c3 (c3
>= 100103 and c3 <= 100104) have good selectivity.
As Figure 10-7 shows, an index self-join path is a self-join of two index scans using
the same index. The first index scan retrieves each unique value for lead key
columns, which are c1 and c2. The unique value of c1 and c2 is then used to probe
the second index scan, which also uses predicates on column c3. Because
predicates on column c3 have good selectivity:
v The index scan on the inner side of the nested-loop join is very efficient,
retrieving only the few rows that satisfy the c3 predicates.
v The index scan does not retrieve extra rows.
Thus, for each unique value of c1 and c2, an efficient index scan on c1, c2 and c3
occurs.
The example shows the bounds for columns c1 and c2, which you can conceive of
as the bounds for the index scan to retrieve the qualified leading keys of the index.
This information represents the inner index scan. For lead key columns c1 and c2,
the self-join predicate is used, indicating that the value of c1 and c2 comes from the
outer index scan. The predicates on column c3 serve as an index filter that makes
the inner index scan efficient.
Regular index scans do not use filters on column c3 to position the index scan,
because the lead key columns c1 and c2 do not have equality predicates.
The INDEX_SJ directive forces an index self-join path using the specified index, or
choosing the least costly index in a list of indexes, even if data distribution
statistics are not available for the leading index key columns. The
AVOID_INDEX_SJ directive prevents a self-join path for the specified index or
indexes. Also see “Access-method directives” on page 11-4 and the IBM Informix
Guide to SQL: Syntax.
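For example, the following hedged sketch forces an index self-join path for the
query in Figure 10-7; idx1 is a hypothetical name for the composite index on c1,
c2, and c3, and the directive references the table by its alias:
SELECT {+INDEX_SJ(a idx1)} a.c1, a.c2, a.c3
FROM tab1 a
WHERE (a.c3 >= 100103) AND (a.c3 <= 100104)
AND (a.c1 >= 'PICKED ') AND (a.c1 <= 'RGA2 ')
AND (a.c2 >= 1) AND (a.c2 <= 7)
ORDER BY 1, 2, 3;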
For example, if neither join specifies an ordered set of rows by using the ORDER
BY or GROUP BY clauses of the SELECT statement, the join pair (A x B) is
redundant with respect to (B x A).
If the query uses additional tables, the optimizer joins each remaining pair to a
new table to form all possible join triplets, eliminating the more expensive of
redundant triplets and so on for each additional table to be joined. When a
non-redundant set of possible join combinations has been generated, the optimizer
selects the plan that appears to have the lowest execution cost.
If a user does not have access to the SQL source code, the database administrator
can turn on SET EXPLAIN dynamically for that user's session by using the
onmode -Y command.
After the database server executes the SET EXPLAIN ON statement, or after SET
EXPLAIN is turned on dynamically with the onmode -Y command, the server
writes an explanation of each query plan to a file for subsequent queries that the
user enters.
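For example, assuming the target session id is 47 (a hypothetical value), the
administrator might run the following command, where the final argument 1 turns
dynamic SET EXPLAIN on and 0 turns it off:
onmode -Y 47 1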
Related concepts:
“The explain output file”
“Query statistics section provides performance debugging information” on page
10-11
Using the FILE TO option (SQL Syntax)
Default name and location of the explain output file on UNIX (SQL Syntax)
Default name and location of the output file on Windows (SQL Syntax)
“Report that shows the query plan chosen by the optimizer” on page 10-9
“Enabling external directives” on page 11-15
Related reference:
SET EXPLAIN statement (SQL Syntax)
onmode -Y: Dynamically change SET EXPLAIN (Administrator's Reference)
onmode and Y arguments: Change query plan measurements for a session
(SQL administration API) (Administrator's Reference)
When you run the onmode -Y command to turn on dynamic SET EXPLAIN, the
output is displayed in a new explain output file. If a file from the SET EXPLAIN
statement exists, the database server stops using it, and instead uses the file
created by onmode -Y until the administrator turns off dynamic SET EXPLAIN for
the session.
The following codes in the Query Statistics section of the explain output file
provide information about external tables:
v xlcnv identifies an operation that is loading data from an external table and
inserting the data into a base table. Here x = external table, l = loading, and cnv
= converter.
v xucnv identifies an operation that is unloading data from a base table and
inserting the data into an external table. Here x = external table, u = unloading,
and cnv = converter.
The Query Statistics section of the explain output file is a useful resource for
debugging performance problems. See “Query statistics section provides
performance debugging information.”
Related concepts:
Using the FILE TO option (SQL Syntax)
Default name and location of the explain output file on UNIX (SQL Syntax)
Default name and location of the output file on Windows (SQL Syntax)
“Report that shows the query plan chosen by the optimizer” on page 10-9
“Query statistics section provides performance debugging information”
“Enabling external directives” on page 11-15
Related reference:
SET EXPLAIN statement (SQL Syntax)
onmode -Y: Dynamically change SET EXPLAIN (Administrator's Reference)
onmode and Y arguments: Change query plan measurements for a session
(SQL administration API) (Administrator's Reference)
The Query Statistics section of the explain output file shows the estimated number
of rows that the query plan expects to return, the actual number of returned rows,
and other information about the query. You can use this information, which
provides an indication of the overall flow of the query plan and how many rows
flow through each stage of the query, to debug performance problems.
The following example shows query statistics in SET EXPLAIN output. If the
estimated and actual number of rows scanned or joined are quite different, the
statistics on those tables might be old and should be updated.
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 tab2
t2 tab1
Related concepts:
“The explain output file” on page 10-10
Using the FILE TO option (SQL Syntax)
Default name and location of the explain output file on UNIX (SQL Syntax)
Default name and location of the output file on Windows (SQL Syntax)
“Report that shows the query plan chosen by the optimizer” on page 10-9
“Sample query plan reports”
“Enabling external directives” on page 11-15
Related reference:
SET EXPLAIN statement (SQL Syntax)
onmode -Y: Dynamically change SET EXPLAIN (Administrator's Reference)
onmode and Y arguments: Change query plan measurements for a session
(SQL administration API) (Administrator's Reference)
Single-table query
This topic shows sample SET EXPLAIN output for a simple query and a complex
query on a single table.
QUERY:
------
SELECT fname, lname, company FROM customer
Estimated Cost: 2
Estimated # of Rows Returned: 28
Figure 10-10 shows SET EXPLAIN output for a complex query on the customer
table.
QUERY:
------
SELECT fname, lname, company FROM customer
WHERE company MATCHES 'Sport*' AND
customer_num BETWEEN 110 AND 115
ORDER BY lname
Estimated Cost: 1
Estimated # of Rows Returned: 1
Temporary Files Required For: Order By
The following output lines in Figure 10-10 show the scope of the index scan for the
second query:
v Lower Index Filter: virginia.customer.customer_num >= 110
Start the index scan with the index key value of 110.
v Upper Index Filter: virginia.customer.customer_num <= 115
Stop the index scan with the index key value of 115.
Multitable query
This topic shows sample SET EXPLAIN output for a multiple-table query.
Estimated Cost: 78
Estimated # of Rows Returned: 1
Temporary Files Required For: Group By
The SET EXPLAIN output lists the order in which the database server accesses the
tables and the access plan to read each table. The plan in Figure 10-11 indicates
that the database server is to perform the following actions:
1. The database server is to read the orders table first.
Because no filter exists on the orders table, the database server must read all
rows. Reading the table in physical order is the least expensive approach.
2. For each row of orders, the database server is to search for matching rows in
the customer table.
The search uses the index on customer_num. The notation Key-Only means that
only the index need be read for the customer table because only the
c.customer_num column is used in the join and the output, and the column is
an index key.
3. For each row of orders that has a matching customer_num, the database server
is to search for a match in the items table using the index on order_num.
Key-first scan
This topic shows a sample query that uses a key-first scan, which is an index scan
that uses keys other than those listed as lower and upper index filters.
select * from tab1 where (c1 > 0) and ( (c2 = 1) or (c2 = 2))
Estimated Cost: 4
Estimated # of Rows Returned: 1
Even though in this example the database server must eventually read the row
data to return the query results, it attempts to reduce the number of possible rows
by applying additional key filters first. The database server uses the index to apply
the additional filter, c2 = 1 OR c2 = 2, before it reads the row data.
For example, the sample SET EXPLAIN ON output in Figure 10-13 shows that the
optimizer changes the table in the subquery to be the inner table in a join.
QUERY:
------
SELECT company, fname, lname, phone
FROM customer c
WHERE EXISTS(
SELECT customer_num FROM cust_calls u
WHERE c.customer_num = u.customer_num)
Estimated Cost: 6
Estimated # of Rows Returned: 7
For more information about the SET EXPLAIN ON statement, see “Report that
shows the query plan chosen by the optimizer” on page 10-9.
When the optimizer changes a subquery to a join, it can use several variations of
the access plan and the join plan:
v First-row scan
A first-row scan is a variation of a table scan. When the database server finds
one match, the table scan halts.
v Skip-duplicate-index scan
A skip-duplicate-index scan is a variation of an index scan in which the
database server does not read duplicate index entries, because only one match
is needed.
A collection-derived table is the mechanism by which the database server
processes a query on a collection. Although the database server does not actually
create a table for the collection, it processes the data as if it were a table.
Collection-derived tables allow developers to use fewer cursors and host variables
to access a collection, in some cases.
The following query creates a collection-derived table for the children column and
treats the elements of this collection as rows in a table:
SELECT name, id
FROM TABLE(MULTISET(SELECT children
FROM parents
WHERE parents.id = 1001)) c_table(name, id);
When completing a query, the database server performs the steps shown in this
example:
1. Scans the parents table to find the row where parents.id = 1001.
This operation is listed as a SEQUENTIAL SCAN in the SET EXPLAIN output
that Figure 10-14 on page 10-17 shows.
2. Reads the value of the collection column called children.
3. Scans the single collection and returns the value of name and id to the
application.
This operation is listed as a COLLECTION SCAN in the SET EXPLAIN output
that Figure 10-14 on page 10-17 shows.
Estimated Cost: 2
Estimated # of Rows Returned: 1
You can improve the performance of collection-derived tables by folding derived
tables in simple queries into the parent query, instead of placing the query results
into a temporary table.
The database server folds derived tables in a manner that is similar to the way the
server folds views through the IFX_FOLDVIEW configuration parameter (described
in “Enable view folding to improve query performance” on page 13-31). When the
IFX_FOLDVIEW configuration parameter is enabled, views are folded into a parent
query. The views are not folded into query results that are put into a temporary
table.
The following examples show derived tables folded into the main query.
Estimated Cost: 2
Estimated # of Rows Returned: 1
Filters:
Table Scan Filters: informix.tab1.col1 > 50
Figure 10-15. Query plan that uses a derived table folded into the parent query
select * from (select col1 from tab1 where col1 = 100) as vtab1(c1)
left join (select col1 from tab2 where col1 = 10) as vtab2(vc1)
on vtab1.c1 = vtab2.vc1
Estimated Cost: 5
Estimated # of Rows Returned: 1
ON-Filters:(informix.tab1.col1 = informix.tab2.col1
AND informix.tab2.col1 = 10 )
NESTED LOOP JOIN(LEFT OUTER JOIN)
Figure 10-16. Second query plan that uses a derived table folded into the parent query
The following example shows a complex query involving the UNION operation.
Here, a temporary table has been created.
select * from (select col1 from tab1 union select col2 from tab2 )
as vtab(vcol1) where vcol1 < 50
Estimated Cost: 4
Estimated # of Rows Returned: 1
If you plan to use IBM Data Studio to obtain Visual Explain output, you must
create and specify a default sbspace name for the SBSPACENAME configuration
parameter in your onconfig file. The EXPLAIN_SQL routine creates BLOBs in this
sbspace.
For information about using IBM Data Studio, see IBM Data Studio
documentation.
Some of the factors that the optimizer uses to determine the cost of each query
plan are:
v The number of I/O requests that are associated with each file system access
v The CPU work that is required to determine which rows meet the query
predicate
v The resources that are required to sort or group the data
v The amount of memory available for the query (specified by the
DS_TOTAL_MEMORY and DS_MAX_QUERIES parameters)
The database server starts a statistical profile of a table when the table is created,
and the profile is refreshed when you issue the UPDATE STATISTICS statement.
The query optimizer does not recalculate the profile for tables automatically. In
some cases, gathering the statistics might take longer than executing the query.
To ensure that the optimizer selects a query plan that best reflects the current state
of your tables, run UPDATE STATISTICS at regular intervals. For guidelines, see
“Update statistics when they are not generated automatically” on page 13-11.
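For example, you might refresh the profile of a table and the distribution of its
leading indexed column (the table name is from the demonstration database; the
choice of modes is illustrative):
UPDATE STATISTICS MEDIUM FOR TABLE customer;
UPDATE STATISTICS HIGH FOR TABLE customer (customer_num);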
The optimizer uses the following system catalog information as it creates a query
plan:
v The number of rows in a table, as of the most recent UPDATE STATISTICS
statement
v Whether a column is constrained to be unique
v The distribution of column values, when requested with the MEDIUM or HIGH
keyword in the UPDATE STATISTICS statement
For more information about data distributions, see “Creating data distributions”
on page 13-13.
v The number of disk pages that contain row data
The optimizer also uses the following system catalog information about indexes:
v The indexes that exist on a table, including the columns that they index, whether
they are ascending or descending, and whether they are clustered
v The depth of the index structure (a measure of the amount of work that is
needed to perform an index lookup)
v The number of disk pages that index entries occupy
v The number of unique entries in an index, which can be used to estimate the
number of rows that an equality filter returns
v Second-largest and second-smallest key values in an indexed column
Only the second-largest and second-smallest key values are noted, because the
extreme values might have a special meaning that is not related to the rest of the
data in the column. The database server assumes that key values are distributed
evenly between the second largest and second smallest. Only the initial 4 bytes of
these keys are stored. If you create a distribution for a column associated with an
index, the optimizer uses that distribution when it estimates the number of rows
that match a query.
For more information about system catalog tables, see the IBM Informix Guide to
SQL: Reference.
The selectivity is a value between 0 and 1 that indicates the proportion of rows
within the table that the filter can pass. A selective filter, one that passes few rows,
has a selectivity near 0, and a filter that passes almost all rows has a selectivity
near 1. For guidelines on filters, see “Improve filter selectivity” on page 13-2.
The optimizer can use data distributions to calculate selectivity for the filters in a
query. However, in the absence of data distributions, the database server calculates
selectivity for filters of different types based on table indexes. The following table
lists some of the selectivity values that the optimizer assigns to filters of different
types. Selectivity that is calculated using data distributions is even more accurate
than the selectivity that this table shows.
In the table:
v indexed-col is the first or only column in an index.
v 2nd-max, 2nd-min are the second-largest and second-smallest key values in the
indexed column.
v The plus sign ( + ) means logical union ( = the Boolean OR operator) and the
multiplication symbol ( x ) means logical intersection ( = the Boolean AND
operator).
Table 10-1. Selectivity values that the optimizer assigns to filters of different types
Filter Expression                        Selectivity (F)
indexed-col = literal-value
indexed-col = host-variable
indexed-col IS NULL                      F = 1/(number of distinct keys in index)
tab1.indexed-col = tab2.indexed-col      F = 1/(number of distinct keys in the larger index)
indexed-col > literal-value              F = (2nd-max - literal-value)/(2nd-max - 2nd-min)
indexed-col < literal-value              F = (literal-value - 2nd-min)/(2nd-max - 2nd-min)
any-col IS NULL
any-col = any-expression                 F = 1/10
any-col > any-expression
any-col < any-expression                 F = 1/3
any-col MATCHES any-expression
any-col LIKE any-expression              F = 1/5
EXISTS subquery                          F = 1 if the subquery is estimated to return more than 0 rows; otherwise F = 0
NOT expression                           F = 1 - F(expression)
expr1 AND expr2                          F = F(expr1) x F(expr2)
expr1 OR expr2                           F = F(expr1) + F(expr2) - (F(expr1) x F(expr2))
any-col IN list                          Treated as any-col = item1 OR ... OR any-col = itemn
any-col relop ANY subquery               Treated as any-col relop value1 OR ... OR any-col relop valuen, for an estimated subquery size of n
Here relop is any relational operator, such as <, >, >=, <=.
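As a worked illustration of these composition rules (the numbers are hypothetical):
for indexed-col > literal-value with 2nd-min = 0, 2nd-max = 100, and literal-value =
75, F = (100 - 75)/(100 - 0) = 0.25. If that filter is ANDed with an equality filter on
an index that has 20 distinct keys (F = 1/20 = 0.05), the combined selectivity is
0.25 x 0.05 = 0.0125.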
The optimizer can choose an index for any one of the following cases:
v When the column is indexed and a value to be compared is a literal, a host
variable, or an uncorrelated subquery
The database server can locate relevant rows in the table by first finding the row
in an appropriate index. If an appropriate index is not available, the database
server must scan each table in its entirety.
v When the column is indexed and the value to be compared is a column in
another table (a join expression)
The database server can use the index to find matching values. The following
join expression shows such an example:
WHERE customer.customer_num = orders.customer_num
If rows of customer are read first, values of customer_num can be applied to an
index on orders.customer_num.
v When processing an ORDER BY clause
If all the columns in the clause appear in the required sequence within a single
index, the database server can use the index to read the rows in their ordered
sequence, thus avoiding a sort.
v When processing a GROUP BY clause
If all the columns in the clause appear in one index, the database server can read
groups with equal keys from the index without requiring additional processing
after the rows are retrieved from their tables.
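For example, assuming a hypothetical composite index on customer (lname,
fname), the following query can read the rows in index order and avoid a sort:
SELECT lname, fname
FROM customer
ORDER BY lname, fname;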
For more information, see Chapter 12, “Parallel database query (PDQ),” on page
12-1.
Single-table query
For single-table scans, when OPTCOMPIND is set to 0 or 1 and the current
transaction isolation level is Repeatable Read, the optimizer considers two types of
access plans.
v If an index is available, the optimizer uses it to access the table.
v If no index is available, the optimizer considers scanning the table in physical
order.
When OPTCOMPIND is not set in the database server configuration, its value
defaults to 2. When OPTCOMPIND is set to 2 or 1 and the current isolation level
is not Repeatable Read, the optimizer chooses the least expensive plan to access the
table.
Multitable query
For join plans, the OPTCOMPIND setting influences the access plan for a specific
ordered pair of tables.
Set OPTCOMPIND to 0 if you want the database server to select a join method
exactly as it did in previous versions of the database server. This option ensures
compatibility with previous versions.
Important: When OPTCOMPIND is set to 0, the optimizer does not choose a hash
join.
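For example, a session can override the server-wide setting; the SET
ENVIRONMENT statement accepts OPTCOMPIND as a quoted value:
SET ENVIRONMENT OPTCOMPIND '0';
-- '0' restores the version-compatible join-method choices described above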
For more information about parallel queries and the DS_TOTAL_MEMORY and
DS_MAX_QUERIES parameters, see Chapter 12, “Parallel database query (PDQ),”
on page 12-1.
The following costs can be reduced by optimal query construction and appropriate
indexes:
v Sort time
v Data mismatches
v In-place ALTER TABLE
v Index lookups
For information about how to optimize specific queries, see Chapter 13,
“Improving individual query performance,” on page 13-1.
Most of these activities are performed quickly. Depending on the computer and its
workload, the database server can perform hundreds or even thousands of
comparisons each second. As a result, the time spent on in-memory work is
usually a small part of the execution time.
Sort-time costs
A sort requires in-memory work as well as disk work. The in-memory work
depends on the number of columns that are sorted, the width of the combined sort
key, and the number of row combinations that pass the query filter. You can reduce
the cost of sorting.
You can use the following formula to calculate the in-memory work that a sort
operation requires:
Wm = (c * Nfr) + (w * Nfr * log2(Nfr))
Wm is the in-memory work.
c is the number of columns to order and represents the costs to extract
column values from the row and concatenate them into a sort key.
w is proportional to the width of the combined sort key in bytes and stands
for the work to copy or compare one sort key. A numeric value for w
depends strongly on the computer hardware in use.
Nfr is the number of rows that pass the query filter.
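As a numeric illustration (the figures are hypothetical): with Nfr = 1,000 qualifying
rows, the comparison term grows as Nfr * log2(Nfr) = 1000 x log2(1000), or about
9,966 key comparisons (times the machine-dependent factor w). Halving Nfr with a
more selective filter reduces that term to about 4,483, more than half.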
Sorting can involve writing information temporarily to disk if the amount of data
to sort is large. You can direct the disk writes to occur in the operating-system file
space or in a dbspace that the database server manages. For details, see “Configure
dbspaces for temporary tables and sort files” on page 5-8.
The disk work depends on the number of disk pages where rows appear, the
number of rows that meet the conditions of the query predicate, the number of
rows that can be placed on a sorted page, and the number of merge operations
that must be performed. Use the following formula to calculate the disk work that
a sort operation requires:
Wd = p + ((Nfr/Nrp) * 2 * (m - 1))
Wd is the disk work.
p is the number of disk pages.
Nfr is the number of rows that pass the filters.
Nrp is the number of rows that can be placed on a page.
m represents the number of levels of merge that the sort must use.
When all the keys can be held in memory, m=1 and the disk work is equivalent to
p. In other words, the rows are read and sorted in memory.
For moderate to large tables, rows are sorted in batches that fit in memory, and
then the batches are merged. When m=2, the rows are read, sorted, and written in
batches. Then the batches are read again and merged, resulting in disk work
proportional to the following value:
Wd = p + (2 * (Nfr/Nrp))
The more specific the filters, the fewer the rows that are sorted. As the number of
rows increases, and the amount of memory decreases, the amount of disk work
increases.
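As a numeric illustration of the disk-work formula (the figures are hypothetical):
for p = 1,000 pages, Nfr = 10,000 qualifying rows, Nrp = 50 rows per page, and one
level of merging (m = 2), Wd = 1000 + ((10000/50) * 2 * (2 - 1)) = 1000 + 400 =
1,400 page operations. If all the keys fit in memory (m = 1), Wd = 1,000.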
Row-reading costs
When the database server needs to examine a row that is not already in memory, it
must read that row from disk. The database server does not read only one row; it
reads the entire page that contains the row. If the row spans more than one page, it
reads all of the pages.
The actual cost of reading a page is variable and hard to predict. The actual cost is
a combination of the factors shown in the following table.
The time cost of reading a page can vary from microseconds for a page that is
already in a buffer, to a few milliseconds when contention is zero and the disk arm
is already in position, to hundreds of milliseconds when the page is in contention
and the disk arm is over a distant cylinder of the disk.
When the first row on a page is requested, the disk page is read into a buffer page.
After the page is read in, it does not need to be read again; requests for
subsequent rows on that page are filled from the buffer until all the rows on that
page are processed. When one page is exhausted, the page for the next set of rows
must be read in.
When you use unbuffered devices for dbspaces, and the table is organized
properly, the disk pages of consecutive rows are placed in consecutive locations on
the disk. This arrangement allows the access arm to move very little when it reads
sequentially. In addition, latency costs are usually lower when pages are read
sequentially.
Related reference:
Read-ahead operations (Administrator's Guide)
Whenever a table is read in random order, additional disk accesses are required to
read the rows in the required order. Disk costs are higher when the rows of a table
are read in a sequence unrelated to physical order on disk. Because the pages are
not read sequentially from the disk, both seek and rotational delays occur before
each page can be read.
Nonsequential access often occurs when you use an index to locate rows. Although
index entries are sequential, there is no guarantee that rows with adjacent index
entries must reside on the same (or adjacent) data pages. In many cases, a separate
disk access must be made to fetch the page for each row located through an index.
If a table is larger than the page buffers, a page that contained a row previously
read might be cleaned (removed from the buffer and written back to the disk)
before a subsequent request for another row on that page can be processed. That
page might have to be read in again.
Depending on the relative ordering of the table with respect to the index, you can
sometimes retrieve pages that contain several needed rows. The degree to which
the physical ordering of rows on disk corresponds to the order of entries in the
index is called clustering. A highly clustered table is one in which the physical
ordering on disk corresponds closely to the index.
An index lookup works down from the root page to a leaf page. The root page,
because it is used so often, is almost always found in a page buffer. The odds of
finding a leaf page in a buffer depend on the size of the index, the form of the
query, and the frequency of column-value duplication. If each value occurs only
once in the index and the query is a join, each row to be joined requires a
nonsequential lookup into the index, followed by a nonsequential access to the
associated row in the table.
If the column contains duplicate values, each entry or set of entries with the same
value must be located in the index.
Then, for each entry in the index, a random access must be made to the table to
read the associated row. However, if there are many duplicate rows per distinct
index value, and the associated table is highly clustered, the added cost of joining
through the index can be slight.
For more information about the conditions and performance advantages when an
in-place alter occurs, see “Altering a table definition” on page 6-35.
View costs
A complex view could run more slowly than expected.
A query against a view might execute more slowly than expected when the
complexity of the view definition causes a temporary table to be created to
process the query. This temporary table is referred to as a materialized view. For
example, you can create a view with a union to combine results from several
SELECT statements.
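A hedged sketch of such a view (table1 and table2 are hypothetical):
CREATE VIEW view1 (a, b, c, d) AS
SELECT a, b, c, d FROM table1
UNION
SELECT a, b, c, d FROM table2;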
When you create a view that contains complex SELECT statements, the end user
does not need to handle the complexity. The end user can just write a simple
query, as the following example shows:
SELECT a, b, c, d
FROM view1
WHERE a < 10;
However, this query against view1 might execute more slowly than expected
because the database server creates a fragmented temporary table for the view
before it executes the query.
Another situation when the query might execute more slowly than expected is if
you use a view in an ANSI join. The complexity of the view definition might cause
a temporary table to be created.
To determine if you have a query that must build a temporary table to process the
view, execute the SET EXPLAIN statement. If you see Temp Table For View in the
SET EXPLAIN output file, your query requires a temporary table to process the
view.
Small-table costs
A table is small if it occupies so few pages that it can be retained entirely in the
page buffers. Operations on small tables are generally faster than operations on
large tables.
Data-mismatch costs
An SQL statement can encounter additional costs when the data type of a column
that is used in a condition differs from the definition of the column in the CREATE
TABLE statement.
For example, the following query contains a condition that compares a column to a
data type value that differs from the table definition:
CREATE TABLE table1 (a integer, ...);
SELECT * FROM table1
WHERE a = '123';
The additional costs of a data mismatch are most severe when the query compares
a character column with a noncharacter value and the length of the number is not
equal to the length of the character column. For example, the following query
contains a condition in the WHERE clause that equates a character column to an
integer value because of missing quotation marks:
CREATE TABLE table2 (char_col char(3), ...);
SELECT * FROM table2
WHERE char_col = 1;
This query finds all of the following values that are valid for char_col:
' 1'
'001'
'1'
These values are not necessarily clustered together in the index keys. Therefore, the
index does not provide a fast and correct way to obtain the data. The SET
EXPLAIN output shows a sequential scan for this situation.
Warning: The database server does not use an index when the SQL statement
compares a character column with a noncharacter value that is not equal in length
to the character column.
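One way to avoid the mismatch is to write the literal as a character string, so that the comparison is a simple character comparison and an index on the column can be used. The following sketch shows the corrected filter; note that it matches only the quoted string, not all of the numeric equivalents listed above:
SELECT * FROM table2
WHERE char_col = '1';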
Encrypted-value costs
An encrypted value uses more storage space than the corresponding plain-text
value because all of the information needed to decrypt the value except the
encryption key is stored with the value.
Most encrypted data requires approximately 33 percent more storage space than
unencrypted data. Omitting the hint used with the password can reduce
encryption overhead by up to 50 bytes. If you are using encrypted values, you
must make sure that you have sufficient space available for the values.
GLS functionality costs
For information about the performance degradation that occurs from indexing some data sets, see “Searching for NCHAR or NVARCHAR columns in an index” on page 10-27.
If you do not need a non-ASCII collation sequence, use the CHAR and VARCHAR
data types for character columns whenever possible. Because CHAR and
VARCHAR data require simple value-based comparison, sorting and indexing
these columns is less expensive than for non-ASCII data types (NCHAR or
NVARCHAR, for example).
For more information about other character data types, see the IBM Informix GLS
User's Guide.
Network-access costs
Moving data over a network imposes delays in addition to those you encounter
with direct disk access.
Data sent over a network consists of command messages and buffer-sized blocks of
row data. Although the details can differ depending on the network and the
computers, the database server network activity follows a simple model in which
one computer, the client, sends a request to another computer, the server. The server
replies with a block of data from a table.
Whenever data is exchanged over a network, delays are inevitable in the following
situations:
v When the network is busy, the client must wait its turn to transmit. Such delays
are usually less than a millisecond. However, on a heavily loaded network, these
delays can increase exponentially to tenths of seconds and more.
v When the server is handling requests from more than one client, requests might
be queued for a time that can range from milliseconds to seconds.
v When the server acts on the request, it incurs the time costs of disk access and
in-memory operations that the preceding sections describe.
Network access time is extremely variable. In the best case, when neither the
network nor the server is busy, transmission and queuing delays are insignificant,
and the server sends a row almost as quickly as a local database server might.
Furthermore, when the client asks for a second row, the page is likely to be in the
page buffers of the server.
Unfortunately, as network load increases, all these factors tend to worsen at the
same time. Transmission delays rise in both directions, which increases the queue
at the server. The delay between requests decreases the likelihood of a page
remaining in the page buffer of the responder. Thus, network-access costs can
change suddenly and quite dramatically.
If you use the SELECT FIRST n clause in a distributed query, you will still see only
the requested amount of data. However, the local database server does not send
the SELECT FIRST n clause to the remote site. Therefore, the remote site might
return more data.
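The following hypothetical query illustrates this behavior; the database, server, and column names are placeholders. The local database server applies FIRST 10 to the final result, but the remote site might return more than 10 rows:
SELECT FIRST 10 *
FROM sales@remote_site:orders
ORDER BY order_date;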
The optimizer that the database server uses assumes that access to a row over the
network takes longer than access to a row in a local database. This estimate
includes the cost of retrieving the row from disk and transmitting it across the
network.
For information about actions that might improve performance across the network,
see the following sections:
v “Optimizer estimates of distributed queries” on page 13-29
v “MaxConnect for multiple connections UNIX” on page 3-26
v “Multiplexed connections and CPU utilization” on page 3-25
v “Network buffer pools” on page 3-17
The topics in this section contain information about how and when the database server optimizes and executes SQL within SPL routines.
SQL optimization
If an SPL routine contains SQL statements, at some point the query optimizer
evaluates the possible query plans for SQL in the SPL routine and selects the query
plan with the lowest cost. The database server puts the selected query plan for
each SQL statement in an execution plan for the SPL routine.
When you create an SPL routine with the CREATE PROCEDURE statement, the
database server attempts to optimize the SQL statements within the SPL routine at
that time. If the tables cannot be examined at compile time (because they do not
exist or are not available), the creation does not fail. In this case, the database
server optimizes the SQL statements the first time that the SPL routine executes.
The database server stores the optimized execution plan in the sysprocplan system
catalog table for use by other processes. In addition, the database server stores
information about the SPL routine (such as procedure name and owner) in the
sysprocedures system catalog table and an ASCII version of the SPL routine in the
sysprocbody system catalog table.
Figure 10-18 summarizes the information that the database server stores in system
catalog tables during the compilation process.
To display the query plan, execute the SET EXPLAIN ON statement prior to one of
the following SQL statements that always tries to optimize the SPL routine:
v CREATE PROCEDURE
v UPDATE STATISTICS FOR PROCEDURE
For example, use the following statements to display the query plan for an SPL
routine:
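SET EXPLAIN ON;
UPDATE STATISTICS FOR PROCEDURE myroutine;
-- myroutine is a placeholder for the name of your SPL routine.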
Automatic reoptimization
In some situations, the database server reoptimizes an SQL statement the next time that an SPL routine executes.
The database server uses a dependency list to keep track of changes that would
cause reoptimization the next time that an SPL routine executes.
The database server reoptimizes an SQL statement the next time an SPL routine
executes after one of the following situations:
v Execution of any data definition language (DDL) statement (such as ALTER
TABLE, DROP INDEX, and CREATE INDEX) that might alter the query plan
v Alteration of a table that is linked to another table with a referential constraint
(in either direction)
v Execution of UPDATE STATISTICS FOR TABLE for any table involved in the
query
The UPDATE STATISTICS FOR TABLE statement changes the version number of
the specified table in systables.
v Renaming a column, database, or index with the RENAME statement
Whenever the SPL routine is reoptimized, the database server updates the
sysprocplan system catalog table with the reoptimized execution plan.
If you do not want to incur the cost of automatic reoptimization when you first
execute an SPL routine after one of the situations that “Automatic reoptimization”
lists, execute the UPDATE STATISTICS statement with the FOR PROCEDURE
clause immediately after the situation occurs. In this way, the SPL routine is
reoptimized before any users execute it.
To prevent unnecessary reoptimization of all SPL routines, ensure that you specify a specific procedure name in the FOR PROCEDURE clause, as the following example shows:
UPDATE STATISTICS FOR PROCEDURE myroutine;
For guidelines to run UPDATE STATISTICS, see “Update statistics when they are
not generated automatically” on page 13-11.
For SPL routines that remain unchanged or change only slightly and that contain
complex SELECT statements, you might want to set the SET OPTIMIZATION
statement to HIGH when you create the SPL routine. This optimization level stores
the best query plans for the SPL routine. Then set optimization to LOW before you
execute the SPL routine. The SPL routine then uses the optimal query plans and
runs at the more cost-effective rate if reoptimization occurs.
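A minimal sketch of this approach follows; the routine name count_calls and the cust_calls table stand in for your own routine and data:
-- Store the best query plans when the routine is created:
SET OPTIMIZATION HIGH;
CREATE PROCEDURE count_calls()
   RETURNING INT;
   DEFINE n INT;
   SELECT COUNT(*) INTO n FROM cust_calls;
   RETURN n;
END PROCEDURE;
-- Execute at the more cost-effective optimization level:
SET OPTIMIZATION LOW;
EXECUTE PROCEDURE count_calls();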
When another user executes an SPL routine, the database server first checks the
UDR cache. SPL execution performance improves when the database server can
execute the SPL routine from the UDR cache. The UDR cache also stores UDRs,
user-defined aggregates, and extended data types definitions.
Related reference:
“Configure and monitor memory caches” on page 4-21
The database server uses a hashing algorithm to store and locate SPL routines in
the UDR cache. You can modify the number of buckets in the UDR cache with the
PC_HASHSIZE configuration parameter. For example, if the value of the
PC_POOLSIZE configuration parameter is 100 and the value of the PC_HASHSIZE
configuration parameter is 10, each bucket can have up to 10 SPL routines and
UDRs.
Too many buckets cause the database server to move out cached SPL routines
when the bucket fills. Too few buckets increase the number of SPL routines in a
bucket, and the database server must search through the SPL routines in a bucket to
determine if the SPL routine that it needs is there.
When the number of entries in a bucket reaches 75 percent of its capacity, the database server removes the least recently used SPL routines from the bucket (and hence from the UDR cache) until the number of SPL routines in the bucket is 50 percent of the maximum number of SPL routines in the bucket.
Trigger execution
A trigger is a database object that automatically executes one or more SQL
statements (the triggered action) when a specified data manipulation language
operation (the triggering event) occurs. You can define one or more triggers on a
table to execute after a SELECT, INSERT, UPDATE or DELETE triggering event.
You can also define INSTEAD OF triggers on a view. These triggers specify the
SQL statements to be executed as triggered actions on the underlying table when a
triggering INSERT, UPDATE or DELETE statement attempts to modify the view.
These triggers are called INSTEAD OF triggers because only the triggered SQL
action is executed; the triggering event is not executed. For more information about
using triggers, see the IBM Informix Guide to SQL: Tutorial and information about
the CREATE TRIGGER statement in the IBM Informix Guide to SQL: Syntax.
When you use the CREATE TRIGGER statement to register a new trigger, the
database server:
v Stores information about the trigger in the systriggers system catalog table.
v Stores the text of the statements that the trigger executes in the systrigbody
system catalog table.
The sysprocedures system catalog table identifies trigger routines that can be
invoked only as triggered actions.
Triggers can reduce the number of messages that pass between the client and the database server. For example, if the trigger fires five SQL statements, the client saves at least 10
messages passed between the client and database server (one to send the SQL
statement and one for the reply after the database server executes the SQL
statement). Triggers improve performance the most when they execute more SQL
statements and the network speed is comparatively slow.
When the database server executes an SQL statement, it must perform the
following actions:
v Determine if triggers must be fired
v Retrieve the triggers from systriggers and systrigbody
These operations cause only a slight performance impact that can be offset by the
decreased number of messages passed between the client and the server.
When a SELECT trigger exists on a table that is involved in a table hierarchy, the database server does not know which columns are affected in the table hierarchy, so it can execute the query differently. The following behaviors
might occur:
v Key-only index scans are disabled on the table that is involved in a table
hierarchy.
v If the database server needs to sort data selected from a table involved in a table
hierarchy, it copies all of the columns in the SELECT list to the temporary table,
not just the sort columns.
v If the database server uses the table included in the table hierarchy to build a
hash table for a hash join with another table, it bypasses the early projection,
meaning it uses all of the columns from the table to build the hash table, not just
the columns in the join.
v If the SELECT statement contains a materialized view (meaning a temporary
table must be built for the columns in a view) that contains columns from a
table involved in a table hierarchy, all columns from the table are included in the
temporary table, not just the columns actually contained in the view.
In SELECT statements whose tables do not fire SELECT triggers, the database
server sends more than one row back to the client and stores the rows in a buffer
even though the client application requested only one row with a FETCH
statement. However, for SELECT statements that contain one or more tables that
fire a SELECT trigger, the database server sends only the requested row back to the
client instead of a buffer full. The database server cannot return other rows to the
client until the trigger action occurs.
Optimizer directives can either be explicit directions (for example, "use this index"
or "access this table first"), or they can eliminate possible query plans (for example,
"do not read this table sequentially" or "do not perform a nested-loop join").
Client users can also set an environment variable to apply these optimizer directives to queries when they do not want to insert comments in SQL statements.
External directives are useful when it is not feasible to rewrite a query for a
short-term solution to a problem, for example, when a query starts to perform
poorly. Rewriting the query by changing the SQL statement is preferable for
long-term solutions to problems.
External directives are for occasional use only. The number of directives stored in
the sysdirectives catalog should not exceed 50. A typical enterprise only needs 0 to
9 directives.
Before you decide when to use optimizer directives, you should understand what
makes a good query plan.
The optimizer creates a query plan based on costs of using different table-access
paths, join orders, and join plans.
For information about query plans, see “The query plan” on page 10-1. For more information about directives, see the sections that follow.
To prepare for using directives, make sure that you perform the following tasks:
v Run UPDATE STATISTICS.
Without accurate statistics, the optimizer cannot choose the appropriate query
plan. Run UPDATE STATISTICS any time that the data in the tables changes
significantly (many new rows are added, updated, or deleted). For more
information, see “Update the statistics for the number of rows” on page 13-12.
v Create distributions.
One of the first things that you should try when you attempt to improve a slow
query is to create distributions on columns involved in a query. Distributions
provide the most accurate information to the optimizer about the nature of the
data in the table. Run UPDATE STATISTICS HIGH on columns involved in the
query filters to see if performance improves. For more information, see
“Creating data distributions” on page 13-13.
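For example, the following statement, which uses the demonstration database's cust_calls table as a stand-in for your own table, creates high-resolution distributions on a filter column:
UPDATE STATISTICS HIGH FOR TABLE cust_calls (call_code);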
In some cases, the query optimizer does not choose the best query plan because of
the complexity of the query or because (even with distributions) it does not have
enough information about the nature of the data. In these cases, you can attempt to
improve performance for a particular query by using directives.
Include the directives in the SQL statement as a comment that occurs immediately
after the SELECT, UPDATE, or DELETE keyword. The first character in a directive
is always a plus (+) sign. In the following query (an illustration against the demonstration database), the ORDERED directive specifies that the tables are joined in the same order in which the FROM clause lists them:
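SELECT --+ORDERED
c.customer_num, o.order_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;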
For a complete syntax description for directives, see the IBM Informix Guide to SQL:
Syntax.
To influence the choice of a query plan that the optimizer makes, you can alter the
following aspects of a query:
v Access method
v Join order
v Join method
v Optimization goal
v Star-join directives
You can also use EXPLAIN directives instead of the SET EXPLAIN statement to
show the query plan. The following sections describe these aspects in detail.
Access-method directives
The database server uses an access method to access a table. The server can either
read the table sequentially via a full table scan or use any one of the indexes on
the table. Access-method directives influence the access method.
The following directives influence the access method:
v INDEX: Use the index that the directive specifies to access the table.
v AVOID_INDEX: Do not use the index that the directive specifies.
v FULL: Perform a full table scan.
v AVOID_FULL: Do not perform a full table scan on the listed table.
v INDEX_SJ: Use an index self-join path with the specified index.
v AVOID_INDEX_SJ: Do not use an index self-join path with the specified index.
In some cases, forcing an access method can change the join method that the
optimizer chooses. For example, if you exclude the use of an index with the
AVOID_INDEX directive, the optimizer might choose a hash join instead of a
nested-loop join.
The optimizer considers an index self-join path only if all of the following
conditions are met:
v The index is a B-tree index that does not have functional keys, user-defined types, or built-in opaque types
v Data distribution statistics are available for the index key column under
consideration
v The number of rows in the table is at least 10 times the number of unique
combinations of all possible lead-key column values.
If all of these conditions are met, the optimizer estimates the cost of an index
self-join path and compares it with the costs of alternative access methods. The
optimizer then picks the best access method for the table. For more information
about the access-method directives and some examples of their usage, see the IBM
Informix Guide to SQL: Syntax.
Join-order directives
The join-order directive ORDERED tells the optimizer to join tables in the order
that the SELECT statement lists them.
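For example, consider a query of the following form, which lists the employee table first (the same query as the reordered example below):
SELECT --+ORDERED, AVOID_FULL(e)
* FROM employee e, department d
WHERE e.dept_no = d.dept_no AND e.salary > 5000;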
In this example, the optimizer chooses to join the tables with a hash join. However,
if you arrange the order so that the second table is employee (and must be
accessed by an index), the hash join is not feasible.
SELECT --+ORDERED, AVOID_FULL(e)
* FROM department d, employee e
WHERE e.dept_no = d.dept_no AND e.salary > 5000;
In this case, the optimizer chooses a nested-loop join instead.
Join-method directives
The join-method directives influence how the database server joins two tables in a
query.
The following directives influence the join method between two tables:
v USE_NL: Use a nested-loop join for the listed tables.
v AVOID_NL: Do not use a nested-loop join for the listed tables.
v USE_HASH: Use a hash join for the listed tables.
v AVOID_HASH: Do not use a hash join for the listed tables.
You can specify the keyword /BUILD after the name of a table in a USE_HASH or AVOID_HASH optimizer directive:
v With USE_HASH directives, the /BUILD keyword tells the optimizer to use the
specified table to build the hash table.
v With AVOID_HASH, the /BUILD keyword tells the optimizer to avoid using the
specified table to build the hash table.
You can specify the keyword /PROBE after the name of a table in a USE_HASH or AVOID_HASH optimizer directive:
v With USE_HASH directives, the /PROBE keyword tells the optimizer to use the
specified table to probe the hash table.
v With AVOID_HASH directives, the /PROBE keyword tells the optimizer to
avoid using the specified table to probe the hash table.
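For example, the following sketch, which reuses the tables from the join-order example, tells the optimizer to use a hash join and to build the hash table on the department table:
SELECT --+USE_HASH(d/BUILD)
* FROM employee e, department d
WHERE e.dept_no = d.dept_no;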
Optimization-goal directives
In some queries, you might want to find only the first few rows in the result of a
query. Or, you might know that all rows must be accessed and returned from the
query. You can use the optimization-goal directives to find the first row that
satisfies the query or all rows that satisfy the query.
For example, you might want to find only the first few rows in the result of a
query, because an Informix ESQL/C program opens a cursor for the query and
performs a FETCH to find only the first row.
Use the optimization-goal directives to optimize the query for either one of these
cases:
v FIRST_ROWS
Choose a plan that optimizes the process of finding only the first row that
satisfies the query.
v ALL_ROWS
Choose a plan that optimizes the process of finding all rows (the default
behavior) that satisfy the query.
If you use the FIRST_ROWS directive, the optimizer might abandon a query plan
that contains activities that are time-consuming up front. For example, a hash join
might take too much time to create the hash table. If only a few rows must be
returned, the optimizer might choose a nested-loop join instead.
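For example, the following sketch, again using the employee table from the earlier examples, optimizes the query for returning the first qualifying rows quickly:
SELECT --+FIRST_ROWS
* FROM employee e
WHERE e.salary > 5000;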
Star-join directives
Star-join directives can specify how the query optimizer joins two or more tables,
among which one or more dimension tables have foreign-key dependencies on one
or more fact tables.
The following directives can influence the join plan for tables that logically
participate in a star schema or in a snowflake schema:
v FACT
The optimizer considers a query plan in which the specified table is a fact table
in a star-join execution plan.
v AVOID_FACT
The optimizer does not consider a star-join execution plan that treats the
specified table (or any of the tables in the list of tables) as a fact table.
v STAR_JOIN
The optimizer favors a star-join execution plan, if available.
v AVOID_STAR_JOIN
The optimizer chooses a query execution plan that is not a star-join plan.
These star-join directives have no effect unless the parallel database query feature
(PDQ) is enabled.
Related concepts:
Concepts of dimensional data modeling (Data Warehouse Guide)
Keys to join the fact table with the dimension tables (Data Warehouse Guide)
Use the snowflake schema for hierarchical dimension tables (Data Warehouse
Guide)
Related reference:
Star-Join Directives (SQL Syntax)
EXPLAIN directives
You can use the EXPLAIN directives to display the query plan that the optimizer chooses. With the AVOID_EXECUTE directive, you can display the query plan without running the query.
When you want to display the query plan for one SQL statement only, use these
EXPLAIN directives instead of the SET EXPLAIN ON or SET EXPLAIN ON
AVOID_EXECUTE statements.
When you use AVOID_EXECUTE (either the directive or in the SET EXPLAIN
statement), the query does not execute but displays the following message:
No rows returned.
Figure 11-1 shows sample output for a query that uses the EXPLAIN
AVOID_EXECUTE directive.
QUERY:
------
select --+ explain avoid_execute
l.customer_num, l.lname, l.company,
l.phone, r.call_dtime, r.call_descr
from customer l, cust_calls r
where l.customer_num = r.customer_num
DIRECTIVES FOLLOWED:
EXPLAIN
AVOID_EXECUTE
DIRECTIVES NOT FOLLOWED:
Estimated Cost: 7
Estimated # of Rows Returned: 7
The following lines in Figure 11-1 describe the chosen query plan:
v DIRECTIVES FOLLOWED: lists the directives that the optimizer applied (here, EXPLAIN and AVOID_EXECUTE).
v DIRECTIVES NOT FOLLOWED: lists any directives that the optimizer could not apply (here, none).
v Estimated Cost: shows the optimizer's cost estimate for the chosen query plan.
v Estimated # of Rows Returned: shows the optimizer's estimate of the size of the result set.
The following example shows how directives can alter the query plan.
You run the query with SET EXPLAIN ON to display the query path that the
optimizer uses.
QUERY:
------
SELECT * FROM emp,job,dept
WHERE emp.location = "DENVER"
AND emp.jobno = job.jobno
AND emp.deptno = dept.deptno
AND dept.location = "DENVER"
Estimated Cost: 5
Estimated # of Rows Returned: 1
The diagram in Figure 11-2 shows a possible query plan for this query. The query
plan has three levels of information: (1) a nested-loop join, (2) an index scan on
one table and a nested-loop join, and (3) index scans on two other tables.
Figure 11-2. A possible query plan for the query, using nested-loop joins (the job table is scanned with index ix2)
Perhaps you are concerned that using a nested-loop join might not be the fastest
method to execute this query. You also think that the join order is not optimal. You
can force the optimizer to choose a hash join and order the tables in the query plan
according to their order in the query, so the optimizer uses the query plan that
Figure 11-3 shows. This query plan has three levels of information: (1) a hash join, (2) an index scan and a hash join, and (3) an index scan on two other tables.
Figure 11-3. A query plan that uses hash joins (the emp table is scanned with index ix1 and the job table is read with a full table scan)
To force the optimizer to choose the query plan that uses hash joins and the order
of tables shown in the query, use the directives that the following partial SET
EXPLAIN output shows:
QUERY:
------
SELECT {+ORDERED,
INDEX(emp ix1),
FULL(job),
USE_HASH(job/BUILD),
USE_HASH(dept/BUILD),
INDEX(dept ix3)}
* FROM emp,job,dept
WHERE emp.location = "DENVER"
AND emp.jobno = job.jobno
AND emp.deptno = dept.deptno
AND dept.location = "DENVER"
DIRECTIVES FOLLOWED:
ORDERED
INDEX ( emp ix1 )
FULL ( job )
USE_HASH ( job/BUILD )
USE_HASH ( dept/BUILD )
INDEX ( dept ix3 )
DIRECTIVES NOT FOLLOWED:
Estimated Cost: 7
Estimated # of Rows Returned: 1
Any directives in an SQL statement take precedence over the join plan that the
OPTCOMPIND configuration parameter forces. For example, if a query includes
the USE_HASH directive and OPTCOMPIND is set to 0 (nested-loop joins
preferred over hash joins), the optimizer uses a hash join.
The optimizer creates a query plan for a SELECT statement in an SPL routine
when the database server creates an SPL routine or during the execution of some
versions of the UPDATE STATISTICS statement.
The optimizer reads and applies directives at the time that it creates the query
plan. Because it stores the query plan in a system catalog table, the SELECT
statement is not reoptimized when it is executed. Therefore, settings of
IFX_DIRECTIVES and DIRECTIVES affect SELECT statements inside an SPL
routine when they are set at any of the following times:
v Before the CREATE PROCEDURE statement
v Before the UPDATE STATISTICS statements that cause SQL in SPL to be
optimized
v During certain circumstances when SELECT statements have variables supplied
at runtime
A prepared statement can fail if objects that it references change after the statement is prepared. This error can occur with explicitly prepared statements. These statements have the following form:
PREPARE statement_id FROM quoted_string
After a statement has been prepared in the database server and before execution of
the statement, a table to which the statement refers might have been renamed or
altered, possibly changing the structure of the table. Problems can occur as a result.
Adding an index to the table after preparing the statement can also invalidate the
statement. A subsequent OPEN command for a cursor fails if the cursor refers to
the invalid prepared statement; the failure occurs even if the OPEN command has
the WITH REOPTIMIZATION clause.
If an index was added after the statement was prepared, you must prepare the
statement again and declare the cursor again. You cannot simply reopen the cursor
if it was based on a prepared statement that is no longer valid.
Each SPL routine is optimized the first time that it is run (not when it is created).
This behavior means that an SPL routine might succeed the first time it is run but
fail later under virtually identical circumstances. The failure of an SPL routine can
also be intermittent, because failure during one execution forces an internal
warning to reoptimize the procedure before the next execution.
The database server keeps a list of tables that the SPL routine references explicitly.
Whenever any one of these explicitly referenced tables is modified, the database
server reoptimizes the procedure the next time the procedure is executed.
However, if the SPL routine depends on a table that is referenced only indirectly,
the database server cannot detect the need to reoptimize the procedure after that
table is changed. For example, a table can be referenced indirectly if the SPL
routine invokes a trigger. If a table that is referenced by the trigger (but not
directly by the SPL routine) is changed, the database server does not know that it
should reoptimize the SPL routine before running it. When the procedure is run
after the table has been changed, this error can occur.
To prevent this error, you can force reoptimization of the SPL routine. To force
reoptimization, execute the following statement:
UPDATE STATISTICS FOR PROCEDURE procedure_name
You can add this statement to your program in either of the following ways:
v Place the UPDATE STATISTICS statement after each statement that changes the
mode of an object.
v Place the UPDATE STATISTICS statement before each execution of the SPL
routine.
For efficiency, you can put the UPDATE STATISTICS statement with the action that
occurs less frequently in the program (change of object mode or execution of the
procedure). In most cases, the action that occurs less frequently in the program is
the change of object mode.
When you follow this method of recovering from this error, you must execute the
UPDATE STATISTICS statement for each procedure that references the changed
tables indirectly unless the procedure also references the tables explicitly.
You can also recover from this error by simply rerunning the SPL routine. The first
time that the stored procedure fails, the database server marks the procedure as
needing reoptimization. The next time that you run the procedure, the database
server reoptimizes the procedure before running it. However, running the SPL
routine twice might not be practical or safe. A safer choice is to use the UPDATE
STATISTICS statement to force reoptimization of the procedure.
Use the SAVE EXTERNAL DIRECTIVES statement to create the association record for the list of one or more query directives. These directives are applied automatically to subsequent instances of the same query.
The following example shows a SAVE EXTERNAL DIRECTIVES statement that registers an association record in the system catalog as a new row in the sysdirectives table that can be used as a query optimizer directive.
SAVE EXTERNAL DIRECTIVES {+INDEX(t1,i11)} ACTIVE FOR
SELECT {+INDEX(t1, i2) } c1 FROM t1 WHERE c1=1;
The following data is stored in the association record that the SQL statement above
defined:
id 16
query select {+INDEX(t1, i2) } c1 from t1 where c1=1
directive INDEX(t1,i11)
directivecode BYTE value
active 1
hashcode -589336273
The EXT_DIRECTIVES configuration parameter specifies whether external directives can be enabled. The following values are valid:
Value Explanation
0 (default) Off. The directive cannot be enabled, even if IFX_EXTDIRECTIVES is enabled.
1 On. External directives can be enabled for a session in which the IFX_EXTDIRECTIVES environment variable is set in the client environment.
2 On. External directives are enabled for all sessions, even if IFX_EXTDIRECTIVES is not set in the client environment.
You can also use the EXTDIRECTIVES option of the SET ENVIRONMENT
statement to enable or disable external directives during a session. What you
specify with the EXTDIRECTIVES option overwrites the external directive setting
that is specified in the EXT_DIRECTIVES configuration parameter in the
ONCONFIG file.
To overwrite the value for enabling or disabling the external directive in the
ONCONFIG file:
v To enable the external directives during a session, specify 1, on, or ON as the
value for SET ENVIRONMENT EXTDIRECTIVES.
v To disable the external directives during a session, specify 0, off, or OFF as the
value for SET ENVIRONMENT EXTDIRECTIVES.
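For example, the following statements enable and then disable external directives for the current session (assuming the value is supplied as a quoted string):
-- Enable external directives for this session:
SET ENVIRONMENT EXTDIRECTIVES '1';
-- Disable external directives for this session:
SET ENVIRONMENT EXTDIRECTIVES '0';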
The explain output file specifies whether external directives are in effect.
Related concepts:
“The explain output file” on page 10-10
“Query statistics section provides performance debugging information” on page
10-11
“Report that shows the query plan chosen by the optimizer” on page 10-9
Using the FILE TO option (SQL Syntax)
Default name and location of the explain output file on UNIX (SQL Syntax)
Default name and location of the output file on Windows (SQL Syntax)
Related reference:
SET EXPLAIN statement (SQL Syntax)
onmode -Y: Dynamically change SET EXPLAIN (Administrator's Reference)
onmode and Y arguments: Change query plan measurements for a session
(SQL administration API) (Administrator's Reference)
When external directives are enabled and the sysdirectives system catalog table is not empty, the database server compares every query with the query text of every ACTIVE external directive.
What PDQ is
Parallel database query (PDQ) is a database server feature that can improve
performance dramatically when the server processes queries that decision-support
applications initiate. PDQ enables Informix to distribute the work for one aspect of
a query among several processors. For example, if a query requires an aggregation,
Informix can distribute the work for the aggregation among several processors.
Another database server feature, table fragmentation, allows you to store the parts of
a table on different disks. PDQ delivers maximum performance benefits when the
data that you query is in fragmented tables. For information about how to use
fragmentation for maximum performance, see “Planning a fragmentation strategy”
on page 9-1.
Related concepts:
“Database server operations that use PDQ” on page 12-2
“The allocation of resources for parallel database queries” on page 12-7
“Managing PDQ queries” on page 12-12
“Monitoring resources used for PDQ and DSS queries” on page 12-16
The database server initiates these PDQ threads, which are listed as secondary
threads in the SET EXPLAIN output.
Several producers can supply data to a single consumer. When this situation
occurs, the database server sets up an internal mechanism, called an exchange, that
synchronizes the transfer of data from those producers to the consumer. For
instance, if a fragmented table is to be sorted, the optimizer typically calls for a
separate scan thread for each fragment. Because of different I/O characteristics, the
scan threads can be expected to complete at different times. An exchange is used to funnel the data produced by the various scan threads into one or more threads that consume it, with minimum buffering.
The database server creates these threads and exchanges automatically and
transparently. They terminate automatically as they complete processing for a
given query. The database server creates new threads and exchanges as needed for
subsequent queries.
In the topics on database server operations that use PDQ in this section, a query is
any SELECT statement.
Related concepts:
“What PDQ is” on page 12-1
The database server takes the following two steps to process UPDATE and
DELETE statements:
1. Fetches the qualifying rows.
2. Applies the action of updating or deleting.
The database server performs the first step of an UPDATE or DELETE statement in
parallel, with the following exceptions:
v The targeted table in a DELETE statement has a referential constraint that can
cascade to a child table.
v The UPDATE or DELETE statement contains an OR clause and the optimizer
chooses an OR index to process the OR filter.
v The UPDATE statement contains a subquery that the optimizer converts into a
join.
The types of insert operations that the server performs in parallel are:
v SELECT...INTO TEMP inserts using explicit temporary tables.
v INSERT INTO...SELECT inserts using implicit temporary tables.
For example, the database server can perform the inserts in parallel into the
temporary table, temp_table, as the following example shows:
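SET PDQPRIORITY 50;
SELECT * FROM table1 INTO TEMP temp_table;
-- table1 is a placeholder for a fragmented source table.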
The database server performs the parallel insert by writing in parallel to each of
the fragments in a round-robin fashion. Performance improves as you increase the
number of fragments.
For example, the database server processes the following INSERT statement in
parallel:
INSERT INTO target_table SELECT * FROM source_table
The database server processes this type of INSERT statement in parallel only when
the target tables meet the following criteria:
v The value of PDQ priority is greater than 0.
v The target table is fragmented into two or more dbspaces.
v The target table has no enabled referential constraints or triggers.
v The target table is not a remote table.
v In a database with logging, the target table does not contain filtering constraints.
v The target table does not contain columns of TEXT or BYTE data type.
The database server does not process parallel inserts that reference an SPL routine.
For example, the database server never processes the following statement in
parallel:
INSERT INTO table1 EXECUTE PROCEDURE ins_proc
When PDQ is in effect, the scans for index builds are controlled by the PDQ
configuration parameters described in “The allocation of resources for parallel
database queries” on page 12-7.
If you have a computer with multiple CPUs, the database server uses two sort
threads to sort the index keys. The database server uses two sort threads during
index builds without the user setting the PSORT_NPROCS environment variable.
The database server can perform the following parallel operations if the UDR is
written and registered appropriately:
v Parallel scans
v Parallel comparisons with the UDR
For more information about how to enable parallel execution of UDRs, see
“Parallel UDRs” on page 13-38.
For example, the server does not process the following types of queries in parallel:
v Queries started with an isolation level of Cursor Stability
Subsequent changes to the isolation level do not affect the parallelism of queries
already prepared. This situation results from the inherent nature of parallel
scans, which scan several rows simultaneously.
v Queries that use a cursor declared as FOR UPDATE
v Queries in the FOR EACH ROW section of the Action clause of a Select trigger
v A DELETE or MERGE statement in the FOR EACH ROW section of the Action clause
of a Delete trigger
v An INSERT or MERGE statement in the FOR EACH ROW section of the Action
clause of an Insert trigger
v An UPDATE or MERGE statement in the FOR EACH ROW section of the Action
clause of an Update trigger
v Data definition language (DDL) statements.
For a complete list of the DDL statements of SQL that Informix supports, see the
IBM Informix Guide to SQL: Syntax.
The database server must allocate the memory that the UPDATE STATISTICS
statement uses for sorting.
If you have an extremely large database and indexes are fragmented, UPDATE
STATISTICS LOW can automatically run statements in parallel. For more
information, see “Update statistics in parallel on very large databases” on page
13-17.
When the database server executes an SPL routine, it does not use PDQ to process
non-related SQL statements contained in the procedure. Each SQL statement can be
executed independently in parallel, however, using intraquery parallelism when
possible. As a consequence, you should limit the use of procedure calls from
within data manipulation language (DML) statements if you want to use the
parallel-processing abilities of the database server. For a complete list of DML
statements, see the IBM Informix Guide to SQL: Syntax.
The database server uses intraquery parallelism to process the statements in the
body of an SQL trigger in the same way that it processes statements in SPL
routines. For restrictions on using PDQ for queries in some triggered actions of
Select, Insert, and Update triggers, see “SQL operations that do not use PDQ” on
page 12-4.
For uncorrelated subqueries, only the first thread that makes the request actually
executes the subquery. Other threads simply use the results of the subquery and
can do so in parallel.
For a query that accesses a remote database, all local scans are parallel, but all local joins and remote accesses are nonparallel.
When your database server system has heavy OLTP use, and you find performance
is degrading, you can use the MGM facilities to limit the resources that are
committed to decision-support queries. During off-peak hours, you can designate a
larger proportion of the resources to parallel processing, which achieves higher
throughput for decision-support queries.
The MGM grants memory to a query for such activities as sorts, hash joins, and
processing of GROUP BY clauses. The amount of memory that decision-support
queries use cannot exceed DS_TOTAL_MEMORY.
To monitor resources that the MGM allocates, run the onstat -g mgm command.
This command shows only the amount of memory that is used; it does not show
the amount of memory that is granted.
The following formula yields the maximum number of scan threads per query:
scan_threads = min (nfrags, DS_MAX_SCANS * (pdqpriority / 100)
* (MAX_PDQPRIORITY / 100))
nfrags Is the number of fragments in the table with the largest number of
fragments.
pdqpriority
Is the value for PDQ priority that is set by either the PDQPRIORITY
environment variable or the SQL statement SET PDQPRIORITY.
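To illustrate, with the hypothetical values nfrags = 12, DS_MAX_SCANS = 10, pdqpriority = 50, and MAX_PDQPRIORITY = 80, the limit is:
scan_threads = min (12, 10 * (50 / 100) * (80 / 100)) = min (12, 4) = 4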
The PDQPRIORITY environment variable and the SQL statement SET PDQPRIORITY
request a percentage of PDQ resources for a query. You can use the
MAX_PDQPRIORITY configuration parameter to limit the percentage of the
requested resources that a query can obtain and to limit the impact of
decision-support queries on OLTP processing.
Related concepts:
Chapter 4, “Effect of configuration on memory utilization,” on page 4-1
“Limiting the priority of decision-support queries” on page 12-8
“The DS_TOTAL_MEMORY configuration parameter and memory utilization” on
page 4-12
Related reference:
onstat -g mgm command: Print MGM resource information (Administrator's
Reference)
When the database server uses PDQ to perform a query in parallel, it puts a heavy
load on the operating system. In particular, PDQ exploits the following resources:
v Memory
v CPU VPs
v Disk I/O (to fragmented tables and temporary table space)
v Scan threads
You can control how the database server uses resources in the following ways:
v Limit the priority of parallel database queries.
v Adjust the amount of memory.
v Limit the number of scan threads.
v Limit the number of concurrent queries.
Related concepts:
“What PDQ is” on page 12-1
The default value for the PDQ priority of individual applications is 0, which means
that PDQ processing is not used. The database server uses this value unless one of
the following actions overrides it:
v You set the PDQPRIORITY environment variable.
v The application uses the SET PDQPRIORITY statement.
An application or user can use the DEFAULT tag of the SET PDQPRIORITY
statement to use the value for PDQ priority if the value has been set by the
PDQPRIORITY environment variable. DEFAULT is the symbolic equivalent of a
-1 value for PDQ priority.
You can use the onmode command-line utility to change the values of the
following configuration parameters temporarily:
v Use onmode -M to change the value of DS_TOTAL_MEMORY.
v Use onmode -Q to change the value of DS_MAX_QUERIES.
v Use onmode -D to change the value of MAX_PDQPRIORITY.
v Use onmode -S to change the value of DS_MAX_SCANS.
These changes remain in effect only as long as the database server remains up and
running. When the database server starts, it uses the values listed in the
ONCONFIG file.
For more information about the preceding parameters, see Chapter 4, “Effect of
configuration on memory utilization,” on page 4-1. For more information about
onmode, see your IBM Informix Administrator's Reference.
The MAX_PDQPRIORITY configuration parameter limits the PDQ priority that the
database server grants when users either set the PDQPRIORITY environment
variable or issue the SET PDQPRIORITY statement before they issue a query.
When an application or an end user attempts to set a PDQ priority, the priority that is granted is the requested value multiplied by the percentage that MAX_PDQPRIORITY specifies.
Set the value of MAX_PDQPRIORITY lower when you want to allocate more
resources to OLTP processing.
Set the value of MAX_PDQPRIORITY higher when you want to allocate more
resources to decision-support processing.
The possible range of values is 0 to 100. This range represents the percent of
resources that you can allocate to decision-support processing.
To devote resources entirely to OLTP processing, set MAX_PDQPRIORITY to 0, which limits the value of PDQ priority
to OFF. A PDQ priority value of OFF does not prevent decision-support queries
from running. Instead, it causes the queries to run without parallelization. In this
configuration, response times for decision-support queries might be slow.
The database server lowers the PDQ priority of queries that require access to a
remote database (same or different database server instance) to LOW if you set it to
a higher value. In that case, all local scans are parallel, but all local joins and
remote accesses are nonparallel.
The PDQ priority value that the database server uses to optimize or reoptimize an
SQL statement is the value that was set by a SET PDQPRIORITY statement, which
must have been executed within the same procedure. If no such statement has
been executed, the value that was in effect when the procedure was last compiled
or created is used.
The PDQ priority value currently in effect outside a procedure is ignored within a
procedure when it is executing.
It is suggested that you turn PDQ priority off when you enter a procedure and
then turn it on again for specific statements. You can avoid tying up large amounts
of memory for the procedure, and you can make sure that the crucial parts of the
procedure use the appropriate PDQ priority, as the following example illustrates:
CREATE PROCEDURE my_proc (a INT, b INT, c INT)
RETURNING INT, INT, INT;
SET PDQPRIORITY 0;
...
SET PDQPRIORITY 85;
SELECT ... (big complicated SELECT statement)
SET PDQPRIORITY 0;
...
END PROCEDURE;
Use the following formula as a starting point for estimating the amount of shared
memory to allocate to DSS queries:
DS_TOTAL_MEMORY = p_mem - os_mem - rsdnt_mem -
(128 kilobytes * users) - other_mem
p_mem represents the total physical memory that is available on the host
computer.
os_mem
represents the size of the operating system, including the buffer cache.
rsdnt_mem
represents the size of Informix resident shared memory.
users is the number of expected users (connections) specified in the NETTYPE
configuration parameter.
other_mem
is the size of memory used for other (non-IBM Informix) applications.
The value for DS_TOTAL_MEMORY that is derived from this formula serves only
as a starting point. To arrive at a value that makes sense for your configuration,
you must monitor paging and swapping. (Use the tools provided with your
operating system to monitor paging and swapping.) When paging increases,
decrease the value of DS_TOTAL_MEMORY so that processing the OLTP workload
can proceed.
The amount of memory that is granted to a single parallel database query depends
on many system factors, but in general, the amount of memory granted to a single
parallel database query is proportional to the following formula:
memory_grant_basis = (DS_TOTAL_MEMORY/DS_MAX_QUERIES) *
(PDQPRIORITY / 100) *
(MAX_PDQPRIORITY / 100)
However, if the currently executing queries on all databases of the server instance
require more memory than this estimate of the average allocation, another query
might overflow to disk or might wait until concurrent queries complete execution and release sufficient memory resources for running the query. The following
alternative formula estimates the PDQ memory available for a single query
directly:
memory_for_single_query = DS_TOTAL_MEMORY *
(PDQPRIORITY / 100) *
(MAX_PDQPRIORITY / 100)
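To illustrate, with the hypothetical settings DS_TOTAL_MEMORY = 400,000 KB, PDQPRIORITY = 50, and MAX_PDQPRIORITY = 80:
memory_for_single_query = 400,000 KB * (50 / 100) * (80 / 100) = 160,000 KB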
For example, suppose a large table contains 100 fragments. With no limit on the
number of concurrent scans allowed, the database server would concurrently
execute 100 scan threads to read this table. In addition, many users could initiate
this query.
To estimate the number of decision-support (DSS) queries that the database server
can run concurrently, count each query that runs with PDQ priority set to 1 or
greater as one full query.
The database server allocates less memory to queries that run with a lower
priority, so you can assign lower-priority queries a PDQ priority value that is
between 1 and 30, depending on the resource impact of the query. The total
number of queries with PDQ priority values greater than 0 cannot exceed the
value of DS_MAX_QUERIES.
You can restructure a query or use OPTCOMPIND to change how the optimizer
treats the query.
Set OPTCOMPIND to 0 if you want the database server to select a join plan
exactly as it did in versions of the database server prior to version 6.0. This option
ensures compatibility with previous versions of the database server.
An application with an isolation mode of Repeatable Read can lock all records in a
table when it performs a hash join. For this reason, you should set OPTCOMPIND
to 1.
If you want the optimizer to make the determination for you based on cost,
regardless of the isolation level of applications, set OPTCOMPIND to 2.
You can use the SET ENVIRONMENT OPTCOMPIND command to change the
value of OPTCOMPIND within a session. For more information about using this
command, see “Setting the value of OPTCOMPIND within a session” on page 3-11.
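For example, the following statement, in which the value is supplied as a quoted string, sets OPTCOMPIND to 2 for the current session only:
SET ENVIRONMENT OPTCOMPIND '2';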
For more information about OPTCOMPIND and the different join plans, see “The
query plan” on page 10-1.
The PDQ priority set with the SET PDQPRIORITY statement supersedes the
PDQPRIORITY environment variable.
The DEFAULT tag for the SET PDQPRIORITY statement allows an application to
revert to the value for PDQ priority as set by the environment variable, if any. For
more information about the SET PDQPRIORITY statement, see the IBM Informix
Guide to SQL: Syntax.
To enable the database server to determine memory allocations for queries and to
distribute memory among query operators according to their needs, enter the
following statement before you issue the query:
SET ENVIRONMENT IMPLICIT_PDQ ON;
If you instead set the IMPLICIT_PDQ value to an integer in the range from 1 to
100, the database server scales its estimate by the specified value. If you set a low
value, the amount of memory assigned to the query is reduced, which might
increase the amount of query-operator overflow.
For example, the following statement forces the database server to use explicit
PDQPRIORITY values as guidelines in allocating memory, if the IMPLICIT_PDQ
session environment option has already been set:
SET ENVIRONMENT BOUND_IMPL_PDQ ON;
If you instead specify a positive integer in the range from 1 to 100, the explicit
PDQPRIORITY value is scaled by that setting during the current session.
The resources that you request and the amount of resources that the database
server allocates for the query can differ. This difference occurs when the database
server administrator uses the MAX_PDQPRIORITY configuration parameter to put
a ceiling on user-requested resources, as the following topic explains.
You can also limit the number of concurrent decision-support scans that the
database server allows by setting the DS_MAX_SCANS configuration parameter.
You can use the onstat -g mgm command to monitor how the Memory Grant Manager (MGM) coordinates memory use and scan threads.
Related reference:
onstat -g mgm command: Print MGM resource information (Administrator's
Reference)
The onstat -u option lists all the threads for a session. If a session is running a
decision-support query, the output lists the primary thread and any additional
threads. For example, session 10 in Figure 12-1 on page 12-17 has a total of five
threads running.
The onstat -g ath output also lists these threads and includes a name column that
indicates the role of the thread. Threads that a primary decision-support thread
started have a name that indicates their role in the decision-support query. For
example, Figure 12-2 lists four scan threads, started by a primary thread (sqlexec).
Threads:
tid tcb rstcb prty status vp-class name
...
11 994060 0 4 sleeping(Forever) 1cpu kaio
12 994394 80f2a4 2 sleeping(secs: 51) 1cpu btclean
26 99b11c 80f630 4 ready 1cpu onmode_mon
32 a9a294 812b64 2 ready 1cpu sqlexec
113 b72a7c 810b78 2 ready 1cpu sqlexec
114 b86c8c 81244c 2 cond wait(netnorm) 1cpu sqlexec
115 b98a7c 812ef0 2 cond wait(netnorm) 1cpu sqlexec
116 bb4a24 80fd48 2 cond wait(netnorm) 1cpu sqlexec
117 bc6a24 81161c 2 cond wait(netnorm) 1cpu sqlexec
118 bd8a24 811290 2 ready 1cpu sqlexec
119 beae88 810f04 2 cond wait(await_MC1) 1cpu scan_1.0
120 a8ab48 8127d8 2 ready 1cpu scan_2.0
121 a96850 810460 2 ready 1cpu scan_2.1
122 ab6f30 8119a8 2 running 1cpu scan_2.2
If PDQ is turned on, the optimizer also indicates the maximum number of threads
that are required to answer the query. The # of Secondary Threads field in the SET
EXPLAIN output indicates the number of threads that are required in addition to
your user session thread. The total number of threads necessary is the number of
secondary threads plus 1.
The following example shows SET EXPLAIN output for a table with fragmentation
and PDQ priority set to LOW:
SELECT * FROM t1 WHERE c1 > 20
Estimated Cost: 2
Estimated # of Rows Returned: 2
# of Secondary Threads = 1
The following example of partial SET EXPLAIN output shows a query with a hash
join between two fragmented tables and PDQ priority set to ON. The query is
marked with DYNAMIC HASH JOIN, and the table on which the hash is built is
marked with Build Outer.
QUERY:
------
SELECT h1.c1, h2.c1 FROM h1, h2 WHERE h1.c1 = h2.c1
Estimated Cost: 2
Estimated # of Rows Returned: 5
# of Secondary Threads = 6
The following example of partial SET EXPLAIN output shows a table with
fragmentation, PDQ priority set to LOW, and an index that was selected as the
access plan:
SELECT * FROM t1 WHERE c1 < 13
Estimated Cost: 2
Estimated # of Rows Returned: 1
# of Secondary Threads = 3
Even if you use your database server as a data warehouse, you might sometimes
test queries on a separate system until you understand the tuning issues that are
relevant to the query.
If you are trying to improve performance of a large query, one that might take
several minutes or hours to complete, you can prepare a scaled-down database in
which your tests can complete more quickly. However, be aware of these potential
problems:
v The optimizer can make different choices in a small database than in a large one,
even when the relative sizes of tables are the same. Verify that the query plan is
the same in the real and the model databases.
v Execution time is rarely a linear function of table size. For example, sorting time
increases faster than table size, as does the cost of indexed access when an index
goes from two to three levels. What appears to be a big improvement in the
scaled-down environment can be insignificant when applied to the full database.
Therefore, any conclusions that you reach as a result of tests in the model database must be tentative until you verify them in the production database.
Keep the following suggestions in mind when you test and adjust queries:
v If you are using a multiuser system or a network, where system load varies
widely from hour to hour, try to perform your experiments at the same time
each day to obtain repeatable results. Start tests when the system load is
consistently light so that you are truly measuring the impact of your query only.
v If the query is embedded in a complicated program, you can extract the SELECT
statement and embed it in a DB-Access script.
Related concepts:
Tune the new version for performance and adjust queries (Migration Guide)
After you study the query plan, examine your data model to see if the changes this
chapter suggests will improve the query.
For more information about these EXPLAIN directives, see “EXPLAIN directives”
on page 11-8.
To control the amount of information that the query evaluates, use the WHERE
clause of the SELECT statement. The conditional expression in the WHERE clause
is commonly called a filter.
For information about how filter selectivity affects the query plan that the
optimizer chooses, see “Filters in the query” on page 10-20. The following sections
provide some guidelines to improve filter selectivity.
You can improve the selectivity if the UDRs have the following features:
v Functional indexes
You can create a functional index on the resulting values of a user-defined routine
or a built-in function that operates on one or more columns. When you create a
functional index, the database server computes the return values of the function
and stores them in the index. The database server can locate the return value of
the function in an appropriate index without executing the function for each
qualifying column.
For more information about indexing user-defined functions, see “Using a
functional index” on page 7-25.
v User-defined selectivity functions
You can write a function that calculates the expected fraction of rows that
qualify for the function. For a brief description of user-defined selectivity
functions, see “Selectivity and cost functions” on page 13-39. For more
information about how to write and register user-defined selectivity functions,
see IBM Informix User-Defined Routines and Data Types Developer's Guide.
An index cannot be used with a regular-expression filter that has a wildcard at the beginning of the operand, so such a filter forces the table to be accessed sequentially.
Regular-expression tests with wildcards in the middle or at the end of the operand
do not prevent the use of an index when one exists.
For example, in the following code, a noninitial substring requires the database
server to test every value in the column:
SELECT * FROM customer
WHERE zipcode[4,5] > '50'
The optimizer uses an index to process a filter that tests an initial substring of an
indexed column. However, the presence of the substring test can interfere with the
use of a composite index to test both the substring column and another column.
For more information about this ANSI join syntax, see the IBM Informix Guide to
SQL: Syntax.
In an ANSI outer join, the database server takes the following actions to process
the filters:
When the database server executes a distributed query that uses ANSI-compliant LEFT OUTER syntax for specifying joined tables and nested-loop joins, it sends the query to each participating database server for operations on the local tables of those servers.
For example, the demonstration database has the customer table and the cust_calls
table, which tracks customer calls to the service department. Suppose a certain call
code had many occurrences in the past, and you want to see if calls of this kind
have decreased. To see if customers no longer have this call code, use an outer join
to list all customers.
Figure 13-1 shows a sample SQL statement to accomplish this ANSI join query and
the SET EXPLAIN ON output for it.
QUERY:
------
SELECT c.customer_num, c.lname, c.company,
c.phone, u.call_dtime, u.call_code, u.call_descr
FROM customer c
LEFT JOIN cust_calls u ON c.customer_num = u.customer_num
ORDER BY u.call_dtime
Estimated Cost: 14
Estimated # of Rows Returned: 29
Temporary Files Required For: Order By
ON-Filters:virginia.c.customer_num = virginia.u.customer_num
NESTED LOOP JOIN(LEFT OUTER JOIN)
In the SET EXPLAIN ON output in Figure 13-1, the ON-Filters line lists the filters in the ON clause, and the NESTED LOOP JOIN(LEFT OUTER JOIN) line indicates the join method for the ANSI join.
Figure 13-2 shows the SET EXPLAIN ON output for an ANSI join with a join filter
that checks for calls with the I call_code.
QUERY:
------
SELECT c.customer_num, c.lname, c.company,
c.phone, u.call_dtime, u.call_code, u.call_descr
FROM customer c LEFT JOIN cust_calls u
ON c.customer_num = u.customer_num
AND u.call_code = 'I'
ORDER BY u.call_dtime
Estimated Cost: 13
Estimated # of Rows Returned: 25
Temporary Files Required For: Order By
ON-Filters:(virginia.c.customer_num = virginia.u.customer_num
AND virginia.u.call_code = 'I' )
NESTED LOOP JOIN(LEFT OUTER JOIN)
Figure 13-2. SET EXPLAIN ON output for a join filter in an ANSI join
The main differences between the output in Figure 13-1 on page 13-4 and
Figure 13-2 are as follows:
v The optimizer chooses a different index to scan the inner table.
This new index exploits more filters and retrieves a smaller number of rows.
Consequently, the join operates on fewer rows.
v The ON clause join filter contains an additional filter.
The value in the Estimated # of Rows Returned line is only an estimate and does
not always reflect the actual number of rows returned. The sample query in
Figure 13-2 returns fewer rows than the query in Figure 13-1 on page 13-4 because
of the additional filter.
Figure 13-3 on page 13-6 shows the SET EXPLAIN ON output for an ANSI join
query that has a filter in the WHERE clause.
Estimated Cost: 3
Estimated # of Rows Returned: 1
Temporary Files Required For: Order By
ON-Filters:(virginia.c.customer_num = virginia.u.customer_num
AND virginia.u.call_code = 'I' )
NESTED LOOP JOIN(LEFT OUTER JOIN)
PostJoin-Filters:virginia.c.zipcode = ’94040’
Figure 13-3. SET EXPLAIN ON output for the WHERE clause filter in an ANSI join
The main differences between the output in Figure 13-2 on page 13-5 and
Figure 13-3 are as follows:
v The index on the zipcode column in the post-join filter is chosen for the
dominant table.
v The PostJoin-Filters line shows the filter in the WHERE clause.
The AUS maintenance system updates the statistics for tables that are in logged
databases, regardless of the database locale. By making current table statistics
available to the query optimizer, the AUS maintenance system can reduce the risk
of performance degradation from inefficient query plans.
Depending on your system, you might need to adjust the AUS expiration policies
or schedule. The AUS maintenance system resides in the sysadmin database.
You can also view and adjust the AUS maintenance system for table statistics in
the IBM OpenAdmin Tool (OAT) for Informix.
The Scheduler tasks, sensors, thresholds, and tables reside in the sysadmin
database. By default, only user informix is granted access to the sysadmin
database.
The following table describes the tasks, sensors, thresholds, tables, and views in
the sysadmin database that comprise the AUS maintenance system.
Table 13-1. AUS components
Component Type Description
mon_table_profile sensor Compiles table profile information, including
the total number of updates, inserts, and deletes
that occurred on each table.
For information about other features of the Scheduler, see its description in the
IBM Informix Administrator's Guide. For information about the sysadmin database,
see the IBM Informix Administrator's Reference.
The ph_threshold table of the sysadmin database stores the following thresholds
for defining AUS expiration policies:
Table 13-2. AUS expiration policy thresholds
Threshold Name Default Value Description
AUS_AGE 30 (days) A time-based expiration policy. Statistics
or distributions are updated for a table
after this amount of time regardless of
how much data has changed.
To change the value of an expiration policy, update the value column in the
ph_threshold table in the sysadmin database.
The new threshold takes effect the next time the Auto Update Statistics Evaluator
task runs.
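For example, the following statement (a sketch; the name and value columns of the ph_threshold table are assumed to identify and store the threshold) reduces the AUS_AGE policy to 15 days:
UPDATE ph_threshold
SET value = "15"
WHERE name = "AUS_AGE";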
To see all UPDATE STATISTICS statements that were run successfully in the
previous 30 days, run this statement:
SELECT * FROM aus_cmd_comp;
To view all UPDATE STATISTICS statements that failed, run this statement:
SELECT aus_cmd_exe, aus_cmd_err_sql, aus_cmd_err_isam
FROM aus_command
WHERE aus_cmd_state = "E";
You can also see this information in the IBM OpenAdmin Tool (OAT) for Informix.
Rescheduling AUS
You can change when and for how long the Auto Update Statistics Refresh task
runs.
To change the schedule of the Auto Update Statistics Refresh task, update the
ph_task table where the value of the tk_name column is Auto Update Statistics
Refresh.
The following example changes the ending time of the task to 6:00 AM:
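-- assumes the tk_stop_time column of ph_task stores the task's daily end time
UPDATE ph_task
SET tk_stop_time = "06:00:00"
WHERE tk_name = "Auto Update Statistics Refresh";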
The following example changes the days that the task is run to every day of the
week (Saturday and Sunday are enabled by default):
UPDATE ph_task
SET tk_monday = "T",
    tk_tuesday = "T",
    tk_wednesday = "T",
    tk_thursday = "T",
    tk_friday = "T"
WHERE tk_name = "Auto Update Statistics Refresh";
Disabling AUS
You can prevent statistics from being updated automatically by disabling the AUS
maintenance system.
To disable AUS, you must disable both the Auto Update Statistics Evaluation task
and the Auto Update Statistics Refresh task:
1. Update the value of the tk_enable column of the ph_task table to F where the
value of the tk_name column is Auto Update Statistics Evaluation.
2. Update the value of the tk_enable column of the ph_task table to F where the
value of the tk_name column is Auto Update Statistics Refresh.
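For example:
UPDATE ph_task
SET tk_enable = "F"
WHERE tk_name = "Auto Update Statistics Evaluation";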
UPDATE ph_task
SET tk_enable = "F"
WHERE tk_name = "Auto Update Statistics Refresh";
Important: You do not need to run UPDATE STATISTICS operations when the
statistics are generated automatically.
To ensure that the optimizer selects a query plan that best reflects the current state
of your tables, run UPDATE STATISTICS at regular intervals when the statistics are
not generated automatically.
For information about the specific statistics that the database server keeps in the
system catalog tables, see “Statistics held for the table and index” on page 10-20.
Related concepts:
“Automatic statistics updating” on page 13-6
Related reference:
UPDATE STATISTICS statement (SQL Syntax)
If the cardinality of a table changes often, run the statement more often for that
table.
To drop the old distribution structure in the sysdistrib system catalog table, run
this statement:
UPDATE STATISTICS DROP DISTRIBUTIONS;
Alternatively, you can use the DROP DISTRIBUTIONS ONLY option in the UPDATE
STATISTICS statement. Using the DROP DISTRIBUTIONS ONLY option can result
in faster performance because the database server does not gather the table and
index statistics that the LOW mode option generates when the ONLY keyword
does not follow the DROP DISTRIBUTIONS keywords.
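For example, the following statement drops all existing distributions without gathering new LOW-mode table and index statistics:
UPDATE STATISTICS DROP DISTRIBUTIONS ONLY;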
For detailed information about how to use the DROP DISTRIBUTIONS ONLY
option, see the IBM Informix Guide to SQL: Syntax.
(You do not need to run UPDATE STATISTICS operations when the statistics are
generated automatically.)
The database server creates data distributions, which provide information to the
optimizer, any time the UPDATE STATISTICS MEDIUM or UPDATE STATISTICS
HIGH command is executed.
The database server creates data distributions by sampling a column's data, sorting
the data, building distribution bins, and inserting the results into the sysdistrib
system catalog table.
You can control the sample size for the scan through the keyword HIGH or
MEDIUM. The difference between UPDATE STATISTICS HIGH and UPDATE
STATISTICS MEDIUM is the number of rows sampled. UPDATE STATISTICS
HIGH scans the entire table, while UPDATE STATISTICS MEDIUM samples only a
subset of rows, based on the confidence and resolution used by the UPDATE
STATISTICS statement.
You can use the LOW keyword with the UPDATE STATISTICS statement only for
fully qualified index keys.
When you use data-distribution statistics for the first time, try to update statistics
in MEDIUM mode for all your tables and then update statistics in HIGH mode for
all columns that head indexes. This strategy produces statistical query estimates for
the columns that you specify. These estimates, on average, have a margin of error
less than percent of the total number of rows in the table, where percent is the value
that you specify in the RESOLUTION clause in the MEDIUM mode. The default
percent value for MEDIUM mode is 2.5 percent. (For columns with HIGH mode
distributions, the default resolution is 0.5 percent.)
With the DISTRIBUTIONS ONLY option, you can execute UPDATE STATISTICS
MEDIUM at the table level or for the entire system because the overhead of the
extra columns is not large.
The database server uses the storage locations that the DBSPACETEMP
environment variable specifies only when you use the HIGH option of UPDATE
STATISTICS.
You can prevent UPDATE STATISTICS operations from using indexes when sorting
rows by setting the third parameter of the DBUPSPACE environment variable to a
value of 1.
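For example, the following setting (a sketch; the first two values, for disk space in kilobytes and the percentage of memory, are illustrative) sets the third parameter to 1:
export DBUPSPACE=1024:50:1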
For each table that your query accesses, build data distributions according to the
following guidelines. Also see the examples below the guidelines.
To build data distributions for each table that your query accesses:
1. Run a single UPDATE STATISTICS MEDIUM for all columns in a table that do
not head an index.
Use the default parameters unless the table is very large, in which case you
should use a resolution of 1.0 and confidence of 0.99.
2. Run the following UPDATE STATISTICS statement to create distributions for
non-index join columns and non-index filter columns:
UPDATE STATISTICS MEDIUM DISTRIBUTIONS ONLY;
3. Run UPDATE STATISTICS HIGH for all columns that head an index. For the
fastest execution time of the UPDATE STATISTICS statement, you must execute
one UPDATE STATISTICS HIGH statement for each column, as shown in the
example below this procedure.
4. If you have indexes that begin with the same subset of columns, run UPDATE
STATISTICS HIGH for the first column in each index that differs, as shown in
the second example below this procedure.
5. For each single-column or multi-column index on the table, run UPDATE STATISTICS LOW on the fully qualified index key to gather index statistics.
Because the statement constructs the statistics only once for each index, these steps
ensure that UPDATE STATISTICS executes rapidly.
Examples
Example of UPDATE STATISTICS HIGH statements for all columns that head
an index
Suppose you have a table t1 with columns a, b, c, d, e, and f with the
following indexes:
CREATE INDEX ix_1 ON t1 (a, b, c, d) ...
CREATE INDEX ix_3 ON t1 (f) ...
Run the following UPDATE STATISTICS statements for the columns that
head an index:
UPDATE STATISTICS HIGH FOR TABLE t1(a);
UPDATE STATISTICS HIGH FOR TABLE t1(f);
These UPDATE STATISTICS HIGH statements replace the distributions
created with the UPDATE STATISTICS MEDIUM statements with high
distributions for index columns.
Example of UPDATE STATISTICS HIGH statements for the first column in each
index that differs:
For example, suppose you have the following indexes on table t1:
CREATE INDEX ix_1 ON t1 (a, b, c, d) ...
CREATE INDEX ix_2 ON t1 (a, b, e, f) ...
CREATE INDEX ix_3 ON t1 (f) ...
Step 3 on page 13-14 executes UPDATE STATISTICS HIGH on column a
and column f. Then run UPDATE STATISTICS HIGH on columns c and e.
UPDATE STATISTICS HIGH FOR TABLE t1(c);
UPDATE STATISTICS HIGH FOR TABLE t1(e);
In addition, you can run UPDATE STATISTICS HIGH on column b, although this
is usually not necessary.
Related concepts:
“Virtual portion of shared memory” on page 4-2
Related reference:
UPDATE STATISTICS statement (SQL Syntax)
Important: If your table is very large, UPDATE STATISTICS with the HIGH mode
can take a long time to execute.
In this example, the join columns are the ssn fields in the employee and address
tables. The data distributions for both of these columns must accurately reflect the
actual data so that the optimizer can correctly determine the best join plan and
execution order.
You cannot use the UPDATE STATISTICS statement to create data distributions for
a table that is external to the current database. For additional information about
data distributions and the UPDATE STATISTICS statement, see the IBM Informix
Guide to SQL: Syntax.
Because information about the nature and use of a user-defined data type (UDT) is
not available to the database server, it cannot collect the colmin and colmax
columns of the syscolumns system catalog table for user-defined data types. To gather statistics for columns with user-defined data types, programmers must write statistics-collection functions.
Because the data distributions for user-defined data types can be large, you can
optionally store them in an sbspace instead of the sysdistrib system catalog table.
To print the data distributions for a column with a user-defined data type, use the
dbschema -hd option.
If you run UPDATE STATISTICS MEDIUM or HIGH, you can set the PDQ priority
to a value that is higher than 10. Because the UPDATE STATISTICS MEDIUM and
HIGH statements perform a large amount of sorting operations, increasing the
PDQ priority to a value that is higher than 10 provides additional memory that can improve the speed of the sorting operations.
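For example, you might raise the PDQ priority for the duration of the statement (50 is an illustrative value):
SET PDQPRIORITY 50;
UPDATE STATISTICS MEDIUM;
SET PDQPRIORITY DEFAULT;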
By default, when UPDATE STATISTICS statements run, the database server reads
all index leaf pages in sequence to gather statistics such as the number of leaf
pages, the number of unique lead key values, and cluster information. For a large
index this can take a long time. With sampling, the database server reads a fraction
of the index leaf pages (the sample) and then deduces index statistics based on
statistics gathered from the sample.
A possible trade-off for less time in gathering statistics is the accuracy of the
statistics gathered. If there are significant skews in the data distribution for the
lead index key, the sampling approach can result in a large error margin for the
statistics gathered, which in turn might affect optimizer decisions in query plan
generation.
For example, the following dbschema command produces a list of distributions for
each column of table customer in database vjp_stores with the number of values
in each bin, and the number of distinct values:
dbschema -hd customer -d vjp_stores
Figure 13-4 shows the data distributions for the column zipcode that this
dbschema -hd command produces. Because this column heads the zip_ix index,
UPDATE STATISTICS HIGH was run on it, as the following output line indicates:
High Mode, 0.500000 Resolution
Figure 13-4 shows 17 bins with one distinct zipcode value in each bin.
...
Distribution for virginia.customer.zipcode
Constructed on 09/18/2000
( 02135 )
1: ( 1, 1, 02135 )
2: ( 1, 1, 08002 )
3: ( 1, 1, 08540 )
4: ( 1, 1, 19898 )
5: ( 1, 1, 32256 )
6: ( 1, 1, 60406 )
7: ( 1, 1, 74006 )
8: ( 1, 1, 80219 )
9: ( 1, 1, 85008 )
10: ( 1, 1, 85016 )
11: ( 1, 1, 94026 )
12: ( 1, 1, 94040 )
13: ( 1, 1, 94085 )
14: ( 1, 1, 94117 )
15: ( 1, 1, 94303 )
16: ( 1, 1, 94304 )
17: ( 1, 1, 94609 )
--- OVERFLOW ---
1: ( 2, 94022 )
2: ( 2, 94025 )
3: ( 2, 94062 )
4: ( 3, 94063 )
5: ( 2, 94086 )
Figure 13-4. Data distributions for the zipcode column
The OVERFLOW portion of the output shows the duplicate values that might skew the distribution data, so dbschema moves them from the distribution bins to a separate overflow list.
For more information about the dbschema utility, see the IBM Informix Migration
Guide.
To improve the performance of a query, consider using some of the methods that
the following topics describe.
In addition:
v Consider using the CREATE INDEX ONLINE and DROP INDEX ONLINE
statements to create and drop an index in an online environment, when the
database and its associated tables are continuously available. These SQL
statements enable you to create and drop indexes without having an access lock
placed over the table during the duration of the index builds or drops. For more
information, see “Creating and dropping an index in an online environment” on
page 7-16.
v Set the BATCHEDREAD_INDEX configuration parameter to enable the
optimizer to automatically fetch a set of keys from an index buffer. This reduces
the number of times a buffer is read.
Related reference:
BATCHEDREAD_INDEX configuration parameter (Administrator's Reference)
If you perform the query occasionally, you can reasonably let the database server
build and discard an index.
The database server can use an index on columns a, b, and c (in that order) in the
following ways:
v To locate a particular row
The database server can use a composite index when the first filter is an equality
filter and subsequent columns have range (<, <=, >, >=) expressions. The
following examples of filters use the columns in a composite index:
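For example, given a composite index on columns a, b, and c, filters such as the following (with illustrative values) fit this pattern:
WHERE a = 5
WHERE a = 5 AND b < 10
WHERE a = 5 AND b = 7 AND c >= 3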
Execution is most efficient when you create a composite index with the columns in
order from most to least distinct. In other words, the column that returns the
highest count of distinct rows when queried with the DISTINCT keyword in the
SELECT statement should come first in the composite index.
If your application performs several long queries, each of which contains ORDER
BY or GROUP BY clauses, you can sometimes improve performance by adding
indexes that produce these orderings without requiring a sort. For example, the
following query sorts each column in the ORDER BY clause in a different direction:
SELECT * FROM t1 ORDER BY a, b DESC;
To avoid using temporary tables to sort column a in ascending order and column b
in descending order, you must create a composite index on (a, b DESC) or on (a
DESC, b). You need to create only one of these indexes because of the
bidirectional-traverse capability of the database server. For more information about
bidirectional traverse, see the IBM Informix Guide to SQL: Syntax.
On the other hand, it can be less expensive to perform a table scan and sort the
results instead of using the composite index when the following criteria are met:
v Your table is well ordered relative to your index.
v The number of rows that the query retrieves represents a large percentage of the
available data.
The fact table is generally large and contains the quantitative or factual information
about the subject. A dimensional table describes an attribute in the fact table.
Consider the example of a star schema with one fact table named orders and four
dimensional tables named customers, suppliers, products, and clerks. The orders
table describes the details of each sale order, which include the customer ID, supplier ID, product ID, and sales clerk ID.
The following query finds the total direct sales revenue in the Menlo Park region
(postal code 94025) for hard drives supplied by the Johnson supplier:
SELECT sum(orders.price)
FROM orders, customers, suppliers, product, clerks
WHERE orders.custid = customers.custid
AND customers.zipcode = 94025
AND orders.suppid = suppliers.suppid
AND suppliers.name = 'Johnson'
AND orders.prodid = product.prodid
AND product.type = 'hard drive'
AND orders.clerkid = clerks.clerkid
AND clerks.dept = 'Direct Sales'
This query uses a typical star join, in which the fact table joins with all
dimensional tables on a foreign key. Each dimensional table has a selective table
filter.
An optimal plan for the star join is to perform a Cartesian product on the four
dimensional tables and then join the result with the fact table. The following index
on the fact table (the index name is arbitrary) allows the optimizer to choose the optimal query plan:
CREATE INDEX ix_orders ON orders(custid, suppid, prodid, clerkid)
Without this index, the optimizer might choose to first join the fact table with a
single dimensional table and then join the result with the remaining dimensional
tables. The optimal plan provides better performance.
For more information about star schemas and snowflake schemas, see the IBM
Informix Database Design and Implementation Guide.
The B-tree scanner improves transaction processing for logged databases when
rows are deleted from a table with indexes. The B-tree scanner automatically
determines which index partitions will be cleaned, based on a priority list. B-tree
scanner threads remove deleted index entries and rebalance the index nodes. The
B-tree scanner automatically determines which index items are to be deleted.
The default setting for B-tree scanning provides the following type of scanning,
depending on your indexes:
v If the table has more than one attached index, the B-tree scanner uses the leaf
scan mode. Leaf scan mode is the only type of scanning possible with multiple
attached indexes.
Depending on your application and the order in which the system adds and
deletes keys from the index, the structure of an index can become inefficient.
The server treats a forest of trees index the same way it treats a B-tree index.
Therefore, in a logged database, you can control how the B-tree scanner threads
remove deletions from both forest of trees and B-tree indexes.
The following table summarizes the differences between the scan modes.
Table 13-3. Scan modes for B-tree scanner threads

Scan Mode: Leaf scan mode
Description: In this mode, the leaf level of an attached index is completely scanned for deleted items.
Performance Advantages or Issues: This mode is only possible on attached indexes and is the only mode the server can use if more than one attached index exists in a partition.
More Information: "Leaf and range scan mode settings" on page 13-27
For more information about the BTSCANNER configuration parameter and for
more information about how the database server maintains an index tree, see the
chapter on configuration parameters and the chapter on disk structure and storage
in the IBM Informix Administrator's Reference.
Use the onmode -C option to change the configuration of B-tree scanners during
runtime.
For more information about onstat -C and onmode -C, see the IBM Informix
Administrator's Reference.
When you set alice mode, the higher the mode, the more memory is used per
index partition. However, the memory used is not a huge amount. The advantage
is less I/O, as shown in the following table.
Table 13-4. Alice mode settings
Alice Mode Setting Memory or Block I/O
0 Turns off alice scanning.
1 Uses exactly 8 bytes of memory (no adjusting).
When you set the alice mode, you need to consider memory usage versus I/O. The
lower the alice mode setting, the less memory the index will use. The higher the
alice mode setting, the more memory the index will use. 12 is the highest mode
value, because it is a direct mapping of a single bit of memory to each instance of
I/O.
Suppose you have an online page size of 2 KB and the default B-Tree Scanner I/O
size of 256 pages. If you set the alice mode to 6, each byte of memory can
represent 131,072 index pages (256 MB). If you set the mode to 10, each byte of
memory can represent 8,192 index pages (16 MB). Thus, changing the mode setting
from 6 to 10 requests 16 times the memory, but requires 16 times less I/O.
If you have an index partition that uses 1 GB (524,288 pages of 2 KB), then an
alice mode setting of 6 would take 4 bytes of memory, while an alice mode setting
of 10 would consume 64 bytes of memory, as shown in this formula:
memory in bytes = index pages / ( {mode block size} I/Os per bit * 8 bits per byte * 256 pages per I/O )
Here {mode block size} is the number of I/O units that each bit of memory represents: 64 I/Os per bit at mode 6 (524,288 / 131,072 = 4 bytes) and 4 I/Os per bit at mode 10 (524,288 / 8,192 = 64 bytes).
Setting the alice mode to a value between 3 and 12 sets the initial amount of
memory that is used for index cleaning. Subsequently, the B-tree scanners
automatically adjust the mode based on the efficiency of past cleaning operations.
For example, if after five scans (by default), the I/O efficiency is below 75 percent,
the server automatically adjusts to the next alice mode if you set the mode to a
value above 2. For example, if an index is currently operating in alice mode 6, a
B-tree scanner has cleaned the index at least 5 times, and the I/O efficiency is
below 75 percent, the server automatically adjusts to mode 7, the next higher
mode. This doubles the memory required, but reduces the I/O by a factor of 2.
The server will re-evaluate the index after five more scans to determine the I/O
efficiency again, and will continue to do this until mode 12. The server stops
making adjustments at mode 12.
If you decide to enable range scan mode when a single index exists in the
partition, set the rangesize option of the BTSCANNER configuration parameter to the
minimum size that a partition must have to be scanned using this mode. Specify
the size in kilobytes.
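For example, the following ONCONFIG line (all values are illustrative) starts four B-tree scanner threads and enables range scanning for index partitions of 100 KB or larger:
BTSCANNER num=4,threshold=5000,rangesize=100,alice=6,compression=default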
B-tree scanner threads look for index pages that can be compressed because the amount of data on them falls below the specified compression level. The B-tree scanner can compress index pages with
deleted items and pages that do not have deleted items.
By default, a B-tree scanner compresses at the medium level. The following table
provides information about the performance benefits and trade-offs if you change
the compression level to high or low.
Table 13-5. B-Tree Scanner Compression Level Benefits and Trade-offs

Compression Level: Low
Performance Benefits and Trade-offs: The low compression level is beneficial for an index that is expected to grow quickly, with frequent B-tree node splits. When the compression level is set to low, the B-tree index will not require as many splits as indexes with medium or high compression levels, because more free space remains in the B-tree nodes.
When to Use: You might want to change the compression level to low if you expect an index to grow quickly with frequent splits.
If you do not need to change the compression level to high or low, set the
compression option of the BTSCANNER configuration parameter to med or
default.
In addition to the compression option that specifies when to attempt to join two
partially used pages, you can use the FILLFACTOR configuration parameter to
control when to add new index pages. The index fill factor, which you define with
the FILLFACTOR configuration parameter or the FILLFACTOR option of the
CREATE INDEX statement, is a percentage of each index page that will be filled
during the index build.
Prerequisites:
v Determine if adjusting the level for index compression will improve
performance.
v Get statistics on the number of rows read, deleted, and inserted by running the
onstat -g ppf command. You can also view information in the sysptprof table.
v Analyze the statistics to determine if you want to change the threshold.
For information about compression levels and the circumstances under which you
might want to change the level, see “B-tree scanner index compression levels and
transaction processing performance” on page 13-27.
Specify the compression level for an instance with any of the following options:
v Set the compression field of the BTSCANNER configuration parameter to low,
med (medium), high, or default. (The system default value is med.)
Examples
Run either of the following SQL administration API functions to set the
compression level for a single fragment of the index that has the partition number
1048960:
EXECUTE FUNCTION TASK("SET INDEX COMPRESSION", 1048960, "DEFAULT");
EXECUTE FUNCTION ADMIN("SET INDEX COMPRESSION", 1048960, "LOW");
Run the following SELECT statement to execute the task function over all index
fragments. This command sets the compression level for all fragments of an index
named idx1 in a database named dbs1.
SELECT sysadmin:TASK("SET INDEX COMPRESSION", partnum, "MED")
FROM sysmaster:systabnames
WHERE dbsname = 'dbs1' AND tabname = 'idx1';
You can also run the following SELECT TASK statement to execute the task
function over all index fragments and set the compression level for all fragments.
SELECT TASK("SET INDEX COMPRESSION", partn, "MED")
FROM dbs1:systables t, dbs1:sysfragments f
WHERE f.tabid = t.tabid AND f.fragtype = 'I' AND indexname = 'idx1';
If your table has relatively low update activity and a large amount of free space
exists, you might want to drop and re-create the index with a larger value for
FILLFACTOR to make the unused disk space available.
For an example of this higher estimated cost, see “The query plan of a distributed
query” on page 13-30.
The server uses the following factors to determine the buffer size:
v The row size
The database server calculates the row size by summing the average move size
(if available) or the length (from the syscolumns system catalog table) of the
columns.
v The setting of the FET_BUF_SIZE environment variable on the client
You might be able to reduce the size and number of data transfers by using the
FET_BUF_SIZE environment variable to increase the size of the buffer that the
database server uses to send and receive rows to and from the remote database
server.
The minimum buffer size is 1024 or 2048 bytes, depending on the row size. If
the row size is larger than either 1024 or 2048 bytes, the database server uses the
FET_BUF_SIZE value.
For more information about the FET_BUF_SIZE environment variable, see the
IBM Informix Guide to SQL: Reference.
The following figure shows the chosen query plan for the distributed query.
QUERY:
------
select l.customer_num, l.lname, l.company,
l.phone, r.call_dtime, r.call_descr
from customer l, vjp_stores@gilroy:cust_calls r
where l.customer_num = r.customer_num
Estimated Cost: 9
Estimated # of Rows Returned: 7
Figure 13-5. Selected Output of SET EXPLAIN ALL for Distributed Query, Part 3
The following table shows the main differences between the chosen query plans for
the distributed join and the local join.
Output Line in Figure 13-5 for Distributed Query: vjp_stores@gilroy:virginia.cust_calls
Output Line in Figure 11-1 on page 11-9 for Local-Only Query: informix.cust_calls
Description of Difference: The remote table name is prefaced with the database and server names.
Sequential access to a table other than the first table in the plan is ominous because
it threatens to read every row of the table once for every row selected from the
preceding tables.
If the table is small, it is harmless to read it repeatedly because the table resides
completely in memory. Sequential search of an in-memory table can be faster than
searching the same table through an index, especially if maintaining those index
pages in memory pushes other useful pages out of the buffers.
When the table is larger than a few pages, however, repeated sequential access
produces poor performance. One way to prevent this problem is to provide an
index to the column that is used to join the table.
Any user with the Resource privilege can build additional indexes. Use the
CREATE INDEX statement to make an index.
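For example, the following statement (the index name is arbitrary) builds an index on the join column of the rcvbles table that appears in the scenario below:
CREATE INDEX ix_rcvbles_cust ON rcvbles(customer_id);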
An index consumes disk space proportional to the width of the key values and the
number of rows. (See “Estimating index pages” on page 7-4.) Also, the database
server must update the index whenever rows are inserted, deleted, or updated; the
index update slows these operations. If necessary, you can use the DROP INDEX
statement to release the index after a series of queries, which frees space and
makes table updates easier.
When view folding is enabled, views are folded into a parent query. Because the
views are folded into the parent query, the query results are not placed in a
temporary table.
View folding does not occur for the following types of queries that perform a
UNION ALL operation involving a view:
v The view has one of the following clauses: AGGREGATE, GROUP BY, ORDER
BY, UNION, DISTINCT, or OUTER JOIN (either Informix or ANSI type).
v The parent query has a UNION or UNION ALL clause.
In these situations, a temporary table is created to hold query results.
The following suggestions can help you rewrite your query more efficiently:
v Avoid or simplify sort operations.
v Use parallel sorts.
v Use temporary tables to reduce sorting scope.
The sort algorithm is highly tuned and extremely efficient. It is as fast as any
external sort program that you might apply to the same data. You do not need to
avoid infrequent sorts or sorts of relatively small numbers of output rows.
However, you should try to avoid or reduce the scope of repeated sorts of large
tables. The optimizer avoids a sort step whenever it can use an index to produce
the output in its proper order automatically. The following factors prevent the
optimizer from using an index:
v One or more of the ordered columns is not included in the index.
v The columns are named in a different sequence in the index and the ORDER BY
or GROUP BY clause.
v The ordered columns are taken from different tables.
For another way to avoid sorts, see “Use temporary tables to reduce sorting scope”
on page 13-33.
If a sort is necessary, look for ways to simplify it. As discussed in “Sort-time costs”
on page 10-24, the sort is quicker if you can sort on fewer or narrower columns.
Related concepts:
“Ordering with fragmented indexes” on page 13-37
When PDQ priority is greater than 0 and PSORT_NPROCS is greater than 1, the
query benefits both from parallel sorts and from PDQ features such as parallel
scans and additional memory. Users can use the PDQPRIORITY environment
variable to request a specific proportion of PDQ resources for a query. You can use
the MAX_PDQPRIORITY configuration parameter to limit the number of such user
requests. For more information, see “Limiting PDQ resources in queries” on page
3-11.
In some cases, the amount of data being sorted can overflow the memory resources
allocated to the query, resulting in I/O to a dbspace or sort file. For more
information, see “Configure dbspaces for temporary tables and sort files” on page
5-8.
This query reads the entire cust table. For every row with the specified postal
code, the database server searches the index on rcvbles.customer_id and performs
a nonsequential disk access for every match. The rows are written to a temporary
file and sorted. For more information about temporary files, see “Configure
dbspaces for temporary tables and sort files” on page 5-8.
This procedure is acceptable if the query is performed only once, but this example
includes a series of queries, each incurring the same amount of work.
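For example, you might first select the qualifying rows into a temporary table (a sketch; the column list is illustrative):
SELECT cust.name, cust.postcode, rcvbles.balance
FROM cust, rcvbles
WHERE cust.customer_id = rcvbles.customer_id
AND rcvbles.balance > 0
INTO TEMP cust_with_balance;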
You can then execute queries against the temporary table, as the following example
shows:
SELECT *
FROM cust_with_balance
WHERE postcode LIKE '98___'
ORDER BY name
How you configure the amount of memory that is available for a query depends
on whether or not the query is a Parallel Database Query (PDQ).
If the PDQ priority is set to 0 (zero), you can change the amount of memory that is
available for a query that is not a PDQ query by changing the setting of the
DS_NONPDQ_QUERY_MEM configuration parameter. You can only use this
parameter if the PDQ priority is set to zero. Its setting has no effect if the PDQ
priority is greater than zero.
For example, if you use the onmode utility, specify a value as shown in the
following example:
onmode -wf DS_NONPDQ_QUERY_MEM=500
If Informix changes the value that you set, the server sends a message that reports the recalculated value.
For formulas for estimating the amount of additional space to allocate for hash
joins, see “Estimating temporary space for dbspaces and hash joins” on page 5-12.
The Memory Grant Manager (MGM) component of Informix coordinates the use of
memory, CPU virtual processors (VPs), disk I/O, and scan threads among
decision-support queries. The MGM uses the DS_MAX_QUERIES,
DS_TOTAL_MEMORY, DS_MAX_SCANS, and MAX_PDQPRIORITY configuration
parameter settings to determine the quantity of these PDQ resources that can be allocated to decision-support queries.
Optimization level
You normally obtain optimum overall performance with the default optimization
level, HIGH. The time that it takes to optimize the statement is usually
unimportant. However, if experimentation with your application reveals that your
query is still taking too long, you can set the optimization level to LOW.
If you change the optimization level to LOW, check the SET EXPLAIN output to
see if the optimizer chose the same query plan as before.
To specify a HIGH or LOW level of database server optimization, use the SET
OPTIMIZATION statement.
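For example, the following statement lowers the optimization level for the current session:
SET OPTIMIZATION LOW;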
Related reference:
SET OPTIMIZATION statement (SQL Syntax)
Optimization goals
Optimizing total query time and optimizing user-response time are two
optimization goals for improving query performance.
Total query time is the time it takes to return all rows to the application. Total
query time is most important for batch processing or for queries that require all
rows be processed before returning a result to the user, as in the following query:
SELECT count(*) FROM orders
WHERE order_amount > 2000;
User-response time is the time that it takes for the database server to return a
screen full of rows back to an interactive application. In interactive applications,
only a screen full of data can be requested at one time. For example, the user
application can display only 10 rows at one time for the following query:
SELECT * FROM orders
WHERE order_amount > 2000;
Which optimization goal is more important can have an effect on the query path
that the optimizer chooses. For example, the optimizer might choose a nested-loop
join instead of a hash join to execute a query if user-response time is most
important, even though a hash join might result in a reduction in total query time.
The default behavior is for the optimizer to choose query plans that optimize the
total query time. You can specify optimization of user-response time at several
different levels:
v For the database server system, by setting the OPT_GOAL configuration parameter
v For the user environment, by setting the OPT_GOAL environment variable
v Within a session, by executing the SET OPTIMIZATION statement
v For individual queries, by using the FIRST_ROWS or ALL_ROWS optimizer directives
For example, optimizer directives take precedence over the goal that the SET
OPTIMIZATION statement specifies.
The following sections explain some of the possible differences in query plans.
Hash joins generally have a higher cost to retrieve the first row than nested-loop
joins do. The database server must build the hash table before it retrieves any
rows. However, in some cases, total query time is faster if the database server uses
a hash join.
In the following example, tab2 has an index on col1, but tab1 does not have an
index on col1. When you execute SET OPTIMIZATION ALL_ROWS before you
run the query, the database server uses a hash join and ignores the existing index,
as the following portion of SET EXPLAIN output shows:
QUERY:
------
SELECT * FROM tab1,tab2
WHERE tab1.col1 = tab2.col1
Estimated Cost: 125
Estimated # of Rows Returned: 510
1) lsuto.tab2: SEQUENTIAL SCAN
2) lsuto.tab1: SEQUENTIAL SCAN
DYNAMIC HASH JOIN
Dynamic Hash Filters: lsuto.tab2.col1 = lsuto.tab1.col1
However, when you execute SET OPTIMIZATION FIRST_ROWS before you run
the query, the database server uses a nested-loop join. The clause (FIRST_ROWS
OPTIMIZATION) in the following partial SET EXPLAIN output shows that the
optimizer used user-response-time optimization for the query:
QUERY: (FIRST_ROWS OPTIMIZATION)
------
SELECT * FROM tab1,tab2
WHERE tab1.col1 = tab2.col1
Estimated Cost: 145
Estimated # of Rows Returned: 510
1) lsuto.tab1: SEQUENTIAL SCAN
2) lsuto.tab2: INDEX PATH
(1) Index Keys: col1
Lower Index Filter: lsuto.tab2.col1 = lsuto.tab1.col1
NESTED LOOP JOIN
In cases where the database server returns a large number of rows from a table,
the lower-cost option for the total-query-time goal might be to scan the table
instead of using an index. However, to retrieve the first row, the lower-cost option
for the user-response-time goal might be to use the index to access the table.
When an index is not fragmented, the database server can use the index to avoid a
sort. However, when an index is fragmented, the ordering can be guaranteed only
within the fragment, not between fragments.
Usually, the least expensive option for the total-query-time goal is to scan the
fragments in parallel and then use the parallel sort to produce the proper ordering.
However, this option does not favor the user-response-time goal.
Instead, if the user-response time is more important, the database server reads the
index fragments in parallel and merges the data from all of the fragments. No
additional sort is generally needed.
In addition, programmers can write the following functions or UDRs to help the
optimizer create an efficient query plan for your queries:
v Parallel UDRs that can take advantage of parallel database queries
v User-defined selectivity functions that calculate the expected fraction of rows
that qualify for the function
v User-defined cost functions that calculate the expected relative cost to execute a
user-defined routine
v User-defined statistical functions that the UPDATE STATISTICS statement can
use to generate statistics and data distributions
v User-defined negator functions to allow more choices for the optimizer
Parallel UDRs
One way to execute UDRs is in an expression in a query. You can take advantage
of parallel execution if the UDR is in an expression in the query.
For parallel execution, the UDR must be in one of the following parts of a query:
v WHERE clause
v SELECT list
v GROUP BY list
v Overloaded comparison operator
v User-defined aggregate
v HAVING clause
v Select list for a parallel insertion statement
v Generic B-tree index scan on multiple index fragments if the compare function
used in the B-tree index scan is parallelizable
By default, a UDR does not run in parallel. To enable parallel execution of UDRs,
you must take the following actions:
v Specify the PARALLELIZABLE modifier in the CREATE FUNCTION or ALTER
FUNCTION statement.
v Ensure that the UDR does not call functions that are not PDQ thread-safe.
v Turn on PDQ priority.
v Use the UDR in a parallel database query.
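For example, the following hypothetical declaration registers an external C routine as parallelizable (the argument type, shared-object path, and additional NOT VARIANT modifier are illustrative):
CREATE FUNCTION get_x1(im BLOB) RETURNING BOOLEAN
WITH (PARALLELIZABLE, NOT VARIANT)
EXTERNAL NAME '/usr/lib/udr/image.so(get_x1)'
LANGUAGE C;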
The following example shows how you can place a UDR in an SQL statement:
SELECT * FROM image
WHERE get_x1(image.im2) and get_x2(image.im1)
The optimizer cannot accurately evaluate the cost of executing a UDR without
additional information. You can provide the cost and selectivity of the function to
the optimizer. The database server uses cost and selectivity together to determine
the best path. For more information about selectivity, see “Filters with user-defined
routines” on page 13-2.
In the previous example, the optimizer cannot determine which function to execute
first, the get_x1 function or the get_x2 function. If a function is expensive to
execute, the DBA can assign the function a larger cost or selectivity, which can
influence the optimizer to change the query plan for better performance. In the
previous example, if get_x1 costs more to execute, the DBA can assign a higher
cost to the function, which can cause the optimizer to execute the get_x2 function
first.
You can add the following routine modifiers to the CREATE FUNCTION statement
to change the cost or selectivity that the optimizer assigns to the function:
v selfunc=function_name
v selconst=integer
v costfunc=function_name
v percall_cost=integer
For more information about cost or selectivity modifiers, see the IBM Informix
User-Defined Routines and Data Types Developer's Guide.
The database server runs the statistics collection function when you execute
UPDATE STATISTICS.
For more information about the importance of updating statistics, see “Statistics
held for the table and index” on page 10-20. For information about improving
performance, see “Updating statistics for columns with user-defined data types” on
page 13-16.
Negator functions
A negator function takes the same arguments as its companion function, in the same
order, but returns the Boolean complement. That is, if a function returns TRUE for a
given set of arguments, its negator function returns FALSE when passed the same
arguments, in the same order.
In certain cases, the database server can process a query more efficiently if the
sense of the query is reversed. That is, “Is x greater than y?” changes to “Is y less
than or equal to x?”
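For example, the following hypothetical declaration registers lessthanorequal as the negator of greaterthan for a UDT named point (all names, the argument type, and the path are illustrative):
CREATE FUNCTION greaterthan(x point, y point) RETURNING BOOLEAN
WITH (NEGATOR = lessthanorequal)
EXTERNAL NAME '/usr/lib/udr/point.so(gt_point)'
LANGUAGE C;
If reversing the sense of the comparison produces a more efficient plan, the database server can invoke lessthanorequal in place of greaterthan.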
The database server can store the optimized SQL statement in the virtual portion
of shared memory, in an area that is called the SQL statement cache. The SQL
statement cache (SSC) can be accessed by all users, and it allows users to bypass
the optimize step before they run the query. This capability can result in the
following significant performance improvements:
v Reduced response times when users are running the same SQL statements.
SQL statements that take longer to optimize (usually because they include many
tables and many filters in the WHERE clause) run faster from the SQL statement
cache because the database server does not optimize the statement.
v Reduced memory usage because the database server shares query data
structures among users.
Memory reduction with the SQL statement cache is greater when a statement
has many column names in the select list.
For more information about the effect of the SQL statement cache on the
performance of the overall system, see “Monitor and tune the SQL statement
cache” on page 4-25.
This kind of application benefits from use of the SQL statement cache because
users are likely to find the SQL statements in the SQL statement cache.
The database server does not consider the following SQL statements exact matches
because they contain different literal values in the WHERE clause:
SELECT * FROM customer, orders
WHERE customer.customer_num = orders.customer_num
AND order_date > "01/01/07"
SELECT * FROM customer, orders
WHERE customer.customer_num = orders.customer_num
AND order_date > "01/01/2007"
Performance does not improve with the SQL statement cache in the following
situations:
v If a report application is run once nightly, and it executes SQL statements that
no other application uses, it does not benefit from use of the statement cache.
v If an application prepares a statement and then executes it many times,
performance does not improve with the SQL statement cache because the
statement is optimized just once during the PREPARE statement.
When a statement contains host variables, the database server replaces the host
variables with placeholders when it stores the statement in the SQL statement
cache. Therefore, the statement is optimized without the database server having
access to the values of the host variables. In some cases, if the database server had
access to the values of the host variables, the statement might be optimized
differently, usually because the distributions stored for a column inform the
optimizer exactly how many rows pass the filter.
If an SQL statement that contains host variables performs poorly with the SQL
statement cache turned on, try flushing the SQL statement cache with the onmode
-e flush command and running the query with values that are more frequently
used across multiple executions of the query. When you flush the cache, the
database server reoptimizes the query and generates a query plan that is optimized
for these frequently used values.
Important: The database server flushes an entry from the SQL statement cache
only if it is not in use. If an application prepares the statement and keeps it, the
entry is still in use. In this case, the application needs to close the statement before
the flush is beneficial.
To enable the SQL statement cache, set the STMT_CACHE configuration parameter
to a value that defines either of the following modes:
v Always use the SQL statement cache unless a user explicitly specifies do not use
the cache.
v Use the SQL statement cache only when a user explicitly specifies use it.
For more information, see “Enabling the SQL statement cache.” For more
information about the STMT_CACHE configuration parameter, see the IBM
Informix Administrator's Reference.
Use one of the following methods to change this STMT_CACHE default value:
v Update the ONCONFIG file to specify the STMT_CACHE configuration
parameter and restart the database server.
If you set the STMT_CACHE configuration parameter to 1, the database server
uses the SQL statement cache for an individual user when the user sets the
STMT_CACHE environment variable to 1 or executes the SET STATEMENT
CACHE ON statement within an application.
STMT_CACHE 1
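An individual user can then turn on the cache for a session by executing the following statement in an application:
SET STATEMENT CACHE ON;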
If the STMT_CACHE configuration parameter is 2, the database server stores
SQL statements for all users in the SQL statement cache except when individual
users turn off the feature with the STMT_CACHE environment variable or the
SET STATEMENT CACHE OFF statement.
STMT_CACHE 2
v Use the onmode -e command to override the STMT_CACHE configuration
parameter dynamically.
If you use the enable keyword, the database server uses the SQL statement
cache for an individual user when the user sets the STMT_CACHE environment
variable to 1 or executes the SET STATEMENT CACHE ON statement within an
application.
onmode -e enable
If you use the on keyword, the database server stores SQL statements for all
users in the SQL statement cache except when individual users turn off the
feature with the STMT_CACHE environment variable or the SET STATEMENT
CACHE OFF statement.
onmode -e on
The following table summarizes the use of the SQL statement cache, which
depends on the setting of the STMT_CACHE configuration parameter (or the
execution of onmode -e) and the use in an application of the STMT_CACHE
environment variable and the SET STATEMENT CACHE statement.
For a complete list of the exceptions and a list of requirements for an exact match,
see SET STATEMENT CACHE in the IBM Informix Guide to SQL: Syntax.
You obtain memory information by identifying the SQL statements that use a large
amount of memory.
When the session shares the memory structures in the SSC, the value in the used
memory column should be lower than when the cache is turned off. For example,
Figure 13-6 on page 13-44 shows sample onstat -u output when the SQL statement cache is not enabled.
Figure 13-6. onstat -u Output when the SQL statement cache is not enabled
Figure 13-7. onstat -u Output when the SQL statement cache is enabled
Figure 13-7 also shows the memory allocated and used for Session 16, which
executes the same SQL statements as Session 4. Session 16 allocates less total
memory (40960) and uses less memory (38784) than Session 4 (Figure 13-6 shows
53248 and 45656, respectively) because it uses the existing memory structures in
the SQL statement cache.
The following onstat -g ses session-id output columns display memory usage:
v The Memory pools portion of the output
– The totalsize column shows the number of bytes currently allocated
– The freesize column shows the number of unallocated bytes
v The last line of the output shows the number of bytes allocated from the
sscpool.
Figure 13-8 on page 13-45 shows that Session 16 has currently allocated 69632
bytes, of which 11600 bytes are allocated from the sscpool.
...
Sess SQL Current Iso Lock SQL ISAM F.E.
Id Stmt type Database Lvl Mode ERR ERR Vers
14 SELECT vjp_stores CR Not Wait 0 0 9.03
The following figure shows that onstat -g sql session-id displays the same
information as the bottom portion of the onstat -g ses session-id command in
Figure 13-8, which includes the number of bytes allocated from the sscpool.
The following figure displays the output of onstat -g stm session-id for the same
session (14) as in onstat -g ses session-id in Figure 13-8 on page 13-45 and onstat -g
sql session-id in Figure 13-9.
When the SQL statement cache (SSC) is on, the database server creates the heaps in
the SSC pool. Therefore, the heapsz output field in Figure 13-10 shows that this
SQL statement uses 10056 bytes, which is contained within the 11600 bytes in the SSC pool that onstat -g sql 14 shows.
onstat -g stm 14
session 14 ---------------------------------------------------------------
sdblock heapsz statement ('*' = Open cursor)
aa11018 10056 *SELECT C.customer_num, O.order_num
FROM customer C, orders O, items I
WHERE C.customer_num = O.customer_num
AND O.order_num = I.order_num
The database server drops an entry from the cache when one of the objects that the
query depends on is altered so that it invalidates the data dictionary cache entry
for the query. For example, running UPDATE STATISTICS on a table that the query references causes a dependency check failure.
When an entry is marked as dropped or deleted, the database server must reparse
and reoptimize the SQL statement the next time it executes. For example,
Figure 13-11 shows the entries that the onstat -g ssc command displays after
UPDATE STATISTICS was executed on the items and orders tables between the
execution of the first and second SQL statements.
The Statement Cache Entries portion of the onstat -g ssc output in Figure 13-11
displays a flag field that indicates whether or not an entry has been dropped or
deleted from the SQL statement cache.
v The first entry has a flag column with the value DF, which indicates that the
entry is fully cached, but is now dropped because its entry was invalidated.
v The second entry has the same statement text as the third entry, which indicates
that it was reparsed and reoptimized when it was executed after the UPDATE
STATISTICS statement.
onstat -g ssc
...
Statement Cache Entries:
Figure 13-11. Sample onstat -g ssc command output for a dropped entry
Some of the information that you can monitor for sessions and threads allows you
to determine if an application is using a disproportionate amount of the resources.
Use the following onstat utility commands to monitor sessions and threads:
v onstat -u
v onstat -g ath
v onstat -g act
v onstat -g cpu
v onstat -g ses
v onstat -g mem
v onstat -g stm
Active threads include threads that belong to user sessions, as well as some that
correspond to database server daemons (for example, page cleaners). Figure 13-12
on page 13-49 shows an example of onstat -u output.
Also use the onstat -u command to determine if a user is waiting for a resource or
holding too many locks, or to get an idea of how much I/O the user has
performed.
If you execute onstat -u while the database server is performing fast recovery,
several database server threads might appear in the display.
Related reference:
onstat -u command: Print user activity profile (Administrator's Reference)
The onstat -g ath command display does not include the session ID (because not
all threads belong to sessions).
The status field contains information about the status of a thread, such as running,
cond wait, IO Idle, sleeping secs: number_of_seconds, or sleeping forever. The
following output example identifies many threads as sleeping forever. To improve
performance, you can remove or reduce the number of threads that are identified
as sleeping forever.
Threads that a primary decision-support thread started have a name that indicates
their role in the decision-support query. The following figure shows four scan
threads that belong to a decision-support thread.
Threads:
tid tcb rstcb prty status vp-class name
11 994060 0 4 sleeping(Forever) 1cpu kaio
12 994394 80f2a4 2 sleeping(secs: 51) 1cpu btclean
26 99b11c 80f630 4 ready 1cpu onmode_mon
32 a9a294 812b64 2 ready 1cpu sqlexec
113 b72a7c 810b78 2 ready 1cpu sqlexec
114 b86c8c 81244c 2 cond wait(netnorm) 1cpu sqlexec
115 b98a7c 812ef0 2 cond wait(netnorm) 1cpu sqlexec
116 bb4a24 80fd48 2 cond wait(netnorm) 1cpu sqlexec
117 bc6a24 81161c 2 cond wait(netnorm) 1cpu sqlexec
118 bd8a24 811290 2 ready 1cpu sqlexec
119 beae88 810f04 2 cond wait(await_MC1) 1cpu scan_1.0
120 a8ab48 8127d8 2 ready 1cpu scan_2.0
121 a96850 810460 2 ready 1cpu scan_2.1
122 ab6f30 8119a8 2 running 1cpu scan_2.2
Figure 13-14. onstat -g ath output showing scan threads belonging to a decision-support
thread
Related concepts:
“Improve connection performance and scalability” on page 3-16
Related reference:
onstat -g ath command: Print information about all threads (Administrator's
Reference)
The following output example shows the ID and name of each thread that is
running, the ID of the virtual processor in which each thread is running, the day
and time when each thread last ran, how much CPU time each thread used, the
number of times each thread was scheduled to run, and the status of each thread.
Related reference:
onstat -g cpu: Print runtime statistics (Administrator's Reference)
For example, in Figure 13-16 on page 13-52, session number 49 is running five
threads for a decision-support query.
Related reference:
onstat -g ses command: Print session-related information (Administrator's
Reference)
You can determine which session to focus on by the used memory column of the
onstat -g ses output.
Figure 13-17 on page 13-53 shows sample onstat -g ses output and some of the
onstat -g mem and onstat -g stm output for Session 16.
v The output of the onstat -g mem command shows the total amount of memory
used by each session.
The totalsize column of the onstat -g mem 16 output shows the total amount of
memory allocated to the session.
v The output of the onstat -g stm command shows the portion of the total
memory allocated to the current prepared SQL statement.
The heapsz column of the onstat -g stm 16 output in the following figure shows
the amount of memory allocated for the current prepared SQL statement.
onstat -g mem 16
Pool Summary:
name class addr totalsize freesize #allocfrag #freefrag
16 V a9ea020 90112 10608 159 5
...
onstat -g stm 16
session 16 ---------------------------------------------------------------
sdblock heapsz statement ('*' = Open cursor)
aa0d018 10056 *SELECT C.customer_num, O.order_num
FROM customer C, orders O, items I
WHERE C.customer_num = O.customer_num
AND O.order_num = I.order_num
Figure 13-17. onstat -g mem and onstat -g stm to determine session memory
Related reference:
onstat -g lap command: Print light appends status information (Administrator's
Reference)
onstat -g mem command: Print pool memory statistics (Administrator's
Reference)
Choose User from the Status menu. The following information appears:
v The session ID
v The user ID
v The number of locks that the thread is holding
v The number of read calls and write calls that the thread has executed
v Flags that indicate the present state of the thread (for example, waiting for a
buffer or waiting for a checkpoint), whether the thread is the primary thread for
a session, and what type of thread it is (for example, user thread, daemon
thread, and so on)
0 informix 0 96 2 ------D
0 informix 0 0 0 ------F
0 informix 0 0 0 -------
15 informix 0 0 0 Y-----M
0 informix 0 0 0 ------D
17 chrisw 1 3 34 Y------
Figure 13-18. Output from the User option of the ON-Monitor Status menu
In addition, some columns contain flags that show the following information:
v Whether the primary thread of the session is waiting for a latch, lock, log buffer,
or transaction
v Whether the thread is in a critical section
Query the syssesprof table to obtain a profile of the activity of a session. This table
contains a row for each session with columns that store statistics on session activity
(for example, number of locks held, number of row writes, number of commits,
number of deletes, and so on).
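For example, a query of the following form returns the lock-request, commit, and
delete counts for session 16. The syssesprof table resides in the sysmaster
database; the column names shown here are typical, but verify them against the
sysmaster schema for your server version:
DATABASE sysmaster;
SELECT sid, lockreqs, iscommits, isdeletes
   FROM syssesprof
   WHERE sid = 16;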
ISA uses information that the following onstat command-line options generate to
display session information, as the following table shows. Click the Refresh button
to rerun the onstat command and display fresh information.
The onstat -x output contains the following information for each open transaction:
v The address of the transaction structure in shared memory
v Flags that indicate the following information:
– The present state of the transaction (user thread attached, suspended, waiting
for a rollback)
– The mode in which the transaction is running (loosely coupled or tightly
coupled)
– The stage that the transaction is in (BEGIN WORK, prepared to commit,
committing or committed, rolling back)
– The nature of the transaction (global transaction, coordinator, subordinate,
both coordinator and subordinate)
v The thread that owns the transaction
v The number of locks that the transaction holds
v The logical-log file in which the BEGIN WORK record was logged
v The current logical-log id and position
v The isolation level
v The number of attempts to start a recovery thread
v The coordinator for the transaction (if the subordinate is executing the
transaction)
v The maximum number of concurrent transactions since you last started the
database server
Figure 13-19 shows sample output from onstat -x. The last transaction listed is a
global transaction, as the G value in the fifth position of the flags column indicates.
The T value in the second position of the flags column indicates that the
transaction is running in tightly coupled mode.
Transactions
address flags userthread locks beginlg curlog logposit isol retrys coord
ca0a018 A---- c9da018 0 0 5 0x18484c COMMIT 0
ca0a1e4 A---- c9da614 0 0 0 0x0 COMMIT 0
ca0a3b0 A---- c9dac10 0 0 0 0x0 COMMIT 0
ca0a57c A---- c9db20c 0 0 0 0x0 COMMIT 0
ca0a748 A---- c9db808 0 0 0 0x0 COMMIT 0
ca0a914 A---- c9dbe04 0 0 0 0x0 COMMIT 0
ca0aae0 A---- c9dcff8 1 0 0 0x0 COMMIT 0
ca0acac A---- c9dc9fc 1 0 0 0x0 COMMIT 0
ca0ae78 A---- c9dc400 1 0 0 0x0 COMMIT 0
ca0b044 AT--G c9dc9fc 0 0 0 0x0 COMMIT 0
10 active, 128 total, 10 maximum concurrent
The output in Figure 13-19 shows that this transaction branch is holding 13 locks.
When a transaction runs in tightly coupled mode, the branches of this transaction
share locks.
To find the relevant locks, match the address in the userthread column in onstat -x
output to the address in the owner column of onstat -k output.
onstat -x
Transactions
address flags userthread locks beginlg curlog logposit isol retrys coord
a366018 A---- a334018 0 0 1 0x22b048 COMMIT 0
a3661f8 A---- a334638 0 0 0 0x0 COMMIT 0
a3663d8 A---- a334c58 0 0 0 0x0 COMMIT 0
a3665b8 A---- a335278 0 0 0 0x0 COMMIT 0
a366798 A---- a335898 2 0 0 0x0 COMMIT 0
a366d38 A---- a336af8 0 0 0 0x0 COMMIT 0
6 active, 128 total, 9 maximum concurrent
onstat -k
Locks
address wtlist owner lklist type tblsnum rowid key#/bsiz
a09185c 0 a335898 0 HDR+S 100002 20a 0
a0918b0 0 a335898 a09185c HDR+S 100002 204 0
2 active, 2000 total, 2048 hash buckets, 0 lock table overflows
In the example in Figure 13-20, a user is selecting a row from two tables. The user
holds the following locks:
v A shared lock on one database
v A shared lock on another database
You can find the session-id of the transaction by matching the address in the
userthread column of the onstat -x output with the address column in the onstat
-u output. The sessid column of the same line in the onstat -u output provides the
session id.
For example, Figure 13-21 on page 13-58 shows the address a335898 in the
userthread column of the onstat -x output. The output line in onstat -u with the
same address shows the session id 15 in the sessid column.
Transactions
address flags userthread locks beginlg curlog logposit isol retrys coord
a366018 A---- a334018 0 0 1 0x22b048 COMMIT 0
a3661f8 A---- a334638 0 0 0 0x0 COMMIT 0
a3663d8 A---- a334c58 0 0 0 0x0 COMMIT 0
a3665b8 A---- a335278 0 0 0 0x0 COMMIT 0
a366798 A---- a335898 2 0 0 0x0 COMMIT 0
a366d38 A---- a336af8 0 0 0 0x0 COMMIT 0
6 active, 128 total, 9 maximum concurrent
onstat -u
address flags sessid user tty wait tout locks nreads nwrites
a334018 ---P--D 1 informix - 0 0 0 20 6
a334638 ---P--F 0 informix - 0 0 0 0 1
a334c58 ---P--- 5 informix - 0 0 0 0 0
a335278 ---P--B 6 informix - 0 0 0 0 0
a335898 Y--P--- 15 informix 1 a843d70 0 2 64 0
a336af8 ---P--D 11 informix - 0 0 0 0 0
6 active, 128 total, 17 maximum concurrent
For a transaction executing in loosely coupled mode, the first position of the flags
column in the onstat -u output might display a value of T. This T value indicates
that one branch within a global transaction is waiting for another branch to
complete. This situation could occur if two different branches in a global
transaction, both using the same database, tried to work on the same global
transaction simultaneously.
For a transaction executing in tightly coupled mode, this T value does not occur
because the database server shares one transaction structure among all branches
that access the same database in the global transaction. Only one branch is
attached and active at a time, and it does not wait for locks because the
transaction owns all the locks that the different branches hold.
To obtain information about the last SQL statement that each session executed,
issue the onstat -g sql command with the appropriate session ID.
Figure 13-22 on page 14-1 shows sample output for this option using the same
session ID obtained from the onstat -u sample in Figure 13-21.
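For example, to display the last SQL statement that session 15 executed:
onstat -g sql 15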
The onperf utility provides the following advantages over the onstat utility:
v Displays metric values graphically in real time
v Allows you to choose which metrics to monitor
v Allows you to scroll back to previous metric values to analyze a trend
v Allows you to save performance data to a file for review at a later time
You cannot use the onperf utility on High-Availability Data Replication (HDR)
secondary servers, remote standalone (RS) secondary servers, or shared disk (SD)
secondary servers.
An onperf tool is a Motif window that an onperf process manages, as Figure 14-1
shows.
Figure 14-1. Data flow from shared memory to an onperf tool window
The onperf utility allows designated metrics to be continually buffered. The data
collector writes these metrics to a circular buffer called the data-collector buffer.
When the buffer becomes full, the oldest values are overwritten as the data
collector continues to add data. The current contents of the data-collector buffer are
saved to a history file, as Figure 14-2 illustrates.
Figure 14-2. Saving the contents of the data-collector buffer to a history file
The onperf utility uses either a binary format or an ASCII representation for data
in the history file. The binary format is host-dependent and allows data to be
written quickly. The ASCII format is portable across platforms.
You have control over the set of metrics stored in the data-collector buffer and the
number of samples. You could buffer all metrics; however, this action might
consume more memory than is feasible. A single metric measurement requires 8
bytes of memory. For example, if the sampling frequency is one sample per second,
then to buffer 200 metrics for 3,600 samples requires approximately 5.5 megabytes
of memory. If this process represents too much memory, you must reduce the
depth of the data-collector buffer, the sampling frequency, or the number of
buffered metrics.
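The estimate in the previous paragraph follows directly from the stated 8-byte
measurement size:
200 metrics * 3,600 samples * 8 bytes = 5,760,000 bytes (approximately 5.5 megabytes)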
Figure 14-3. Flow of data from a history file to an onperf tool window
When the database server is installed and running in online mode, you can bring
up onperf tools either on the computer that is running the database server or on a
remote computer or terminal that can communicate with your database server
instance. Figure 14-4 illustrates both possibilities. In either case, the computer that
is running the onperf tools must support the X Window System and the mwm window
manager.
Figure 14-4. Running onperf tools on the UNIX platform that runs the database server or on a client platform running X and mwm
Set the LD_LIBRARY_PATH environment variable to the appropriate value for the
Motif libraries on the computer that is running onperf.
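For example, on a UNIX system, a typical setting might look like the following;
the library path is illustrative, so substitute the actual location of the Motif
libraries on your system:
LD_LIBRARY_PATH=/usr/lib/Motif:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH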
You can monitor multiple database server instances from the same Motif client by
invoking onperf for each database server, as the following example shows:
INFORMIXSERVER=instance1 ; export INFORMIXSERVER; onperf
INFORMIXSERVER=instance2 ; export INFORMIXSERVER; onperf
...
To exit from the onperf utility, use the Close option to close each tool window, use
the Exit option of a tool, or choose Window Manager > Close.
The graph-tool windows have no hierarchy; you can create and close these
windows in any order.
Graph tool
The graph tool is the principal onperf interface. Use the graph tool to display any
set of database server metrics that the onperf data collector obtains from shared
memory.
Figure 14-5 shows a graph tool that displays a graph of metrics for ISAM calls.
Figure 14-5. A graph tool displaying a graph of metrics for ISAM calls
You cannot bring up a graph-tool window from a query-tree tool, a status tool, or
one of the activity tools.
If the configuration of an initial graph-tool has not yet been saved or loaded from
disk, onperf does not display the name of a configuration file in the title bar.
If you open a historical data file, for example one named caselog.23April.2PM, in
this graph-tool window, the title bar displays caselog.23April.2PM.
The metric class is the generic database server component or activity that the metric
monitors. The metric scope depends on the metric class. In some cases, the metric
scope indicates a particular component or activity. In other cases, the scope
indicates all activities of a given type across an instance of the database server.
The Metrics menu has a separate option for each class of metrics. For more
information about metrics, see “Why you might want to use onperf” on page
14-12.
When you choose a class, such as Server, you see a dialog box like the one in
Figure 14-6.
The Select Metrics dialog box contains three list boxes. The list box on the left
displays the valid scope levels for the selected metrics class. For example, when
the scope is set to Server, the list box displays the dbservername of the database
server instance that is being monitored. When you select a scope from this list,
onperf displays the individual metrics that are available within that scope in the
middle list box. You can select one or more individual metrics from this list and
add them to the display by clicking Add. To remove them from the display, click
Remove.
Tip: You can display metrics from more than one class in a single graph-tool
window. For example, you might first select ISAM Calls, Opens, and Starts from
the Server class. When you choose the Option menu in the same dialog box, you
can select another metric class without exiting the dialog box. For example, you
might select the Chunks metric class and add the Operations, Reads, and Writes
metrics to the display.
The Filter button in the dialog box brings up an additional dialog box in which
you can filter long text strings shown in the Metrics dialog box. The Filter dialog
box also lets you select tables or fragments for which metrics are not currently
displayed.
If the specified configuration file does not exist, onperf prompts for one.
Save Configuration
Saves the current configuration to a file. If no configuration file is currently
specified, onperf prompts for one.
The Configuration dialog box provides the following options for configuring the
display.
Option Use
History Buffer Configuration
Allows you to select a metric class and metric scope to include in the
data-collector buffer. The data collector gathers information about all
metrics that belong to the indicated class and scope.
Graph Display Options
Allows you to adjust the size of the graph portion that scrolls off to the left
when the display reaches the right edge, the initial time interval that the
graph is to span, and the frequency with which the display is updated.
Data Collector Options
Controls the collection of data. The sample interval indicates the amount of
time to wait between recorded samples. The history depth indicates the
number of samples to retain in the data-collector buffer. The save mode
indicates whether the data-collector data is saved in binary or ASCII format.
The time interval to which you can scroll back is the lesser of the following
intervals:
v The time interval over which the metric has been displayed
v The history interval that the graph-tool Configuration dialog box specifies
The length of time you can scroll back through cannot exceed the depth of the
data-collector buffer.
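For example, if a metric has been displayed for 10 minutes but the history
interval that the Configuration dialog box specifies is 5 minutes, you can scroll
back only 5 minutes.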
For more information, see “The graph-tool Configure menu and the
Configuration dialog box” on page 14-8.
Figure 14-8 on page 14-11 illustrates the maximum scrollable intervals for metrics
that span different time periods.
Figure 14-8. Maximum scrollable intervals for metrics that span different time periods
Query-tree tool
The query-tree tool contains options for monitoring the performance of individual
queries.
The query-tree tool is a separate executable tool that does not use the data-collector
process. You cannot save query-tree tool data to a file.
This tool includes a Select Session button and a Quit button. When you select a
session that is running a query, the large detail window displays the SQL operators
that constitute the execution plan for the query. The query-tree tool represents each
SQL operator with a box. Each box includes a dial that indicates rows per second
and a number that indicates input rows. In some cases, not all the SQL operators
can be represented in the detail window. The smaller window shows the SQL
operators as small icons.
The Quit button allows you to exit from the query-tree tool.
Status tool
The status tool enables you to select metrics to store in the data-collector buffer. In
addition, you can use this tool to save the data currently held in the data-collector
buffer to a file.
Activity tools
Activity tools are specialized forms of the graph tool that display instances of the
specific activity, based on a ranking of the activity by some suitable metric.
The activity tools use the bar-graph format. You cannot change the scale of an
activity tool manually; onperf always sets this value automatically.
The Graph menu provides you with options for closing, printing, and exiting the
activity tool.
The onperf utility allows you to scroll back over a time interval, as explained in
“Displaying recent-history values” on page 14-10.
For example, if you detect a degradation in database server response time, it might
not be obvious from looking at the current metrics which value is responsible for
the slowdown. The performance degradation might also be sufficiently gradual
that you cannot detect a change by observing the recent history of metric values.
To allow for comparisons over longer intervals, onperf allows you to save metric
values to a file, as explained in “Status tool” on page 14-11.
The following sections describe the onperf metric classes. Each section indicates
the scope levels available and describes the metrics within each class.
The approach taken here is to describe each metric without speculating on what
specific performance problems it might indicate. Through experimentation, you can
determine which metrics best monitor performance for a specific database server
instance.
Disk-chunk metrics
The onperf utility can display metrics for a specific disk chunk.
The disk-chunk metrics take the path name of a chunk as the metric scope.
Disk-spindle metrics
The onperf utility can display metrics for a disk spindle.
The disk-spindle metrics take the path name of a disk device or operating-system
file as the metric scope.
Physical-processor metrics
The onperf utility can display CPU metrics.
Virtual-processor metrics
The onperf utility can display metrics for a virtual-processor class.
These metrics take a virtual-processor class as a metric scope (cpu, aio, kaio, and
so on). Each metric value represents a sum across all instances of this
virtual-processor class.
Session metrics
The onperf utility can display metrics for an active session.
Tblspace metrics
The onperf utility can display metrics for a particular tblspace.
A tblspace name is composed of the database name, a colon, and the table name
(database:table).
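For example, the customer table in the stores_demo demonstration database
appears as stores_demo:customer.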
For fragmented tables, the tblspace represents the sum of all fragments in a table.
To obtain measurements for an individual fragment in a fragmented table, use the
Fragment Metric class.
Fragment metrics
The onperf utility can display metrics for an individual table fragment.
These metrics take the dbspace of an individual table fragment as the metric scope.
The following case study illustrates a situation in which the disks are overloaded.
This study shows the steps taken to isolate the symptoms and identify the problem
based on an initial report from a user, and it describes the needed correction.
A database application that is not achieving the desired throughput is being
examined to see how performance can be improved. The operating-system
monitoring tools reveal that a high proportion of process time is spent idle,
waiting for I/O. The database server administrator increases the number of CPU
VPs to make more processors available to handle concurrent I/O. However,
throughput does not increase, which indicates that one or more disks are
overloaded.
To verify the I/O bottleneck, the database server administrator must identify the
overloaded disks and the dbspaces that reside on those disks.
To identify overloaded disks and the dbspaces that reside on those disks:
1. To check the asynchronous I/O (AIO) queues, use onstat -g ioq. Figure A-1
shows the output.
In Figure A-1, the maxlen and totalops columns show significant results:
v The maxlen column shows the largest backlog of I/O requests to accumulate
within the queue. The last three queues are much longer than any other
queue in this column listing.
2. To determine the device that is experiencing the I/O load, use onstat -g iof.
The following output excerpt shows the activity on /dev/infx5:
gfd pathname bytes read page reads bytes write page writes io/s
3 /dev/infx5 85456896 41727 207394816 101267 572.9
op type count avg. time
seeks 0 N/A
reads 13975 0.0015
writes 51815 0.0018
kaio_reads 0 N/A
kaio_writes 0 N/A
Depending on how your chunks are arranged, several queues can be associated
with the same device.
3. To determine the dbspaces that account for the I/O load, use onstat -d, as
Figure A-3 shows.
Dbspaces
address number flags fchunk nchunks flags owner name
c009ad00 1 1 1 1 N informix rootdbs
c009ad44 2 2001 2 1 N T informix tmp1dbs
c009ad88 3 1 3 1 N informix oltpdbs
c009adcc 4 1 4 1 N informix histdbs
c009ae10 5 2001 5 1 N T informix tmp2dbs
c009ae54 6 1 6 1 N informix physdbs
c009ae98 7 1 7 1 N informix logidbs
c009aedc 8 1 8 1 N informix runsdbs
c009af20 9 1 9 3 N informix acctdbs
9 active, 32 total
Chunks
address chk/dbs offset size free bpages flags pathname
c0099574 1 1 500000 10000 9100 PO- /dev/infx2
c009960c 2 2 510000 10000 9947 PO- /dev/infx2
c00996a4 3 3 520000 10000 9472 PO- /dev/infx2
c009973c 4 4 530000 250000 242492 PO- /dev/infx2
c00997d4 5 5 500000 10000 9947 PO- /dev/infx4
c009986c 6 6 510000 10000 2792 PO- /dev/infx4
c0099904 7 7 520000 25000 11992 PO- /dev/infx4
c009999c 8 8 545000 10000 9536 PO- /dev/infx4
c0099a34 9 9 250000 450000 4947 PO- /dev/infx5
c0099acc 10 9 250000 450000 4997 PO- /dev/infx6
c0099b64 11 9 250000 450000 169997 PO- /dev/infx7
11 active, 32 total
In the Chunks output, the pathname column indicates the disk device. The
chk/dbs column indicates the numbers of the chunk and dbspace that reside on each device.
The Dbspaces output shows the name of the dbspace that is associated with each
dbspace number. In this case, all three of the overloaded disks are part of the
acctdbs dbspace.
Although the original disk configuration allocated three entire disks to the acctdbs
dbspace, the activity within this dbspace suggests that three disks are not enough.
Because the load is about equal across the three disks, it does not appear that the
tables are necessarily laid out badly or improperly fragmented. However, you
might get better performance by adding fragments on other disks to one or more
large tables in this dbspace or by moving some tables to other disks with lighter
loads.
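For example, if one of the large tables in acctdbs uses a round-robin distribution
scheme, a statement of the following form adds a fragment in a dbspace on a
lightly loaded disk; the table name and dbspace name here are illustrative:
ALTER FRAGMENT ON TABLE acct_history ADD dbspace5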
Related reference:
onstat -g iof command: Print asynchronous I/O statistics (Administrator's
Reference)
onstat -g ioa command: Print combined onstat -g information (Administrator's
Reference)
onstat -g ioq command: Print I/O queue information (Administrator's
Reference)
onstat -g iov command: Print AIO VP statistics (Administrator's Reference)
onstat -d command: Print chunk information (Administrator's Reference)
Accessibility features
The following list includes the major accessibility features in IBM Informix
products. These features support:
v Keyboard-only operation.
v Interfaces that are commonly used by screen readers.
v The attachment of alternative input and output devices.
Keyboard navigation
This product uses standard Microsoft Windows navigation keys.
In dotted decimal format, each syntax element is written on a separate line. If two
or more syntax elements are always present together (or always absent together),
the elements can appear on the same line, because they can be considered as a
single compound syntax element.
Each line starts with a dotted decimal number; for example, 3 or 3.1 or 3.1.1. To
hear these numbers correctly, make sure that your screen reader is set to read
punctuation. All syntax elements that have the same dotted decimal number (for
example, all syntax elements that have the number 3.1) are mutually exclusive
alternatives. If you hear the lines 3.1 USERID and 3.1 SYSTEMID, your syntax can
include either USERID or SYSTEMID, but not both.
The dotted decimal numbering level denotes the level of nesting. For example, if a
syntax element with dotted decimal number 3 is followed by a series of syntax
elements with dotted decimal number 3.1, all the syntax elements numbered 3.1
are subordinate to the syntax element numbered 3.
The following words and symbols are used next to the dotted decimal numbers:
? Specifies an optional syntax element. A dotted decimal number followed
by the ? symbol indicates that all the syntax elements with a
corresponding dotted decimal number, and any subordinate syntax
elements, are optional. If there is only one syntax element with a dotted
decimal number, the ? symbol is displayed on the same line as the syntax
element (for example, 5? NOTIFY). If there is more than one syntax element
with a dotted decimal number, the ? symbol is displayed on a line by
itself, followed by the syntax elements that are optional. For example, if
you hear the lines 5 ?, 5 NOTIFY, and 5 UPDATE, you know that syntax
elements NOTIFY and UPDATE are optional; that is, you can choose one or
none of them. The ? symbol is equivalent to a bypass line in a railroad
diagram.
! Specifies a default syntax element. A dotted decimal number followed by
the ! symbol and a syntax element indicates that the syntax element is the
default option for all syntax elements that share the same dotted decimal
number. Only one of the syntax elements that share the same dotted
decimal number can specify a ! symbol. For example, if you hear the lines
2? FILE, 2.1! (KEEP), and 2.1 (DELETE), you know that (KEEP) is the
default option for the FILE keyword. In this example, if you include the
FILE keyword but do not specify an option, default option KEEP is applied.
A default option also applies to the next higher dotted decimal number. In
this example, if the FILE keyword is omitted, default FILE(KEEP) is used.
However, if you hear the lines 2? FILE, 2.1, 2.1.1! (KEEP), and 2.1.1
(DELETE), the default option KEEP only applies to the next higher dotted
decimal number, 2.1 (which does not have an associated keyword), and
does not apply to 2? FILE. Nothing is used if the keyword FILE is omitted.
* Specifies a syntax element that can be repeated zero or more times. A
dotted decimal number followed by the * symbol indicates that this syntax
element can be used zero or more times; that is, it is optional and can be
repeated. For example, if you hear the line 5.1* data-area, you know that
you can include more than one data area or you can include none. If you
hear the lines 3*, 3 HOST, and 3 STATE, you know that you can include
HOST, STATE, or both.
Notes:
1. If a dotted decimal number has an asterisk (*) next to it and there is
only one item with that dotted decimal number, you can repeat that
same item more than once.
2. If a dotted decimal number has an asterisk next to it and several items
have that dotted decimal number, you can use more than one item
from the list, but you cannot use the items more than once each. In the
previous example, you can write HOST STATE, but you cannot write HOST
HOST.
3. The * symbol is equivalent to a loop-back line in a railroad syntax
diagram.
+ Specifies a syntax element that must be included one or more times. A
dotted decimal number followed by the + symbol indicates that this syntax
element must be included one or more times. For example, if you hear the
line 6.1+ data-area, you must include at least one data area. If you hear
the lines 2+, 2 HOST, and 2 STATE, you know that you must include HOST,
STATE, or both. As for the * symbol, you can repeat a particular item if it is
the only item with that dotted decimal number. The + symbol, like the *
symbol, is equivalent to a loop-back line in a railroad syntax diagram.
IBM may not offer the products, services, or features discussed in this document in
other countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not grant you
any license to these patents. You can send license inquiries, in writing, to:
For license inquiries regarding double-byte (DBCS) information, contact the IBM
Intellectual Property Department in your country or send inquiries, in writing, to:
The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law: INTERNATIONAL
BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS"
WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE. Some states do not allow disclaimer of express or implied warranties in
certain transactions, therefore, this statement may not apply to you.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:
IBM Corporation
J46A/G4
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.
The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement or any equivalent agreement
between us.
All statements regarding IBM's future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.
All IBM prices shown are IBM's suggested retail prices, are current and are subject
to change without notice. Dealer prices may vary.
This information is for planning purposes only. The information herein is subject to
change before the products described become available.
This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
Each copy or any portion of these sample programs, or any derivative work, must
include a copyright notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp.
Sample Programs.
© Copyright IBM Corp. _enter the year or years_. All rights reserved.
If you are viewing this information softcopy, the photographs and color
illustrations may not appear.
This Software Offering does not use cookies or other technologies to collect
personally identifiable information.
If the configurations deployed for this Software Offering provide you as customer
the ability to collect personally identifiable information from end users via cookies
and other technologies, you should seek your own legal advice about any laws
applicable to such data collection, including any requirements for notice and
consent.
For more information about the use of various technologies, including cookies, for
these purposes, see IBM’s Privacy Policy at http://www.ibm.com/privacy and
IBM’s Online Privacy Statement at http://www.ibm.com/privacy/details, in the
section entitled “Cookies, Web Beacons and Other Technologies” and the “IBM
Software Products and Software-as-a-Service Privacy Statement” at
http://www.ibm.com/software/info/product-privacy.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of
International Business Machines Corp., registered in many jurisdictions worldwide.
Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the web at "Copyright and
trademark information" at http://www.ibm.com/legal/copytrade.shtml.
Adobe, the Adobe logo, and PostScript are either registered trademarks or
trademarks of Adobe Systems Incorporated in the United States, and/or other
countries.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Oracle and/or its affiliates.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Index X-3
Configuration parameters (continued) Configuration parameters (continued)
INFORMIXOPCACHE 5-26 VP_MEMORY_CACHE_KB 3-24
LOCKBUFF 4-2 VPCLASS 3-4, 3-5, 3-6, 3-7, 3-8
LOCKS 4-2, 4-15, 8-12 CONNECT statement 5-2
LOGBUFF 4-15, 5-8, 5-20, 5-34 Connections
LOGFILES 5-32 CPU 3-25, 3-26
LOGSIZE 5-32, 5-35, 5-36 improving performance 3-16
LOW_MEMORY_RESERVE 4-15, 5-43 improving performance with MaxConnect 3-26
LTAPEBLK 5-42 multiplexed 3-25
LTAPEDEV 5-42 specifying number of 3-14
LTAPESIZE 5-42 type, ipcshm 3-1, 3-14
LTXEHWM 5-38 type, specifying 3-13, 3-14
LTXHWM 5-38 Constraints
MAX_FILL_DATA_PAGES 6-46 foreign-key 6-32
MAX_PDQPRIORITY 3-3, 3-11, 12-9, 12-11, 12-15, 13-33 referential 6-32
MIRROR 5-8 Contention
MIRROROFFSET 5-8 cost of reading a page 10-25
MIRRORPATH 5-8 reducing with fragmentation 9-3
MULTIPROCESSOR 3-10 Contiguous
NETTYPE 3-1, 3-14, 3-16, 3-18, 3-20, 4-6 disk space, allocation 6-23
NS_CACHE 3-16 extents
NUMFDSERVERS 3-16 advantage of performance 5-21, 6-13, 6-19, 6-24, 6-25
OFF_RECVRY_THREADS 5-42 space, eliminating interleaved extents 6-25
ON_RECVRY_THREADS 5-42 Cooked file space 5-2, 5-3
ONDBSPACEDOWN 5-33 performance using concurrent I/O 5-2, 5-4
ONLIDX_MAXMEM 7-18 performance using direct I/O 5-2, 5-3
OPCACHEMAX 5-26 Correlated subquery
OPT_GOAL 13-35 effect of PDQ 12-5
OPTCOMPIND 3-3, 3-10, 11-12, 12-13 Cost of user-defined routine 13-38, 13-39
PC_HASHSIZE 4-21, 10-33 Cost per transaction 1-7
PC_POOLSIZE 4-21, 10-33 CPU
PHYSBUFF 4-2, 4-15, 5-34 utilization, improving with MaxConnect 3-26
PHYSFILE 5-32 VP class and NETTYPE 3-14
PLCY_HASHSIZE 4-21 VPs
PLCY_POOLSIZE 4-21 configuration parameters affecting 3-5
PLOG_OVERFLOW_PATH 5-43 effect on CPU utilization 3-20
RESIDENT 4-17 limited by MAX_PDQPRIORITY 3-11
ROOTNAME 5-8 limited by PDQ priority 3-3
ROOTOFFSET 5-8 optimal number 3-10
ROOTPATH 5-8 used by PDQ 12-7
ROOTSIZE 5-8 VPs and fragmentation goals 9-2
RTO_SERVER_RESTART 5-30, 5-31, 5-41, 5-43 CPU VPs
SBSPACENAME 5-14, 5-20 adding automatically 3-20
SBSPACETEMP 5-14, 5-15 CREATE CLUSTER INDEX statement 7-11
SHMADD 4-2 CREATE CLUSTERED INDEX statement 3-3
SHMBASE 4-8 CREATE FUNCTION statement
SHMMAX 4-18, 4-19 selectivity and cost 13-39
SHMTOTAL 4-2, 4-18 virtual-processor class 3-4
SHMVIRT_ALLOCSEG 4-19 CREATE INDEX ONLINE statement 7-16, 7-17
SHMVIRTSIZE 4-2, 4-4, 4-19 CREATE INDEX statement
SINGLE_CPU_VP 3-10 attached index 9-10
STACKSIZE 4-20 detached index 9-12
STAGEBLOB 5-25 FILLFACTOR clause 7-5
STMT_CACHE 13-42 generic B-tree index 7-23
STMT_CACHE_HITS 4-21, 4-27, 4-28, 4-29, 4-30, 4-32, parallel build 12-3
4-34, 4-36 TO CLUSTER clause 6-25
STMT_CACHE_NOLIMIT 4-21, 4-27 USING clause 7-24
STMT_CACHE_NUMPOOL 4-34 CREATE PROCEDURE statement
STMT_CACHE_SIZE 4-21, 4-31, 4-33 SPL routines, optimizing 10-31
TAPEBLK 5-42 SQL, optimizing 10-31
TAPEDEV 5-42 CREATE TABLE statement
TAPESIZE 5-42 blobspace assignment 5-15
TBLTBLFIRST 6-10 creating system catalog table 5-2
TBLTBLNEXT 6-10 extent sizes 6-20
USELASTCOMMITTED 8-6 fragmenting 9-10, 9-12
USRC_HASHSIZE 4-21 with partitions 9-10, 9-12
USRC_POOLSIZE 4-21 PUT clause 6-19
Index X-5
Denormalizing Disk I/O (continued)
data model 6-42 TPC-A benchmark 1-4
tables 6-43 unbuffered devices 5-12
Detached index Disks
defined 9-12 identifying overloaded ones A-1
extent size 7-5 Distinct data types 7-21
Dimensional tables, defined 13-21 DISTINCT keyword 13-20
Direct I/O Distributed queries
confirming use of 5-4 improving performance 13-29
enabling 5-4 used with PDQ 12-6
overview 5-2, 5-3 Distribution scheme
DIRECT_IO configuration parameter 5-3, 5-4 defined 9-1
DIRECTIVES configuration parameter 11-12, 11-13 designing 9-7, 9-8
Dirty Read isolation level 5-27, 8-10 methods described 9-6, 9-7
Disabilities, visual Dotted decimal format of syntax diagrams B-1
reading syntax diagrams B-1 DRAUTO configuration parameter 5-44
Disability B-1 DRINTERVAL configuration parameter 5-44
Disk DRLOSTFOUND configuration parameter 5-44
and saturation 5-1 DROP DISTRIBUTIONS keywords, in UPDATE STATISTICS
compression 6-47 statement 13-13
critical data 5-5 DROP INDEX ONLINE statement 7-16, 7-17
layout Dropping indexes 7-12
and table isolation 6-2 DRTIMEOUT configuration parameter 5-44
layout, and backup 6-5, 9-4 DS_HASHSIZE configuration parameter 4-21, 4-24
partitions and chunks 5-2 DS_MAX_QUERIES configuration parameter 3-12, 7-19, 13-34
space, storing TEXT and BYTE data 5-18 changing value 12-8
utilization 1-11 index build performance 7-18
Disk access limit query number 12-12
cost of reading row 10-25 MGM 12-6
performance 13-31 DS_MAX_SCANS configuration parameter 3-12, 12-6, 12-11,
performance effect of 10-25 13-34
sequential 13-31 changing value 12-8
sequential forced by query 13-3 MGM 12-6
Disk extent scan threads 12-6
for dbspaces 6-19 DS_NONPDQ_QUERY_MEM configuration parameter 4-8,
for sbspaces 5-21 5-12, 7-19, 13-34
Disk I/O DS_POOLSIZE configuration parameter 4-21, 4-24
allocating AIO VPs 3-8 DS_TOTAL_MEMORY configuration parameter 4-12, 7-18,
background database server activities 1-1 7-19, 13-34
balancing 5-10, 5-14 changing value 12-8
binding AIO VPs 3-8 DS_MAX QUERIES 3-12
blobspace data and 5-16 estimating value 4-12, 12-11
BUFFERPOOL configuration parameter 4-10 MAX_PDQPRIORITY 12-8
contention 10-25 MGM 12-6
effect of UNIX configuration 3-3 setting for DSS applications 12-15
effect of Windows configuration 3-3 setting for OLTP 12-11
effect on performance 5-1 DSS applications
for temporary tables and sort files 5-8 configuration parameter settings 4-9
hot spots, definition of 5-1 DSS resources
in query plan cost 10-1, 10-9, 10-19 limiting 12-8
isolating critical data 5-5 dtcurrent() function, ESQL/C, to get current date and
KAIO 3-8 time 1-7
light scans 5-27 Duplicate index keys, performance effects of 7-11
lightweight I/O 5-23 Dynamic lock allocation 4-2, 4-15, 13-49
log buffer size, effect of 5-7 Dynamic log
logical log 5-24 file allocation
mirroring, effect of 5-6 benefits 5-37
monitoring preventing hangs from rollback of long
AIO VPs 3-8 transaction 5-37
nonsequential access, effect of 7-10 size of new log 5-37
query response time 1-5
reducing 4-10, 6-43
sbspace data and 5-20
sequential scans 5-26
E
Environment variables
simple large objects 5-16
affecting
smart large objects 5-21, 5-23
CPU 3-3
to physical log 5-7
I/O 5-12
Index X-7
Formula (continued) Fragmentation (continued)
buffer pool size 4-10 strategy
connections per poll thread 3-14 ALTER FRAGMENT ATTACH clause 9-20, 9-25
CPU utilization 1-9 ALTER FRAGMENT DETACH clause 9-26, 9-27
data buffer size, estimate of 4-4 distribution schemes for fragment elimination 9-14
decision-support queries 12-11 finer granularity of backup and restore 9-4
disk utilization 1-11 how data used 9-4
DS total memory 4-14 improved performance 9-3
extends, upper limit 6-24 improving 9-9
file descriptors 3-2 increased availability of data 9-3
index extent size 7-5 indexes 9-10
index pages 6-5, 7-5 planning 9-1
initial stack size 4-20 reduced contention 9-3
LOGSIZE 5-36 space issues 9-1
memory grant basis 12-11 temporary tables 9-13
minimum DS memory 4-13 sysfragments system catalog 9-28
number of remainder pages 6-5 TEMP TABLE clause 9-13
operating-system shared memory 4-6 temporary tables 9-13
paging delay 1-10 Freeing shared memory 4-7
partial remainder pages 6-5 Functional index
PDQ resources allocated 3-11 creating 7-25, 7-26
quantum of memory 4-12, 12-6 DataBlade modules 7-26
rows per page 6-5 user-defined function 7-1
scan threads 12-6 using 7-25, 13-2
per query 3-12, 12-11 Functions, ESQL/C, dtcurrent() 1-7
semaphores 3-1
service time 1-8
shared memory
message portion size 4-6
G
Generic B-tree
resident portion size 4-4
index
virtual portion size 4-4
extending 7-23
shared-memory estimate 12-11
parallel UDRs 13-38
shared-memory increment size 4-18
user-defined data 7-1
sort operation, costs 10-24
when to use 7-23
threshold for free network buffers 3-18
Global file descriptor queues 3-22
Fragment
Graph tool (onperf)
elimination
bar graph 14-8
defined 9-14
Configure menu 14-8
equality expressions 9-16
defined 14-3, 14-5
fragmentation expressions 9-14
Graph menu 14-6
range expressions 9-15
Graph tool (onperf)
ID
View menu 14-8
and index entry 7-5
metric
defined 9-12
changing line color and width 14-8
fragmented table 9-5
changing scale 14-10
space estimates 9-5
Metrics menu 14-7
nonoverlapping
pie chart 14-8
multiple columns 9-18
Tools menu 14-9
single column 9-17
greaterthan() function 7-29
overlapping
greaterthanorequal() function 7-29
single column 9-17
GROUP BY
FRAGMENT BY clause 9-10
clause, composite index used 13-20
Fragmentation
clause, indexes 10-22, 13-32
altering fragments 9-27
clause, MGM memory 12-6
FRAGMENT BY EXPRESSION clause 9-10, 9-12
goals 9-1
improving ATTACH operation 9-20, 9-24
improving DETACH operation 9-26, 9-27 H
index restrictions 9-13 Hash join
indexes, attached 9-10 in directives 11-2, 11-5
indexes, detached 9-12 more memory for 5-12, 13-34
monitoring I/O requests 9-28 plan example 10-2
monitoring with onstat 9-28 temporary space 5-12
next-extent size 9-9 when used 10-3
no data migration during ATTACH 9-22 HDR_TXN_SCOPE configuration parameter 5-44
reducing contention 9-3 High-Performance Loader 6-1
smart large objects 9-6 Home pages in indexes 6-5
Index X-9
Join (continued) Lock (continued)
defined 10-2 determining owner 8-13
directive precedence 11-12 dynamic allocation 4-2, 4-15
effects of OPTCOMPIND 10-23 isolation levels and join 10-2
hash 11-10, 12-13, 12-18 promotable 8-9
hash, in directives 11-2, 11-5 retaining update locks 8-9
isolation level effect 10-2 specifying mode 8-4
nested-loop 11-5, 11-7, 11-10 Lock table
OPTCOMPIND 12-13 specifying initial size 4-2, 4-15
optimizer choosing 11-2 LOCKBUFF configuration parameter 4-2
replacing index use 10-4 Locking
selected by optimizer 10-1 byte-range 8-16
star 13-21 Locks
subquery 10-15 byte 8-10
running UPDATE STATISTICS on columns 13-16 byte-range 8-16
semi join 10-15 changing lock mode 8-4
SET EXPLAIN output 12-13 concurrency 8-1
star configuring 8-12
directives 11-8 database 8-3
subquery 12-10 defined 8-1
subquery flattening 10-15 duration 8-5
thread 12-1 exclusive 8-10
three-way 10-4 granularity 8-1
view 12-10 initial number 8-12
with column filters 10-5 intent 8-10
Join and sort, reducing impact 13-32 internal lock table 8-10
isolation level 8-5
key-value 8-2
K maximum number of 4-15, 8-12
maximum number of rows or pages 8-1
Kernel asynchronous I/O (KAIO) 3-8
monitoring 8-11, 8-12, 8-14, 8-18
Key-first scan 10-14
not waiting for 8-5
Key-only index scan 10-1, 10-14, 10-36
page 8-2
row and key 8-1
shared 8-10
L specifying a mode 8-4
Last committed isolation level 8-6 table 8-3
Latch types 8-10
defined 4-39 update 8-10
monitoring 4-39, 4-40 waiting for 8-5
Latency, disk I/O 10-25 LOCKS configuration parameter 4-2, 4-15, 8-12
Leaf index pages, defined 7-1 LOGBUFF configuration parameter 4-15, 5-8, 5-20, 5-34
Leaf scan mode 13-22, 13-27 LOGFILES configuration parameter
Least recently used effect on checkpoints 5-32
flushing 5-45 use in logical-log size determination 5-34
memory management algorithm 1-10 Logging
queues 5-40 checkpoints 5-34
thresholds for I/O to physical log 5-7 configuration effects 5-34
lessthan() function 7-29 dbspaces 5-36
lessthanorequal() function 7-29 disabling on temporary tables 5-39
Light scans I/O activity 5-20
advantages 5-27 LOGSIZE configuration parameter 5-35, 5-36
defined 5-27 none with SBSPACETEMP configuration parameter 5-14
isolation level 5-27 simple large objects 5-16, 5-37
Lightweight I/O smart large objects 5-37
specifying in onspaces 5-24 with SBSPACENAME configuration parameter 5-14
specifying with LO_NOBUFFER flag 5-24 Logical log
when to use 4-10, 5-23, 5-24 assigning files to a dbspace 5-5
LIKE test 13-3 buffer size 4-15
LO_DIRTY_READ flag 8-20 buffered 5-7
LO_NOBUFFER flag, specifying lightweight I/O 5-24 configuration parameters that affect 5-8
LO_TEMP flag data replication buffers 4-39
temporary smart large object 5-14 determining disk space allocated 5-35
LOAD and UNLOAD statements 6-1, 6-25, 6-27, 7-12 logging mode 5-7
Locating simple large objects 6-9 mirroring 5-7
Lock simple large objects 5-37
blobpage 5-16 size guidelines 5-35
Index X-11
Monitoring (continued) NETTYPE configuration parameter (continued)
memory usage 4-4 specifying connections 3-13, 3-14
memory utilization 2-9 Network
MGM resources 12-16 buffer pools 3-17, 3-18
network buffer size 3-19 buffer size 3-19
network buffers 3-18 common buffer pool 3-17, 3-19
OPCACHEMAX 5-24 communication delays 5-1
PDQ threads 12-16 connections 3-13
resources for a session 12-17 free-buffer threshold 3-18
sbspace metadata size 6-11, 6-12 monitoring buffers 3-18
sbspaces 6-13, 6-16 multiplexed connections 3-25
session memory 2-13, 4-4, 4-38, 13-43, 13-44, 13-45, 13-46, performance bottleneck 2-1
13-48, 13-52 performance issues 10-30
sessions 13-48, 13-51 private free-buffer pool 3-17, 3-18
smart large objects 6-13 NEXT SIZE clause 6-20
SPL routine cache 10-33 NFILE configuration parameters 3-2
SQL statement cache 4-30, 4-36, 13-46 NFILES configuration parameters 3-2
entries 13-46 NOFILE configuration parameters 3-2
pool 4-34, 4-35 NOFILES configuration parameters 3-2
size 4-31, 4-32, 4-35 NOVALIDATE keyword
STAGEBLOB blobspace 5-24 in ALTER TABLE statement 6-32
statement cache 4-29 in SET CONSTRAINTS statement 6-32
statement memory 2-13, 13-43, 13-46 in SET ENVIRONMENT statement 6-32
threads 13-48, 13-49, 13-50, 13-51 NS_CACHE configuration parameter 3-16
concurrent users 4-4 NUMFDSERVERS configuration parameter 3-16
per CPU VP 3-10 NVARCHAR data type 6-7
session 3-10, 12-16 table-size estimates 6-7
throughput 1-4
transaction 13-55
UDR cache 10-33
user sessions 13-57
O
Obtaining 6-5
user threads 13-55
OFF_RECVRY_THREADS configuration parameter 5-42
virtual processors 3-21, 3-22
OLTP applications
Monitoring database server
configuration parameter settings 4-9
active tblspaces 6-23
effects of MAX_PDQPRIORITY 3-11
blobspace storage 5-17
effects of PDQ 12-7
buffers 4-10
maximizing throughput with MAX_PDQPRIORITY 12-6,
sessions 2-13, 13-47
12-9
threads 2-7, 13-47
reducing DS_TOTAL_MEMORY 12-11
transactions 13-55
using MGM to limit DSS resources 12-6
virtual processors 3-21
OLTP query 1-2
Monitoring tools
ON_RECVRY_THREADS configuration parameter 5-42
database server utilities 2-4
ON-Bar utility
UNIX 2-3
configuration parameters 5-41
Windows 2-3
onaudit utility 5-45
Motif window manager 14-1, 14-3, 14-4
oncheck utility
Multiple residency
-pB option 2-11, 5-17
avoiding 3-1
-pe option 2-11, 6-14, 6-24, 6-25
Multiplexed connection
-pk option 2-11
defined 3-25
-pK option 2-11
how to use 3-25
-pl option 2-11
performance improvement 3-25
-pL option 2-11
MULTIPROCESSOR configuration parameter 3-10
-pp option 2-11
mwm window manager, required for onperf 14-4
-pP option 2-11
-pr option 2-11, 6-40
-ps option 2-11
N -pS option 2-11, 6-15
NCHAR data type 6-43 -pt option 2-11, 6-5
Negator functions 13-40 -pT option 2-11, 6-40
Nested-loop join 10-2, 11-5 checking index pages 7-21
NET VP class and NETTYPE 3-14 defined 2-11
NETTYPE configuration parameter 3-16, 4-4 displaying
connections 3-18 data-page versions 6-40
estimating LOGSIZE 5-36 free space 6-25
ipcshm connection 3-14, 4-6 free space in index 13-29
network free buffer 3-18 page size 6-40
poll threads 3-1, 3-20 size of table 6-5
Index X-13
Operating system (continued) Optimizer directives (continued)
NOFILE, NOFILES, NFILE, or NFILES configuration ORDERED 11-3, 11-5, 11-6
parameters 3-2 purpose 11-1
semaphores 3-1 SPL routines 11-13
SHMMAX configuration parameter 4-6 star-join 11-3, 11-8
SHMMNI configuration parameter 4-6 types 11-3
SHMSEG configuration parameter 4-6 USE_HASH 11-6
SHMSIZE configuration parameter 4-6 USE_NL 11-6
timing commands 1-6 using DIRECTIVES 11-12
Operator class using IFX_DIRECTIVES 11-12
defined 7-23, 7-28 ORDER BY clause 10-22, 13-32
OPT_GOAL configuration parameter 13-35 Ordered merge 13-37
OPT_GOAL environment variable 13-35 Outer join
OPTCOMPIND effect on PDQ 12-6
directives 11-12 Outer table 10-2
effects on query plan 10-22 Output description
preferred join plan 12-13 onstat -g ssc 4-36
OPTCOMPIND configuration parameter 3-3, 3-10, 12-13 Outstanding in-place alters
OPTCOMPIND environment variable 3-3, 3-10, 12-13 defined 6-40
Optical Subsystem 5-24 displaying 6-40
Optimization goal performance impact 6-40
default total query time 13-35 Overloaded disks A-1
precedence of settings 13-36
setting with directives 11-7, 13-36
total query time 13-35, 13-37
user-response and fragmented indexes 13-37
P
Page
user-response time 13-35, 13-37
cleaning 5-39
Optimization level
memory 1-10
default 13-35
obtaining size 6-5
setting to low 13-35
specifying size for a standard dbspace 4-10, 7-8
table scan versus index scan 13-37
Page size 6-5
Optimizer
obtaining 4-4
autoindex path 13-20
Paging
choosing query plan 11-2, 11-3
defined 1-10
composite index use 13-20
DS_TOTAL_MEMORY 12-11
data distributions used by 13-13
expected delay 1-10
hash join 10-3
monitoring 2-2, 4-10
index not used by 13-3
RESIDENT configuration parameter 4-2
optimization goal 11-7, 13-35
Parallel
SET OPTIMIZATION statement 13-35
access to table and simple large objects 5-20
specifying high or low level of optimization 13-35
backup and restore 5-41
Optimizer directives
executing UDRs 13-38
access method 11-4
index builds 12-3
ALL_ROWS 11-7
inserts and DBSPACETEMP 12-2
altering query plan 11-10
joins 12-9
AVOID_EXECUTE 13-1
scans 12-18, 13-38
AVOID_FULL 11-3, 11-4
sorts
AVOID_HASH 11-6
PDQ priority 13-33
AVOID_INDEX 11-4
when used 5-13
AVOID_INDEX_SJ 10-9, 11-4
Parallel database queries
AVOID_NL 11-3, 11-6
allocating resources 12-7
effect on views 11-6
controlling resources 12-15
embedded in queries 11-1
effect of table fragmentation 12-1
EXPLAIN 11-8, 13-1
fragmentation 9-1
EXPLAIN AVOID_EXECUTE 11-8
how used 12-2
external 11-15
monitoring resources allocated 12-16
external directives 11-1
priority
FIRST_ROWS 11-7
effect of remote database 12-6
FULL 11-4
queries that do not use PDQ 12-4
guidelines 11-3
remote tables 12-6
INDEX 11-4
scans 3-12
INDEX_SJ 11-4
SET PDQPRIORITY statement 12-13
join method 11-6
SPL routines 12-5
join order 11-3, 11-5
SQL 9-1
OPTCOMPIND 11-12
statements affected by PDQ 12-5
Optimizer directives
triggers 12-3, 12-4, 12-5
INDEX_SJ 10-9
user-defined routines 13-38
Index X-15
Query plans (continued) Response time (continued)
all rows 11-7 improving with multiplexed connections 3-25
altering with directives 11-3, 11-10 measuring 1-6
autoindex path 13-20 Response times
avoid query execution 11-8 SQL statement cache 13-40
chosen by optimizer 11-3 Root dbspace
collection-derived table 10-16 mirroring 5-6
disk accesses 10-5 Root index page 7-1
displaying 10-10, 13-30 ROOTNAME configuration parameter 5-8
first-row 11-7 ROOTOFFSET configuration parameter 5-8
fragment elimination 9-29, 12-18 ROOTPATH configuration parameter 5-8
how the optimizer chooses one 11-2 ROOTSIZE configuration parameter 5-8
indexes 10-7 Round-robin distribution scheme 9-7
join order 11-10 Round-robin fragmentation, smart large objects 9-6
pseudocode 10-4, 10-6 Row access cost 10-25
restrictive filters 11-2 Row pointer
row access cost 10-25 attached index 9-10
time costs 10-4, 10-23, 10-24, 10-25 detached index 9-12
Query statistics 10-11 in fragmented table 9-5
Query-tree tool (onperf) 14-3, 14-11 space estimates 7-5, 9-5
RTO_SERVER_RESTART configuration parameter 5-27, 5-30,
5-31, 5-41, 5-43
R RTO_SERVER_RESTART policy 5-30, 5-32, 5-35
R-tree index
defined 7-4, 7-25
using 7-22 S
Range expression, defined 9-15 Sampling
Range scan mode 13-27 in UPDATE STATISTICS LOW operations 13-18
Raw disk space 5-2, 5-3 sar command 2-3, 4-10
Read cache rate 4-10 Saturated disks 5-1
Read-ahead sbspace extents
configuring 5-26 performance 5-21, 6-12
defined 5-26 SBSPACENAME
Reclaiming empty extent space 6-26 configuration parameter 5-14
Recovery time objective logging 5-14
Recovery point objective 5-35 SBSPACENAME configuration parameter 5-20
Redundant data, introduced for performance 6-45 sbspaces
Redundant pairs, defined 10-9 configuration impacts 5-20
Referential constraints 6-32 creating 5-21
Regular expression, effect on performance 13-3 defined 5-6
Relational model estimating space 6-11
denormalizing 6-42 extent 5-21, 5-23
Remainder pages metadata requirements 6-11
tables 6-5 metadata size 6-11, 6-12
Remote database monitoring 6-13
effect on PDQPRIORITY 12-6 monitoring extents 6-14, 6-15
RENAME statement 10-32 SBSPACETEMP
Repeatable Read isolation level 5-27, 8-7, 10-22 no logging 5-14
Residency 4-17 SBSPACETEMP configuration parameter 5-14, 5-15
RESIDENT configuration parameter 4-17 Scans
Resident portion of shared memory 4-2, 4-4 bufferpool 5-27
Resizing table to reclaim empty space 6-26 DS_MAX_QUERIES 12-6
Resource utilization DS_MAX_SCANS 12-6
capturing data 2-3 first-row 10-15
CPU 1-9 key-only 10-1
defined 1-8 light 5-27
disk 1-11 lightweight I/O 5-23
factors that affect 1-12 limited by MAX_PDQPRIORITY 3-11
memory 1-10 limiting number 12-11
operating-system resources 1-7 limiting number of threads 12-11
performance 1-7 limiting PDQ priority 12-11
Resources memory-management system 1-10
critical 1-7 parallel 12-18
Response time parallel database query 3-12
actions that determine 1-5 read-ahead I/O 5-26
contrasted with throughput 1-5 sequential 5-26
improving with MaxConnect 3-26 skip-duplicate-index 10-15
Index X-17
Simple large objects (continued) SPL routines (continued)
logging 5-16 when executed 10-33
logical-log size 5-37 when optimized 10-31
Optical Subsystem 5-24 SQL statement cache
SINGLE_CPU_VP configuration parameter 3-10 changing size 4-32
slow alter algorithm cleaning 4-31
restrictions 6-36 defined 13-40
Smart large objects effect on prepared statements 13-41
ALTER TABLE 6-19 enabling 13-42
buffer pool 4-10, 5-20, 5-23 exact match 13-43
buffer pool usage 6-17 flushing 13-41
buffering recommendation 5-24 hits 4-21, 4-27, 4-28, 4-29, 4-30, 4-32, 4-33, 4-34, 4-36
changing characteristics 6-19 host variables 13-41
CREATE TABLE statement 6-19 memory 4-21
data integrity 6-17 memory limit 4-27, 4-33
DataBlade API functions 5-20, 5-21, 6-17, 6-23, 8-20 monitoring 4-29, 4-30, 4-36
disk I/O 5-20 monitoring dropped entries 13-46
ESQL/C functions 5-20, 5-21, 6-17, 6-23, 8-20 monitoring pools 4-34, 4-35
estimating space 6-11 monitoring session memory 13-43, 13-44, 13-45, 13-46
extent size 5-21, 5-22, 6-17, 6-23 monitoring size 4-31, 4-32, 4-35
fragmentation 6-17, 9-6 monitoring statement memory 2-13, 13-43, 13-46
I/O operations 5-23, 6-12 nonshared entries 4-30
I/O performance 4-10, 5-20, 5-23, 6-12 number of pools 4-34
last-access time 6-17 performance benefits 4-25, 13-40
lightweight I/O 4-10, 5-23, 5-24 response times 13-40
lock mode 6-17 size 4-21, 4-31, 4-33
logging status 6-17 specifying 13-42
logical-log size 5-37 STMT_CACHE configuration parameter 4-27, 13-42
mirroring chunks 5-6 STMT_CACHE environment variable 13-42
monitoring 6-13 STMT_CACHE_SIZE configuration parameter 4-32
sbspace name 6-17 when to enable 13-42
sbspaces 5-20 when to use 13-41
setting isolation levels 8-20 SQLCODE field of SQL Communications Area 6-44
size 6-17 sqlhosts file
specifying characteristics 6-19 client buffer size 3-19
specifying size 5-21, 6-23 multiplexed option 3-25
storage characteristics 6-17 sqlhosts information
SMI tables connection type 3-13, 3-14, 3-16
monitoring latches 4-40 connections 4-19
monitoring sessions 13-54 number of connections 5-36
monitoring virtual processors 3-23 SQLWARN array 5-29
Snowflake schema 13-21 Stack
Sort memory 7-19 specifying size 4-20
Sorting STACKSIZE configuration parameter 4-20
avoiding with temporary table 13-33 STAGEBLOB configuration parameter 5-24
costs 10-24 defined 5-25
DBSPACETEMP configuration parameter 5-8 Staging area
DBSPACETEMP environment variable 5-8 optimal size for blobspace 5-25
effect of PDQ priority 13-18 standards xxii
effect on performance 13-32 Star join, defined 13-21
estimating temporary space 7-20 Star schema 13-21
memory estimate 7-19 Star-join directives 11-8
PDQ priority for 7-19 Statistics
query-plan cost 10-1 automatically generated 13-11
sort files 5-8 Status tool (onperf) 14-3, 14-11
triggers in a table hierarchy 10-36 STMT_CACHE environment variable 13-42
Space STMT_CACHE_HITS configuration parameter 4-21, 4-27,
reducing on disk 6-46, 6-47 4-28, 4-29, 4-30, 4-32, 4-34, 4-36
SPL 10-33 STMT_CACHE_NOLIMIT configuration parameter 4-21, 4-27
SPL routines STMT_CACHE_NUMPOOL configuration parameter 4-34
automatic reoptimization 10-32 STMT_CACHE_SIZE configuration parameter 4-21, 4-31, 4-33
display query plan 10-31 Storage characteristics
effect Smart large objects
of PDQ 12-5 last-access time 6-17
of PDQ priority 12-10 system default 6-17
optimization level 10-32 Storage spaces
query response time 1-5 for encrypted values 4-40, 10-29
Index X-19
System catalog tables (continued)
   systrigbody 10-34
   systriggers 10-34
   updated by UPDATE STATISTICS 10-20
System resources, measuring utilization 1-7
System-monitoring interface 2-4

T
Table
   adding redundant data 6-45
   assigning to dbspace 6-1
   companion, for long strings 6-43
   configuring I/O for 5-26
   cost of access 13-31
   denormalizing 6-43
   division by bulk 6-44
   estimating
      blobpages in tblspace 6-8
      data page size 6-5
      size with fixed-length rows 6-5
      size with variable-length rows 6-7
   expelling long strings 6-43
   fact 13-21
   frequently updated attributes 6-44
   infrequently accessed attributes 6-44
   isolating high-use 6-2
   locks 8-3
   managing
      extents 6-19
   managing indexes for 7-8
   nonfragmented 6-5
   partitioning, defined 9-1
   placement on disk 6-1
   reducing contention between 6-2
   redundant and derived data 6-45
   remote, used with PDQ 12-6
   rows too wide 6-44
   shorter rows 6-43
   size estimates 6-5
   splitting if too wide 6-44
   temporary 6-4
Table distributions
   automated UPDATE STATISTICS 13-6
Table hierarchy
   SELECT triggers 10-36
Table scan
   defined 10-1
   nested-loop join 10-2
   OPTCOMPIND 3-10
   replaced with composite index 13-20
tables
   defragmenting 6-28
TAPEBLK configuration parameter 5-42
TAPEDEV configuration parameter 5-42
TAPESIZE configuration parameter 5-42
Tblspace
   attached index 9-12
   defined 6-5
   extent size for tblspace tblspace 6-10
   monitoring
      active tblspaces 6-23
   simple large objects 6-9
TBLTBLFIRST configuration parameter 6-10
TBLTBLNEXT configuration parameter 6-10
TCP connections 3-14, 3-20
TCP/IP buffers 3-17
TEMP or TMP user environment variable 5-8
TEMP TABLE clause of the CREATE TABLE statement 5-8, 5-14, 9-13
Temporary dbspace
   creating 7-18
   DBSPACETEMP configuration parameter 5-11
   DBSPACETEMP environment variable 5-12
   for index builds 7-18, 7-20
   onspaces -t 5-10
   optimizing 5-10
   root dbspace 5-8
Temporary sbspace
   configuring 5-14
   onspaces -t 5-14
   optimizing 5-14
   SBSPACETEMP configuration parameter 5-15
Temporary smart large object
   LO_TEMP flag 5-14
Temporary tables
   configuring 5-8
   DBSPACETEMP configuration parameter 5-8, 5-11
   DBSPACETEMP environment variable 5-12
   decision-support queries 9-4
   Decision-support queries
      use of temporary files 9-4
   explicit 9-13
   fragmentation 9-13
   in root dbspace 5-5
   speeding up a query 13-33
   Temporary dbspace
      decision-support queries 9-4
TEMPTAB_NOLOG configuration parameter 5-39
TEXT data type 6-43
   in blobspace 5-15
   in table-size estimate 6-5
   locating 6-9
   memory cache 5-24
   on disk 6-9
   parallel access 5-20
   staging area 5-24
Thrashing, defined 1-10
Thread-safe
   UDRs 13-38
Threads
   DS_MAX_SCANS configuration parameter 12-6
   MAX_PDQPRIORITY 3-11
   monitoring 2-7, 4-4, 13-47, 13-48, 13-49, 13-50, 13-51
   page-cleaner 5-7
   primary 12-1, 12-16
   secondary 12-1, 12-18
   sqlexec 5-39, 12-16
Throughput
   benchmarks 1-4
   capturing data 1-4
   contrasted with response time 1-5
   measure of performance 1-3
   measured by logged COMMIT WORK statements 1-4
Tightly coupled 13-55, 13-57
Time
   getting current in ESQL/C 1-7
   getting user, processor and elapsed 1-6
   getting user, system, and elapsed 1-6
time command 1-6
Timing
   commands 1-6
   functions 1-7
Utilities (continued)
   oncheck (continued)
      -pr option 2-11, 6-40
      -ps option 2-11
      -pS option 2-11, 6-15
      -pt option 2-11, 6-5
      -pT option 2-11, 6-40
      and index sizing 7-5
      introduced 2-11
      monitoring table growth 6-20
   onload and onunload 5-42, 6-1, 6-25, 6-27
   onlog 1-4, 2-12
   onmode
      -F option 4-7
      -p option 3-20
      -P option 3-8
      -W option to change STMT_CACHE_NOLIMIT 4-33
      forced residency 4-17
      shared-memory connections 3-1
   onparams 5-5, 5-7
   onperf
      activity tools 14-12
      data flow 14-1
      defined 14-1
      graph tool 14-5
      metrics 14-13
      query-tree tool 14-11
      replaying metrics 14-3
      requirements 14-3
      saving metrics 14-2
      starting 14-4
      status tool 14-11
      tools 14-3
      user interface 14-5
   onspaces
      -Df BUFFERING tag 5-24
      -Df option 5-21, 6-19
      -S option 6-19
      -t option 5-10, 5-14, 6-4, 7-18
      EXTENT_SIZE flag for sbspaces 5-21
      sbspaces 5-20
   onstat utility
      -- option 2-6
      -a option 2-6
      -b option 2-6, 4-4, 6-5, 6-8
      -d option 3-22, 6-11, 6-12
      -F option 5-39
      -g act option 13-48, 13-50
      -g afr option 3-19
      -g ath option 3-10, 12-16, 13-48, 13-49, 13-51
      -g cac option 4-29
      -g cac stmt option 4-29
      -g dsc option 4-24
      -g glo option 3-21
      -g ioq option 3-8, 3-22
      -g mem option 2-9, 13-48, 13-52
      -g mgm option 2-9, 12-6, 12-16
      -g ntm option 3-18
      -g ntu option 3-18
      -g option 2-6
      -g osi option 2-9
      -g ppf option 9-28
      -g prc option 10-33
      -g rea option 3-22
      -g scn option 5-27
      -g seg option 2-9, 4-18
      -g ses option 2-9, 2-13, 3-10, 4-4, 12-17, 13-48, 13-51
      -g smb option 6-13
      -g smb s option 6-16
      -g sql option 2-13
      -g ssc option 13-46
      -g stm option 2-9, 4-4, 4-38, 13-48, 13-52
      -g sts option 4-4
      -k option 8-11, 8-13
      -l option 2-6
      -m option 5-31
      -O option 5-24
      -p option 1-4, 2-6, 4-10, 4-39, 8-12, 8-15
      -P option 2-6
      -R option 2-6
      -s option 4-40
      -u option 2-6, 4-4, 8-12, 8-13, 12-16, 13-48
      -x option 2-6
      monitoring buffer pool 4-10
      monitoring threads per session 3-10
   ontape utility 5-42
Utilization
   capturing data 2-3
   CPU 1-9, 3-1, 3-25
   defined 1-7
   disk 1-11
   factors that affect 1-12
   memory 1-10
   service time 1-8

V
VARCHAR data type
   access plan 10-1
   byte locks 8-10
   costs 10-29
   expelling long strings 6-43
   in table-size estimates 6-7
   when to use 6-43
Variable-length rows 6-7
View
   effect of directives 11-6
Virtual memory, size 4-6
Virtual portion 4-2, 4-4, 4-19
Virtual processors
   adding 3-20
   class name 3-4
   CPU 3-20
   monitoring 3-21, 3-22
   multicore processors 3-5
   NETTYPE 3-14
   network, SOC or TLI 3-20
   poll threads for 3-14, 3-20
   processor affinity 3-5
   semaphores required 3-1
   setting number of CPU VPs 3-5
   setting number of NET VPs 3-14
   starting additional 3-20
   user-defined 3-4
Visual disabilities
   reading syntax diagrams B-1
vmstat command 2-3, 4-10
VP_MEMORY_CACHE_KB configuration parameter 3-24
VPCLASS configuration parameter
   process priority aging 3-6
   processor affinity 3-5
   setting number of AIO VPs 3-8

W
WHERE clause 10-22, 13-2, 13-3
Windows
   NETTYPE configuration parameter 3-14
   network protocols 3-13, 3-16
   parameters that affect CPU utilization 3-3
   Performance Logs and Alerts 1-6, 2-3
   TEMP or TMP user environment variable 5-8
Write once read many
   optical subsystem 5-24

X
X display server 14-4