Oracle Database 10G: Managing Oracle On Linux For Dbas: D46590Gc10 Edition 1.0 January 2007 D49271
Oracle Database 10G: Managing Oracle On Linux For Dbas: D46590Gc10 Edition 1.0 January 2007 D49271
Oracle Database 10G: Managing Oracle On Linux For Dbas: D46590Gc10 Edition 1.0 January 2007 D49271
D46590GC10
Edition 1.0
January 2007
D49271
®
Authors Copyright © 2007, Oracle. All rights reserved.
Disclaimer
Tom Best
S. Matt Taylor Jr. This document contains proprietary information and is protected by copyright and
other intellectual property laws. You may copy and print this document solely for your
Technical Contributors own use in an Oracle training course. The document may not be modified or altered in
any way. Except where your use constitutes "fair use" under copyright law, you may
and Reviewers not use, share, download, upload, copy, print, display, perform, reproduce, publish,
license, post, transmit, or distribute this document in whole or in part without the
Maria Billings express authorization of Oracle.
Harald Breederode
The information contained in this document is subject to change without notice. If you
MJ Bryksa
find any problems in the document, please report them in writing to: Oracle University,
Al Flournoy 500 Oracle Parkway, Redwood Shores, California 94065 USA. This document is not
Mark Fuller warranted to be error-free.
Sush Jagannath Restricted Rights Notice
Donna Keesling
Sergio Leunissen If this documentation is delivered to the United States Government or anyone using
the documentation on behalf of the United States Government, the following notice is
Greg Marsden applicable:
Prasanth Narayanan
U.S. GOVERNMENT RIGHTS
Abhishek Singh The U.S. Government’s rights to use, modify, reproduce, release, perform, display, or
James Spiller disclose these training materials are restricted by the terms of the applicable Oracle
Herbert van den Bergh license agreement and/or the applicable U.S. Government contract.
James Womack Trademark Notice
Editors Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle
Corporation and/or its affiliates. Other names may be trademarks of their respective
Aju Kumar owners.
Atanu Raychaudhuri
Graphic Designer
Samir Mozumdar
Publisher
Srividya Rameshkumar
Preface
Profile
Before You Begin This Course
Before you begin this course, you should have working knowledge with
administering an Oracle database.
How This Course Is Organized
Oracle Database 10g: Managing Oracle on Linux for DBAs is an instructor-led
course featuring lectures and hands-on exercises. Online demonstrations and written
practice sessions reinforce the concepts and skills that are introduced.
Preface - 3
Related Publications
Oracle Publications
Title Part Number
Oracle® Database Administrator's Guide
10g Release 2 (10.2) B14231-02
Oracle® Database Installation Guide
10g Release 2 (10.2) for Linux x86 B15660-02
Oracle® Database Release Notes
10g Release 2 (10.2) for Linux x86 B15659-05
Additional Publications
• System release bulletins
• Installation and user’s guides
• read.me files
• International Oracle User’s Group (IOUG) articles
• Oracle Magazine
Preface - 4
Typographic Conventions
What follows are two lists of typographical conventions that are used specifically
within text or within code.
Typographic Conventions Within Text
Convention Object or Term Example
Uppercase Commands, Use the SELECT command to view
functions, information stored in the LAST_NAME
column names, column of the EMPLOYEES table.
table names,
PL/SQL objects,
schemas
Lowercase, Filenames, where: role is the name of the role
italic syntax variables, to be created.
usernames,
passwords
Initial cap Trigger and Assign a When-Validate-Item trigger to
button names the ORD block.
Choose Cancel.
Italic Books, names of For more information on the subject see
courses and Oracle SQL Reference
manuals, and Manual
emphasized
words or phrases Do not save changes to the database.
Quotation marks Lesson module This subject is covered in Lesson 3,
titles referenced “Working with Objects.”
within a course
Preface - 5
Typographic Conventions (continued)
Typographic Conventions Within Code
Convention Object or Term Example
Uppercase Commands, SELECT employee_id
functions FROM employees;
Lowercase, Syntax variables CREATE ROLE role;
italic
Initial cap Forms triggers Form module: ORD
Trigger level: S_ITEM.QUANTITY
item
Trigger name: When-Validate-Item
. . .
Lowercase Column names, . . .
table names, OG_ACTIVATE_LAYER
filenames, (OG_GET_LAYER ('prod_pie_layer'))
PL/SQL objects . . .
SELECT last_name
FROM employees;
Bold Text that must ./runInstaller
be entered by a
user
Preface - 6
Introduction
1: Introduction
2: Preparing Linux for Oracle
3: Installing Oracle on Linux
4: Managing Storage
5: Automatic Storage Management
6: Creating the Database
EL 4
RHEL AS/ES 3
RHEL AS/ES 4
SuSE SLES-9
SuSE SLES-10
Asianux 2.0
Linux Packages
Modules
The modules in a Linux distribution can be either kernel based or loadable. Kernel-based modules
have to be compiled in, whereas loadable modules can be added without recompiling. All the stock
kernels include support for loadable modules.
Packages
The Linux software for most of the supported distributions is assembled into packages. These are
precompiled binaries that can be installed on and removed from your system with a package
manager. With packages, you can update the kernel, patch, and add functionality. Some third-party
vendors distribute binary modules that are loaded into the kernel. These modules may be proprietary
modules where the source code is not available for Linux developers to investigate. If there are such
modules loaded, then Oracle will support that OS fully, except in the event that the problem is caused
by a proprietary module. In this case, support will be deferred to the supplier of that module.
Note: Package management is covered in detail in the lesson titled “Preparing Linux for Oracle.”
Linux Kernel
Linux, like most modern operating systems, has a kernel that is loaded at boot time and stays in the
memory. The kernel in Linux, unlike in many other operating systems, can be customized infinitely.
You can include or exclude modules that make up the kernel, by using the kernel configuration file.
If the kernel still does not behave to your satisfaction, source code is available for you to change it to
your specification. This can lead to a support problem. If you are running a kernel that is different
than the one that was tested by the vendor, then Oracle Support Services (OSS) cannot identify where
the problem is, in your database or the OS.
Oracle Corporation supports only stock kernels. That means only kernels supplied by the certified
distributions are supported. Kernels that are patched by packages by the kernel vendor are also
supported. For more information about kernel support, see note 228374.1, Linux Operating System
Support, on MetaLink.
# uname -r
2.6.9-42.0.0.0.1.ELsmp
<Base>-<Errata>.<Mod><Type>
Integers separated
by periods
None: Uniprocessor
smp: SMP up to 16 GB
hugemem: SMP up to 64 GB
Loadable Kernel
Modules (LKMs)
T
Kernel
# cat /proc/sys/kernel/tainted
0
0 = not tainted
1 = tainted
• Running lsmod to list the status of modules:
# lsmod
Module Size Used by Not tainted
oracleasm 48020 1
loop 15817 4
Supported Hardware
Oracle Corporation supports a given Linux distribution on any platform or drivers that the vendor
supports. For details about hardware configurations that are certified with Enterprise Linux, see
http://linux.oracle.com/hardware.html.
Note
Use the man command to see the manual pages for any command. For example, the following
displays the manual pages for the touch command:
$ man touch
$ whoami
oracle Each of these
$ cd changes to current
$ cd ~ user's home directory
$ pwd
/home/oracle
$ ls -l
total 40
drwxrwxrwx 2 root root 4096 Oct 4 10:07 Desktop
-rwxrwxrwx 1 tbest oracle 57 Dec 7 2005 diary
drwxr-xr-x 3 oracle oinstall 4096 Sep 30 13:48 osw
drwxrwxrwx 2 oracle oinstall 4096 Oct 9 14:08 prog
drwxr-xr-x 3 oracle oinstall 4096 Sep 29 11:46 rda
drwxr-xr-x 3 oracle oinstall 4096 Sep 30 12:07 rda2
drwxr-xr-x 2 oracle oinstall 4096 Oct 11 11:47 stuff
But there are also virtual files stored under subdirectories, such as:
• net: Network-related information
• sys: Low-level operating system settings, including kernel parameters
# cd /proc/sys/kernel
# ls -l tainted
-r--r--r-- 1 root root 0 Dec 28 18:14 tainted
# echo 0 >tainted
-bash: tainted: Operation not permitted
1 4
$ command
$ command1
>
|
File $ command2
2 | tee
$ command $ command3
<
|
File
File2 $ command4
3 |
$ command1 $ command2
Syntax:
case value in
pattern1) commands1;;
pattern2) commands2;;
esac
Example: case $1 in
up) sqlplus / as sysdba <<-END1
startup;
END1
;;
down) sqlplus / as sysdba <<-END2
shutdown immediate;
END2
;;
*) echo Unknown;;
esac
Syntax:
while condition
do
commands
done
Example:
ps -ef | grep -i oracle |\
awk '{print $2}' |\
while read PID
do
kill -9 ${PID}
done
Example: 2
for mnt in `df | grep / | awk '{print $6}'`;
do
echo -n $mnt ' '
ls -l $mnt | wc -l
done
Syntax:
for (init; cond; stmt)
do
commands
done
Example:
for ((j=2; j<=8; j+=2))
do
grep my_name file$j
done
# cd /proc/sys
# ls
debug dev fs kernel net proc sunrpc vm
# cd net
# ls
core ethernet ipv4 ipv6 token-ring unix
# cd core
# ls rmem*
rmem_default rmem_max
# cat rmem_max
262144
New value
Or
/etc/sysctl.conf
Persistent: Set
+ during each startup
# sysctl -p
Semaphores
Semaphores are a robust method of controlling access to critical resources. The Oracle instance uses
semaphores primarily to control access to shared memory. Semaphores are allocated based on the
PROCESSES initialization parameter.
Each Oracle instance tries to allocate one semaphore set at startup. This set must allocate at least as
many semaphores as the value of PROCESSES. If it does not, the Oracle instance gets more sets to
satisfy the number of semaphores that it needs. If the instance cannot allocate enough semaphores
(either in one set or in multiple sets), then the instance does not start.
A semop call is a call to a function that actually uses the semaphores (for example, testing, setting,
and clearing).
Note: For details about setting semaphore parameters for your Oracle instance, refer to the Oracle
Database installation guide for your particular platform.
kernel.sem
semmni
fs.file-max 65536
Minimum Maximum
net.core.rmem_default 1048576
net.core.rmem_max 1048576
net.core.wmem_default 262144
net.core.wmem_max 262144
Processes Files
Shell Limits
Oracle-Related Limits
Two limits must be set in order for an Oracle database to function properly. These apply to the
oracle Linux user. There is little to be gained by not setting each of these to a conservatively high
value.
• nofile: The maximum number of files the user can have open at one time. The oracle user
opens initialization files, data files, redo log files, and so on, so this should be set high enough to
have those files open simultaneously. This is set very close to the maximum number of database
files in a single database.
• nproc: The maximum number of processes a given user can run at once. The oracle Linux
user owns and starts all the background processes, server processes, and possibly the parallel
query and dispatcher processes. This number must be set high enough to accommodate that. The
typical increase in process count comes from the number of sessions in the database. So you just
need to set this parameter high enough to manage the highest number of sessions, plus some for
other processes.
• To install a package:
# rpm –ivh <package name>
• To remove a package:
# rpm –e <package name>
Managing Packages
The recommended method for installing operating system patches and additional functionality is to
apply packages from the distribution vendor. Apply packages with rpm, a command-line utility with
several options for querying, verifying, installing, upgrading, and removing packages.
• -i: Installs the named package. The optional –h argument shows hash marks on the screen to
indicate progress. The optional –v argument generates verbose output. For example:
#rpm -ivh my_new_package-1.0-14.i386.rpm
Preparing... ####################################### [100%]
1:my_new_package ####################################### [100%]
• -e: Erases/removes the selected package. Optional arguments –h and –v can also be used. For
example:
# rpm -ev IBMJava2-JRE
• -U: Upgrades the current package. It removes the current package and replaces it with a
different version.
Note: The replacement happens without removing any existing configuration files. This is used
when upgrading kernel packages because you cannot remove the existing kernel before
installing a new one.
rpm -Uh kernel-2.4.9-e.27.i686.rpm
Preparing... ####################################### [100%]
1:kernel-2.4.9-e27 ####################################### [100%]
You may have to query all of the packages and grep for
part of the name if you are unsure of the exact name:
# rpm -q compat-libstdc++
package compat-libstdc++ is not installed
# rpm -qa | grep compat-libstdc++
compat-libstdc++-296-2.96-132.7.2
compat-libstdc++-33-3.2.3-47.3
# rpm -q compat-libstdc++-296
compat-libstdc++-296-2.96-132.7.2
Required Software
The Oracle Database Installation Guide lists the required packages and version numbers for your
Linux distribution. You can determine whether your Linux installation meets those package
requirements by using the rpm command. The –q option queries the repository of installed packages
looking for the exact package name you supply. You must supply the package name exactly as it
appears in the repository; if it is wrong, it is reported as not installed.
The first example in the slide shows a query of the binutils package. The result shows that it is
indeed installed, and the version number is 2.15.92.0.2-15.
The second example shows a query for the compat-libstdc++ package. Initially, you see that it
is not installed. Then, when you query all packages, and then grep for the package name or portion
thereof, you can see that there must be more to the package name than compat-libstdc++. If
you query for compat-libstdc++-296, you see that it is installed as that package name.
Linux Patches
Linux distribution vendors do not supply megapatches like other UNIX vendors, but supply patches
for components. The latest version of all the patches are tested to work together. Each distribution
vendor has a subscription service and a method for distributing patches. Enterprise Linux is
supported by Unbreakable Linux Network. The latest patches should be applied; especially the kernel
patches and the security patches.
Note: Oracle Enterprise Manager Database Control and Grid Control provide an interface for finding
and applying patches for the OS and for the database. For details, see Oracle Database 2 Day DBA
10g Release 2.
Creating Groups
The Oracle Database installation guide names three group identifiers: the Oracle Inventory group,
OSDBA, and OSOPER. The members of the OSDBA group own the database files and have the
privilege to connect to the database without a password, using AS SYSDBA through OS
authentication. The OSOPER group uses the same mechanism to connect with AS SYSOPER with a
restricted set of privileges. Each database may have its own OSDBA and OSOPER groups. The Oracle
Inventory group has privileges on the software. A DBA that is a member of both OSDBA and the
Oracle Inventory group has privileges in specific database instances and the ability to access and
upgrade the software through the Oracle Inventory group.
An OSDBA group must be created to manage the database files. By default this group is dba, but this
name is arbitrary. If only one group of DBAs administers the databases on this server, then this group
may be the same as the Oracle Inventory group for all the databases on this server.
1
# useradd oracle
2
# usermod -g oinstall -G dba,oper,oracle oracle
3
# passwd oracle
Changing password for user oracle.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
#
# id nobody
uid=99(nobody) gid=99(nobody) groups=99(nobody)
Owner
Directory Group
indicator
World
$ ls -l
total 3
drwxr-xr-x 2 oracle oinstall 4096 Sep 17 13:02 backup
-rwxr-xr-x 1 oracle dba 87 Dec 22 2006 lab_01.sql
-rwx------ 1 root root 24 Feb 15 2006 postinst.sh
Owner Group
oradism
SETUID SETGID
tbest xyz.conf
SUID Bits
When the letter “s” appears in place of the “x” or dash in the permissions bit string, it has a special
meaning. It indicates that as this program is invoked, it will be run using the permissions of the user
or group that owns it. Regarding the example of the oradism executable file shown in the slide,
oinstall is the group and root is the owner. An “s” appears in the executable bit position, for
the user and the group. That is, anyone can execute oradism, and when they do, every action the
oradism program attempts will be evaluated, permissions-wise, based on the fact that the
requesting user is root and the requesting group is oinstall.
For example, suppose that the tbest user is in the tbest and users groups only. There is a
/etc/xyz.conf file which is owned by root, and its group is root. The tbest user cannot
access that file at all. But if the oradism program accesses that file, anyone who invokes that
program will succeed in accessing that file. This is because the decision to allow access to the
/etc/xyz.conf file is made based on the fact that the requestor is the root user.
Choosing a Shell
Linux has many popular features. One of the most popular features is that the user interface can be
easily customized. There are at least four different shells in the default EL installation. The Bourne
shell, one of the first shells that was available on AT&T UNIX, is the standard shell that the Oracle
database uses for scripting. The C shell was the BSD UNIX shell. The C shell has many user-friendly
features, such as command history, that are not available in the Bourne shell. But the C shell was
interpreted, and not compiled like the Bourne shell, and is consequently slower. The Korn shell is a
superset of the Bourne shell with command history, and is compiled. The Bourne-again shell, bash,
was written to be Bourne shell compatible, with features of both the Korn and C shells. It is intended
to be POSIX compliant. Some Oracle installation actions require a Korn shell such as the Oracle
Internet Directory configuration script, oidca.
Most Oracle shell scripts specify the shell that is required in the first line of the script. A first line of
#!/bin/sh indicates that the script uses the Bourne shell. On Linux, /bin/sh is usually a link to
the Bourne-again shell (bash).
To view the shell that you are currently in, use the echo $0 command.
• C shell syntax:
setenv ORACLE_SID orcl
For the oracle user, you must set the following shell
variables before the installer is run:
• DISPLAY (required)
– Location of the X Window System server
• PATH (required)
– Location of the OS linker and compiler
– Location of the Oracle database executables
OFA Characteristics
The OFA characteristics translate into specific recommendations for the management of database
files, and the structure of the host file systems and directory structure. They are listed here:
• File system organization: The file system is organized to enable easy administration and to
facilitate adding data into existing databases, adding users, creating databases, and adding
hardware.
• Distributed I/O loads: I/O loads are distributed across enough disk drives to prevent performance
bottlenecks.
• Hardware support: In most cases, you do not require new hardware to implement the Optimal
Flexible Architecture standard.
• Safeguards Against Drive Failures: By distributing applications across more than one drive,
drive failures affect as few applications as possible.
• Distribution of Oracle home directories: The collection of home directories, and further, the
contents of each individual home directory can be distributed across disk drives.
• Integrity of login home directories: You can add, move, or delete login home directories without
having to revise programs that refer to them.
Mount Points
A database that is fully OFA compliant has at least two mount points: one for the software and at
least one for database files. The Oracle Installer requests two directories. The installer does not know
if you are supplying mount points or directories. The naming of the database software directory was
performed when you created the ORACLE_HOME environment variable.
Although there are many correct names that you can use, it is usually a good idea to avoid names that
map to hardware. For example, if you choose a name that corresponds to a particular disk controller,
then a later upgrade may result in a name that is deceptive. It is much easier to separately track
hardware to logical name mappings.
You should not use names that suggest that a mount point is devoted to one application’s data unless
it is true. For example, you do not want to neglect your Oracle database files just because they were
stored under a mount point called /backup01.
In the original OFA, the concept of a mount point almost always referred to a single physical disk,
that is, each disk had one mount point. In today’s environment, disks can be very large, and they are
partitioned to useful sizes. Each partition requires a mount point.
Note: The usage of Automatic Storage Management (ASM) eliminates many of the concerns about
mount points. ASM is covered in detail in the lesson titled “Automatic Storage Management and
Automatic Storage Management Library Driver.”
ORACLE_BASE /u01/app/oracle
<version>
ORACLE_HOME /u01/app/oracle/product/10.2.0/db_1
PATH=/usr/bin:/bin:/usr/bin/X11/:/usr/local/bin
ORACLE_SID=orcl
ORAENV_ASK=NO
. oraenv
ORAENV_ASK=
$ cd
$ pwd
/home/oracle
$ /media/cdrom/database/runInstaller
The log files for the Oracle Universal Installer are in this
directory: $ORACLE_BASE/oraInventory/logs
$ cd $ORACLE_BASE/oraInventory/logs
$ ls *.log
installActions2005-07-07_01-42-01AM.log
INFO: Configuration assistant "Oracle Database
Configuration Assistant" succeeded
INFO: Command =
/u01/app/oracle/product/10.2.0/db_1/bin/isqlplusctl
start
iSQL*Plus 10.2.0.1.0
Copyright (c) 2003, 2005, Oracle. All rights reserved.
$ opatch apply
...
OPatch succeeded.
Troubleshooting
There are some common problems with the installation. Many are the result of missing a step in the
preinstallation setup.
Kernel parameters errors, often the result of typographic errors, can cause ORA-3113 End of
File on communication channel, or an OS error indicating that there is not enough shared
memory or semaphores. These usually show up when the Database Creation Assistant tries to start an
instance.
Environment variables can lead to a variety of errors again due to incorrect typing. The directory to
be used as the ORACLE_BASE must exist and be owned by the user that is running the installer. The
C compiler and supporting utilities must be installed and the directory must be included in the PATH
environment variable.
The OUI is very convenient to use. The error pop-up windows normally allow you to go to another
window, fix the problem, and click Retry. The only errors that cannot be fixed in this manner are the
environment variables. If you discover that the problem is with environment variables, then you must
exit the OUI, change the variable, and restart the installer. The installer inherits the environment
when it starts, and the installer environment cannot be changed while the installer is running.
Disk Partitioning
Disk partitioning is not required by Linux or by Oracle software. However, disk partitioning
can reduce the time consumed in operations such as planning for recovery, upgrades, and
backups.
The larger the partition the longer a file system check (fsck) operation takes on bootup.
Much of this time can be avoided by using a journaled file system, such as ext3. Having a
small /boot partition that holds only the current kernels separates the tasks of booting the
machine and recovering a partition. A typical size of this partition is from 10 MB to 100 MB.
For Enterprise Linux, it is recommended that you have approximately 100 MB for the boot
partition.
The / partition should hold only a few directories that are created with the distribution
installation. These directories are overwritten with a new installation or upgrade. (Keep track
of any customized configuration files that you place under this partition, because they will also
be overwritten during a new installation or upgrade.)
The /var/log and /tmp should be separate partitions to isolate the effects of runaway
processes. The /tmp partition should be at least 400 MB.
• Instance
• Disks
• Disk groups
• Files
# /etc/init.d/oracleasm configure
# /etc/init.d/oracleasm enable
Configuring Disks
Oracle recommends that you create a single whole-disk partition on each disk that you want to
use. Use either fdisk or parted to create a single whole-disk partition on the disk devices that
you want to use.
/sbin/fdisk /dev/sdc
The kernel may be unaware of the partition changes after partitioning the disks. The partprobe
command searches for partitions information and informs the kernel of partition table changes.
/sbin/partprobe
# /etc/init.d/oracleasm scandisks
$ su –
# /u01/app/oracle/product/10.2.0/db_1/bin/localconfig add
$ dbca
Creating a Database
You can use DBCA to create a database, either in silent mode or via a graphical interface. Using the
graphical interface, you answer about 12 questions, and then the creation process starts. Some of the
questions are regarding the name of the database, passwords for users, and the storage mechanism to
use.
$ cd $ORACLE_BASE/admin
$ ls
+ASM orcl
orcl:
total 24
drwxr-x--- 2 oracle oinstall 4096 Sep 26 06:05 adump
drwxr-x--- 2 oracle oinstall 4096 Sep 26 05:39 bdump
drwxr-x--- 2 oracle oinstall 4096 Sep 26 05:16 cdump
drwxr-x--- 2 oracle oinstall 4096 Sep 26 05:16 dpdump
drwxr-x--- 2 oracle oinstall 4096 Sep 26 05:26 pfile
drwxr-x--- 2 oracle oinstall 4096 Sep 26 05:39 udump
The portlist.ini file contains all of the port number information for
Oracle-related applications, including:
• iSQL*Plus
• Enterprise Manager Console
$ cd $ORACLE_HOME/install
$ cat portlist.ini
iSQL*Plus HTTP port number =5560
Enterprise Manager Console HTTP Port (orcl) = 1158
Enterprise Manager Agent Port (orcl) = 3938
$ cd $ORACLE_HOME/install
$ cat readme.txt
The following J2EE Applications have been deployed and
are accessible at the URLs listed below.
iSQL*Plus URL:
http://EDRSR9P1:5560/isqlplus
Miscellaneous Files
The /etc/oraInst.loc file contains a pointer to the Oracle software inventory directory. It also
specifies the Linux group name that owns the Oracle software. This directory gets created
automatically when using the Oracle Universal Installer (OUI), but it must be created and populated
manually if you are performing a silent installation, and it does not already exist. If Oracle products
have already been installed on the system, then it exists already.
The oraenv script, located in the $ORACLE_HOME/bin directory, sets up your Oracle
environment for you based on the value of ORACLE_SID. This is useful to run if you need to switch
from one database to another on the same system.
The dbhome script reports the current Oracle home directory.
Background Processes
There are several processes that make up the executable part of an Oracle instance. You can list these
in Linux by issuing the ps command. Just listing all processes will be too much information, if you
want to see only the processes that belong to a single instance. So, you can use grep to filter the
information out. All the database instance process names start with the string ora_ and end with the
string _<SID>. So, if you are interested in the orcl instances processed, grepping as shown in the
slide shows you that.
The ASM instance process names all start with _asm and end with the string _<SID>. So, the
following command shows you the processes for the +ASM instance:
$ ps -eo pid -o command | grep _+ASM | grep asm_
6985 asm_pmon_+ASM
6987 asm_psp0_+ASM
6989 asm_mman_+ASM
6991 asm_dbw0_+ASM
6993 asm_lgwr_+ASM
.
.
.
SPID PROGRAM
------ ----------------------
3908 oracle@EDRSR9P1 (LGWR)
3906 oracle@EDRSR9P1 (DBW0)
$ sqlplus hr/hr@orcl
SQL>
$ sqlplus hr/hr
SQL>
Server Processes
In an instance that is running in dedicated server mode, there is a one-to-one correspondence of
session to server process. A server process is the process that gets started that is dedicated to
accessing the data files on behalf of a database session, with the purpose of executing commands and
returning results.
On Linux, all the server processes are named with this pattern: “oracle<SID>”. So, in this case,
where the instance name is orcl, all the server process names are oracleorcl. You can see
server processes appear and disappear from the Linux process list, as sessions are created and
disconnected.
In the example in the slide, there are two sessions. One was started with a service name specifier,
which resulted in process 4422 being spawned. Note that the argument says LOCAL=NO. That is
because it came through the listener. Then another SQL*Plus session was established, this time
without using a service name. The server process for that session is 5112, and the argument indicates
that LOCAL=YES.
Note: For shared server mode, there are other processes involved, called dispatchers, and there is no
longer a one-to-one correspondence of session to server process. For more information about shared
server mode, refer to the Oracle Database Administrator's Guide.
Process Hierarchy
When one process starts another process, the first process is referred to as the parent, and the
spawned process is referred to as the child process. This can be observed in Oracle Database
processes if you list the parent process IDs in Linux.
The example in the slide shows the session details for a particular session that was created using
SQL*Plus. So, the database records the client process ID as 18556, which is the SQL*Plus process.
This process started the dedicated server process, identified by a process ID of 18557. So, in the ps
listing, the SQL*Plus process is the parent of the server process. You can continue to follow the
process hierarchy until you get to the init process. The SQL*Plus program was invoked from inside
the bash shell (running in the Gnome windowing system), which was invoked by the init process,
which started the operating system at boot time.
Kernel
BIOS
init /etc/rc.d/rc.sysinit
rc1.d
• Runlevels 1 through 5 are
available on boot. rc2.d
• Runlevel 0 is the system
halt state, entered from rc3.d
runlevels 1–5 before
shutdown. rc4.d
• Runlevel 6 initiates a
system restart. rc5.d
rc6.d
# init 3
Linux Runlevels
Linux uses runlevels to determine which services to start or stop. For example, at certain
runlevels the graphic user interface might not be started. The meaning and services started at the
different runlevels is distribution dependent, but for Enterprise Linux it is as follows:
• Runlevel 0: Halt. It is used to shut down the system.
• Runlevel 1: Single-user (maintenance) mode. Only root may log in.
• Runlevel 2: Multiuser mode, text-based console only
• Runlevel 3: Multiuser mode with networking
• Runlevel 4: For custom use
• Runlevel 5: The default mode; multiuser with networking and active X session
• Runlevel 6: Reboot. This shuts everything down and then starts it back.
Booting to a Nondefault Runlevel
The GRand Unified Bootloader (GRUB) provides a way to specify the runlevel on startup from
the OS selection menu. For Enterprise Linux, you press [E] then scroll down to select the kernel,
press [E] again, press the spacebar, enter the runlevel you want to boot to, then press [B].
Your current and previous runlevel can be found with the runlevel command.
Administrative Scripts
Oracle provides the dbstart and dbshut scripts that must be modified for your site. These
scripts read the oratab file, determine which databases participate in automated startup or
shutdown, and start up or shut down these databases.
The oratab file is created by root.sh and updated by the Database Configuration Assistant
(DBCA) when creating a database.
A colon, “:”, is used as the field terminator. A new line terminates the entry. Lines beginning
with a pound sign, “#”, are comments.
Entries are of the form:
$ORACLE_SID:$ORACLE_HOME:<N|Y>
The first and second fields are the system identifier and home directory of the database,
respectively. The third field indicates to the dbstart utility whether the database should be
brought up at system boot time, indicated by Y or N, meaning yes or no, respectively.
Example of the entries in /etc/oratab :
+ASM:/u01/app/oracle/product/10.2.0/db_1:Y
orcl:/u01/app/oracle/product/10.2.0/db_1:Y
ASM and database instances created without using DBCA will not appear in the oratab file.
Entries will need to be made manually in the file for those instances.
rc6.d
Automating Jobs
Much of your work as a DBA must be done outside normal working hours, or at least during off-
peak hours. In many companies, you are asked to support more databases and more hours of the
day or night. Automating routine jobs is a way of reducing the work load. These routine jobs
include:
• Doing backups and maintaining backup scripts
• Starting up and shutting down the database
• Gathering statistics on the database and OS
• Monitoring the health of the database and related processes
Ensure that the database starts up when the machine boots, and more importantly make sure that
the database does a proper shutdown before the machine is shut down. Monitor the database and
related processes so that you are aware of a problem before the users start contacting you.
OS Scheduling Tools
Linux offers several scheduling tools that usually are a variation of the standard cron and at
tools, or a GUI for one of these tools. Task Scheduler is one of these user interfaces that makes
cron and at easier to use.
cron
The system cron daemon is started as part of the boot process. Security for cron is handled by
the system administrator by using the /etc/cron.allow and /etc/cron.deny files.
Only users listed in the /etc/cron.allow file are allowed to use cron, users listed in the
/etc/cron.deny file cannot use cron. If neither of these files exist, only the superuser is
allowed to use cron. The cron daemon checks the system-level schedule in the
/etc/crontab file and user-level schedules in the /var/spool/cron directory. In
Enterprise Linux, there is also a set of directories /etc/cron.hourly,
/etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly that hold
executable scripts. If the system administrator puts an executable script into one of these
directories, then cron runs the script at the appropriate interval.
Database Backup
Database Backups
Backups are required by most businesses. The method and the frequency are often dictated by
business requirements. The method should be chosen based on availability requirements,
convenience, performance, and mean time to recover (MTTR).
User-Managed Backups with OS Tools
Backups can be created by using standard OS tools, such as tar, cpio, and cp. It is the user’s
responsibility to understand the limitations of these tools.
User-Managed Backups with Third-Party Tools
There are several third-party backup solutions. The most important consideration is whether the
tool you choose coordinates with the Oracle database. The database must be shut down, or the
tablespaces put into backup mode before the back up takes place. Many vendors provide an
Oracle database agent module to provide this functionality.
Server-Managed Backups with RMAN
Oracle Corporation provides an integrated tool called Recovery Manager (RMAN) to perform
backups or use the DB Console interface to RMAN. This tool works with the database instance
to make reliable backups.
Note: Use the full pathname to invoke RMAN: $ORACLE_HOME/bin/rman. By doing this,
you avoid any path problems. There is a different Linux utility named rman.
Oracle Database 10g: Managing Oracle on Linux for DBAs 7 - 17
Backing Up with RMAN
Swap Space
Often there are demands for more memory than what is actually installed on the machine. To
mitigate the effects of this limitation, swap space is set up. Swap space is an area of disk that is set
aside to hold some of the contents of physical memory, or RAM, at certain times. As Linux switches
from running one process to another, it may need to allocate some memory in RAM to support the
new process. If there is not enough RAM available, instead of failing the process, Linux copies the
contents of some of the RAM to the swap space. This process of copying memory contents from
RAM to disk and back again is why this disk area is referred to as swap space; Linux continually
swaps the contents of physical RAM and disk, as needed. The result is that there is effectively more
memory available for processes to use.
Swap
RAM Swap space
cache
Swap Cache
A closer look at the RAM/swap space interaction reveals the swap cache mechanism. The swap
cache is an area of RAM that is set aside to track the pages that are swapped out to disk, and, whether
they have been written to since that time. The purpose of this is to short-circuit any unnecessary I/O.
Consider the page out operation, where a page is swapped out to disk, and then later, is needed back
in memory. If, while it was swapped out onto disk, the in-memory page was not reused for anything
else, then there is no need to read the page back in from disk; it is already in memory. Likewise, if a
page is swapped in, and then swapped out, the swap out operation may not require that the page be
written to disk again. If the page had not been modified at all while it was swapped in, then its
contents still match what is already stored on disk, and thus, there is no need to write it to disk again.
The details of the processes that write a page out and read a page in follow.
Page Out Operation
If a page residing in memory is not used, it ages. The older a page, the more likely it is to be
overwritten. The swap looks for the oldest pages in each process every few seconds. If there is a
demand for free pages, swap overwrites the old pages. These pages are moved to the swap cache in
memory if they are dirty, or discarded if they are not dirty. If a page is needed again before being
moved from the cache to disk, then it is reloaded from the cache. Otherwise, it gets flushed from the
cache to the disk after a period of time.
1 GB 1.5 GB
RAM Swap
4 GB 4 GB
RAM
= Swap
7.5 GB
10 GB Swap
Total = 2.5 GB RAM
Total = 8 GB
Total = 17.5 GB
/proc/meminfo
The /proc/meminfo virtual file shows the current state of memory for the Linux kernel. The
following information can be found there:
• MemTotal: The total physical memory in the system. All of your server's physical memory
should be reported here at all times.
• MemFree: Total amount of memory that is currently free
• Cached: Memory in the pagecache (diskcache) minus SwapCached. This does not include
SwapCache.
• SwapCached: Memory that once was swapped out and has been swapped back in, but is still in
the swap file. This is a performance feature because if memory is needed it does not need to be
swapped out again; it is already in the swap file. This saves I/O.
Some terminology:
• Page table: A data structure that maps virtual
addresses to physical addresses
• Translation Lookaside Buffer (TLB): Cache in the CPU
that contains part of the page table, for performance
• Hugetlb: An entry in the TLB that point to a hugepage
• Hugetlbfs: An in-memory file system introduced in the
2.6 Linux kernel
• Page Address Extentions (PAE): A technique for
adding 4 bits onto the memory address value, allowing
a 32-bit machine to address 64 GB instead of 4 GB
2^32 = 4,294,967,296
4 GB
4 MB
4 KB
4 KB 4 MB
4 KB 4 MB
Page table
Page table
Hugepages
The regular page size in Linux is 4 KB. That means each entry in the page table can point to a 4 KB
size page. A smaller page size like this not only limits the amount of memory that can be addressed,
but it also causes more overhead in the management of the page table entries because there are more
entries required.
If you increase the page size to 4 MB, this provides for a lot more addressable memory, without the
extra overhead, because the number of page table entries can remain the same. This is referred to as
hugepages. Also, hugepages are not swapped out, and thus provide for better performance because
they remain in physical memory.
4.0 GB
OS Disk Buffer
Kernel use and Page cache
2.98 GB
1.25 GB
Shared libraries lib*.so
1.0 GB mapped_base
4.0 GB
OS Disk Buffer
Kernel use and Page cache
2.98 GB
336 MB
Shared libraries lib*.so mapped_base
256 MB
Oracle Executables,
Application code
128 MB
user processes
Kernel (not paged)
$ cd $ORACLE_HOME/rdbms/lib
$ cp ksms.s ksms.s_orig /* if ksms.s exists,
back it up first.*/
$ genksms -s 0x15000000 > ksms.s
hugemem Kernel
The hugemem kernel is provided by the kernel-hugemem package. It is a kernel that was
developed to take advantage of memory greater than 16 GB. It can even provide some benefit on
systems with as little as 6 GB of memory. There is no definite RAM size that defines where the
cutoff of benefit is, but if you have processes that can take advantage of up to 4 GB of address space,
then this kernel may provide some advantage. That is because the hugemem kernel not only
provides for addressing up to 64 GB of memory, but it also provides up to 4 GB of memory per
process. So even if you have only 12 GB, but you have a process that requires a large amount of
memory for itself, then the hugemen kernel can be beneficial.
Linux Tools
Graphical Measurement Tools
For Linux systems running the X Window System (X), there are a variety of graphical tools that
can be installed to monitor performance. They have the advantage of being easy to interpret.
X-Operating System View: xosview
The xosview tool shows CPU, disk, memory, and network activity. SuSE and UnitedLinux
install xosview by default. The package for xosview is distributed with Enterprise Linux,
but is not part of the default installation.
Note: Remember that X-based tools tend to influence the output by their own load on the CPU;
because of this, they may not be the most accurate measurements of CPU usage.
Virtual Process File System: /proc
The /proc file system is a virtual file system that provides a look into the background
workings of the operating system. You can think of it as the Linux equivalent to Oracle’s v$
views. The /proc file system exists in many UNIX variants, whereas its contents are greatly
expanded under Linux.
Free Memory: free
The free command displays the total amount of free and used physical and swap memory in
the system, as well as the shared memory and buffers used by the kernel.
Oracle Database 10g: Managing Oracle on Linux for DBAs 9 - 6
Linux Tools (continued)
Process Tree: pstree
This tool displays the relationship between processes in a tree structure, with parents to the left
and children to the right:
$ pstree
init-+-2*[Xvnc]
|-apmd
|-oafd
|-26*[oracle]
|-perl-+-emagent---emagent---4*[emagent]
`-java---java---52*[java]
Because the init process is system process number 1, it will always appear to the left. In this
snippet from the pstree output, you can see that there are two Xvnc processes running, 26
Oracle process running, and that the perl process has started two children, each of which has
started several other processes.
Resource Limits: ulimit
This tool shows the resources available to your shell. This tool is shell dependent; with the C
shell you would use limit –h instead.
/> bash
# ulimit –a (the –a flag shows all current limits)
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) 4
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 7168
virtual memory (kbytes, -v) unlimited
And many more
If it is measurable, chances are that there is a Linux package already written to measure it. Many
more utilities may be available depending on your distribution and which packages you have
installed. Consult your distribution-specific documentation for more information. Also, Oracle
provides Oracle Enterprise Manager DB Control, which provides a great deal of performance-
monitoring information about the systems it is monitoring.
CPU I/O
Memory
Network
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
1973 root 25 0 23480 9872 956 R 59.4 2.5 4:28 0 X
2487 oracle 15 0 29596 24M 1576 S 2.5 6.6 1:41 0 jre
2816 susan 15 0 21768 19M 1036 S 2.3 5.1 0:15 0 jre
2126 oracle 16 0 1080 1048 420 S 2.1 0.2 0:56 0 top
3218 root 16 0 1504 1504 860 R 2.1 0.3 0:02 0 top
3028 oracle 25 10 2820 2636 2052 R N 1.5 0.6 0:06 0 sproingies
2061 oracle 15 0 5852 2168 1100 S 0.9 0.5 0:08 0 rhn-applet-gui
1675 root 15 0 1800 728 540 S 0.7 0.1 0:12 0 snmpd
2602 oracle 15 0 8284 6452 1552 S 0.7 1.6 0:30 0 dbsnmp
1426 root 15 0 200 164 116 R 0.5 0.0 0:08 0 syslogd
2329 root 15 0 25876 9M 824 R 0.5 2.6 6:52 0 X
15 root 15 0 0 0 0 SW 0.3 0.0 0:09 0 kjournald
3008 root 15 0 12612 11M 7864 S 0.3 3.0 0:04 0 rhn-applet-gui
2212 root 15 0 26124 10M 812 R 0.1 2.7 9:01 0 X
2262 rlowenth 15 0 5836 2144 1084 S 0.1 0.5 0:12 0 rhn-applet-gui.
Many people do not realize that top is an interactive utility. From within top, you can enter
“?” to see a list of options. One more useful interactive command is “u” which allows you to
identify a single user for the top display output. For example, if you enter “u” and then type in
“oracle” then your list of processes would include only processes started by oracle. The
top displays’ default refresh rate is every two seconds, but you can change that interval with the
“s” command. When top is run by root, you can kill processes directly from within top by
using the “k” command.
CPU Measurements
Although there are dozens of different CPU measurements that you can look at, the ones listed in
the slide are the most important. These have the same meaning but slightly different labels
depending on the tool that is used to view them.
CPU Idle Time
This indicates the amount of time that CPU is in an idle state. Idle time is labeled %idle in
sar, id (under the CPU heading) in vmstat, and idle in top. If you are experiencing
performance problems, and CPU shows high idle time, then the system is not CPU bound.
CPU Time Spent Executing User and System Codes
These statistics have to be considered together. Together they represent the useful work that is
being done by the CPU. User time indicates the time that is spent in executing code in the user
space. System time measures the time that is spent in executing system calls. Every application
has a characteristic ratio of user to system. With an Oracle database, you should expect a ratio of
approximately 60% user code to 40% system code. If the user code percent is significantly
higher, then there may be inefficiencies in the application code. If the system code percentage is
significantly higher, look for system activity, such as high rates of disk I/O or memory swap.
# vmstat 2 5
procs system cpu
r b w … in cs us sy id
0 0 0 … 11 42 3 1 96
1 0 0 … 106 141 0 1 99
0 0 0 … 109 134 0 0 100
1 0 0 … 103 146 7 1 92
0 0 0 … 107 125 0 0 100
Note: Memory and IO statistics removed for readability
Page out
Page in
#top
#free
#cat /proc/meminfo
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
1017 root 15 0 5944 5544 4360 S 0.7 1.0 2:10 Xvnc
20866 oracle 15 0 86360 67M 54228 S 0.5 13.4 0:10 java
1212 root 16 0 1188 1188 832 R 0.3 0.2 0:00 top
1 root 15 0 508 460 460 S 0.0 0.0 0:04 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 keventd
All the utilities that collect statistics gather information from the /proc file system. The virtual
meminfo “file” gives an instantaneous snapshot of memory usage.
# cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 1054806016 1029685248 25120768 0 136949760 617668608
Swap: 534601728 22032384 512569344
MemTotal: 1030084 kB
MemFree: 24532 kB
MemShared: 0 kB
Buffers: 133740 kB
Cached: 585920 kB
SwapCached: 17272 kB
Active: 674312 kB
ActiveAnon: 214892 kB
ActiveCache: 459420 kB
Inact_dirty: 0 kB
Inact_laundry: 159476 kB
Inact_clean: 14016 kB
Inact_target: 169560 kB
HighTotal: 130524 kB
HighFree: 2032 kB
LowTotal: 899560 kB
LowFree: 22500 kB
SwapTotal: 522072 kB
SwapFree: 500556 kB
HugePages_Total: 0
HugePages_Free: 0
Hugepagesize: 4096 kB
#sar -B 2 3
#sar -R 2 3
sar
The System Activity Reporter (sar) has many options.
The -B option shows pages in and pages out per second (pgpgin/s, pgpgout/s); the
number of active pages (activepg); and the inactive dirty, inactive clean, and inactive target
pages (inadtypg, inaclnpg, inatarpg). The inactive target is the demand rate.
# sar -B 2 3
Linux 2.4.9-e.3 (EDD1R28P1) 02/26/2003
vmstat
The vmstat command has two parameters: frequency and count. Frequency is the number of
seconds between samples, and count is the number of samples to take.
In this example, vmstat samples every two seconds for a total of five samples. The first output
line of vmstat shows all the activity since the last reboot. The following example shows a
loaded system, but with little swapping. Pages in are represented by “si” and pages out are
shown with “so.”
Note: The first line of output includes all activity since the last vmstat report.
# vmstat 2 5
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
1 0 0 296004 3092 73268 61324 0 3 11 51 135 588 10 4 86
1 0 0 296004 3076 73268 61324 0 0 0 2 109 372 66 34 0
1 0 0 296004 3076 73268 61324 0 0 0 32 109 199 59 41 0
1 0 0 296004 3072 73268 61316 0 0 0 24 110 331 40 60 0
1 0 0 296004 3076 73268 61300 0 0 2 10 107 373 51 49 0
Is I/O a bottleneck?
• Take and retain baseline I/O measurements.
• Individual measurements (snapshots) are usually of
little value. I/O monitoring should be done over time.
• I/O issues with the swap partition should be treated as
memory problems first.
Is I/O a Problem?
With databases, disk I/O is a primary concern. Disk I/O is the single largest performance cost.
Most I/O measurements are meaningless without a baseline, so ensure that you capture and
retain that. When you notice performance degradation, look for high queue and wait statistics to
determine whether I/O is the culprit.
Remember that virtual memory is maintained on disk, and will also show up as I/O. If you spot
high I/O statistics on the swap partition, you should troubleshoot it as a memory issue first, then
address it as an I/O issue.
I/O Measurements
I/O measurements usually concentrate on either the volume of data being handled or the speed
with which that data is handled.
Volume
Obviously, the more work that is being done the longer it will take to do that work. When you
measure volume, look at read and write volume separately. Also, if your tools allow you to view
it, monitor sequential operations as compared with random access. It is usually quicker to write
1 MB of sequential data than to write 20 KB in multiple locations on the disk because of the
time required for disk head positioning.
Speed
When monitoring speed, look for transfer rate (bytes per second), wait time (how much time was
spent waiting for I/O to respond), and service time (how long does each I/O operation take from
start to finish). Of these three, the wait time is the most important from a database performance
standpoint.
#iostat -d -p 2 2
Linux 2.6.9-22.EL (EDRSR24P1) 10/27/2006
DB_WRITER_PROCESSES:
• These processes are set to no more than CPU_COUNT,
up to 20.
• Multiple DBWR processes write from LRU to disk.
• These processes can use asynchronous I/O.
• These processes are best used in OLTP environments.
LRU lists
DBWR processes
DBWR_IO_SLAVES:
• Used to simulate asynchronous I/O
• One DBWR, multiple writers to disk
Buffer
cache DBWR
I/O slaves
DB Writer Slaves
On systems without asynchronous I/O capability, the database instance provides a way to simulate
asynchronous I/O with the DBWR_IO_SLAVES.
Set this parameter to no more than two times the number of disks that the database is spread across.
When increasing DBWR_IO_SLAVES, consider the number of disk controllers and monitor the disk
I/O queues for contention. Also, when increasing the number of slaves, add only a few at a time,
because each of these processes has an overhead cost. If you add too many at once, the overhead of
all the processes may outweigh the benefit of the additional processes. Each slave must have a
communication area for I/O buffers. This area is taken from the large pool if it is configured, or from
the shared pool if the large pool does not exist. Turning on DBWR_IO_SLAVES forces
DB_WRITER_PROCESSES to one, and sets the number of slave processes used by the ARCH and
LGWR processes to four. Even though I/O slaves are generally used to simulate asynchronous I/O, the
slaves use asynchronous I/O if it is turned on.
Note: The use of multiple DBWR processes and I/O slaves are mutually exclusive.
Buffer cache
Buffer cache
Example:
Large pool
Large pool
Choosing a Kernel
The Linux installers tend to make good choices about which kernel to install, and install the kernel
that uses your hardware and gives a good performance. If you have to choose which kernel to use,
follow these guidelines.
For machines with more than 4 GB of physical memory, an enterprise kernel is required. The
enterprise kernel includes the Physical Address Extensions (PAE) that are required to address
more than 4 GB of physical memory. This includes a three-level memory page table, so if you have
less than 4 GB of physical RAM, then the three-level page table is an extra overhead that is not
required.
For multiprocessor machines, the installer chooses an smp kernel. The smp kernel has the required
extensions to handle scheduling processes on more than one CPU.
The uniprocessor kernel handles up to 4 GB of physical memory on a single processor. An example
of the uniprocessor name is 2.4.9-e.3. (Note that there is no special designator.)
An enterprise or smp kernel may be used on a uniprocessor machine, but has extra overhead
associated with the unused features.
OS Watcher
OS Watcher (OSW) is a collection of shell scripts intended to collect and archive operating
system and network metrics to aid support of generic performance issues. OSW operates as a set
of background processes on the server and gathers OS data on a regular basis, invoking such
utilities as vmstat, netstat and iostat.
Note: For more details, refer to the OS Watcher user’s guide in MetaLink note 301137.1. The
MetaLink note contains the link for downloading OS Watcher.
Installing OSWatcher
Installation is very simple. Untar the osw.tar file, which creates the directory structure under
the current directory. The parent directory of all the extracted files is called osw. At that point,
the software is installed.
#########################################################
# CONFIGURATION Determine Host Platform
#########################################################
case $PLATFORM in
Linux)
IOSTAT='iostat -x 1 3'
VMSTAT='vmstat 1 3'
TOP='eval top -b -n 1 | head -50'
PSELF='ps -elf'
MPSTAT='mpstat 1 3'
MEMINFO='cat /proc/meminfo'
SLABINFO='cat /proc/slabinfo'
Configuring OSW
OSWatcher can be configured by editing the CONFIGURATION section of the
OSWatcher.sh script. Here, the commands, along with their arguments, are listed. You can
edit these here to cause the behavior and output of the commands to change. The default setting
for the Linux platform is shown in the slide.
For example, the output from iostat with the –x option, as shown above, shows the statistics
of the hard drive but not the partitions. You could change the –x option to –p /dev/hda and
the output would show the individual partitions.
$ ./startOSW.sh 30 24
Testing for discovery of OS Utilities... Snapshot
VMSTAT found on your system. interval
IOSTAT found on your system. in seconds
MPSTAT found on your system.
NETSTAT found on your system.
TOP found on your system. Hours
Discovery completed. to run
Starting OSWatcher V1.3.2 on Sat Sep ...
With SnapshotInterval = 30
With ArchiveInterval = 24
Starting Data Collection...
osw:Sat Sep 30 13:50:30 PDT 2006
osw:Sat Sep 30 13:51:01 PDT 2006 $ ./stopOSW.sh
osw:Sat Sep 30 13:51:31 PDT 2006
Terminated
Running OSW
To run OSW, simply invoke the startOSW.sh script:
startOSW.sh <snapshot_interval> <hours>
where the snapshot_interval is the number of seconds there should be between each data sample,
and hours is the number of hours for which to collect data. Even if the number of hours has not
been reached, you can run the stopOSW.sh script to stop it at any time.
In the example in the slide, data is being sampled every 30 seconds, and the script will run for 24
hours. But, after three samples have been taken (90 seconds have passed) the stopOSW.sh
script is run to stop the data collection.
$ ls
oswiostat oswmpstat oswprvtnet oswslabinfo oswvmstat
oswmeminfo oswnetstat oswps oswtop
$ head -4 oswmeminfo/EDRSR9P1_meminfo_09.30.06.1300.dat
zzz ***Sat Sep 30 13:50:31 PDT 2006
MemTotal: 1035320 kB
MemFree: 451920 kB
Buffers: 1020 kB
Under each of those directories are the files that contain the collected data. The names of those
files follow this pattern for organizational purposes:
<hostname>_<command>_<timestamp>.dat
$ gunzip rda_4.5-060912.tar.gz
$ tar -xf rda_4.5-060912.tar
$ cd rda
$ ls -gG
total 112
-r--r--r-- 1 4087 Jul 6 01:58 DISCLAIM.txt
drwxr-xr-x 2 4096 Sep 30 12:07 hcve
drwxr-xr-x 2 4096 Sep 30 12:07 modules
drwxr-xr-x 4 4096 Sep 30 12:07 RDA
-r-xr-xr-x 1 3415 Apr 25 05:50 rda.cmd
-r-xr-xr-x 1 40066 Sep 8 08:03 rda.pl
-r-xr-xr-x 1 3442 Apr 25 05:50 rda.sh
-r--r--r-- 1 13103 Aug 4 06:41 README_Unix.txt
-r--r--r-- 1 14539 Aug 4 06:41 README_VMS.txt
-r--r--r-- 1 13101 Aug 4 06:41 README_Windows.txt
Prompts with
several questions Gathers
data
$ ./rda.sh –S
.
.
$ ./rda.sh -v
Collecting diagnostic data ...
---------------------------------------------------------
RDA Data Collection Started 22-Sep-2006 11:59:37 AM
---------------------------------------------------------
Processing Initialization module ...
Processing CFG module ...
Processing Sampling module ...
Processing OCM module ...
Processing OS module ...
• Diagnostic utility
• Records system calls
• Trace a specified command
• Trace an existing process
$ strace –aef –Ttt –o /tmp/date.out date
strace
Strace is a utility that intercepts and records the system calls, which are called by a process, and
the signals, which are received by a process. The name of each system call, its arguments and its
return value are printed on standard error or to the file specified with the -o option.
strace can trace running background processes or be used when executing a program.
Using a simple shell script, you can see a sample of the output from strace. Script
helloworld.sh invokes a bash shell, echoes “Hello World!,” and then exits.
helloworld.sh shell script:
#!/bin/bash
echo “Hello World!”
exit 0
The shell script is executed using strace:
strace –o /tmp/helloworld.out ./helloworld.sh
ORA-600
ORA-600 errors are raised from the kernel code of the Oracle Database software when an
internal inconsistency is detected or an unexpected condition is met. This situation is not
necessarily a bug, it might be caused by problems with the operating system, lack of resources,
hardware failures, or other conditions.
With the ORA-600 error comes a list of arguments in square brackets. The first of these
arguments tells us from where in the code the error was caught and thus is the key information in
identifying the problem. This argument is either a number or a character string. The remaining
arguments are used to supply further information such as values of internal variables.
When an ORA-600 error is raised a trace file is generated in either USER_DUMP_DEST or
BACKGROUND_DUMP_DEST depending on whether the error was caught in a user or a
background process. The error is also written in the alert log with the name of the trace file. The
trace file contains vital information about what led to the error condition.
ORA-7445 Error
An ORA-7445 error is raised by an Oracle server process when it has received a fatal signal
from the operating system. The error may be raised in either a foreground or background
process. The process will normally write an error to the alert log, write a trace file in either
USER_DUMP_DEST or BACKGROUND_DUMP_DEST, and create a core dump in
CORE_DUMP_DEST.
There are many “illegal” operations the operating system can trap; a common example is a
process writing to an invalid memory location. To protect the system, the offending process will
be sent a fatal signal. Typically, the signals seen are SIGBUS (signal 10, bus error) and
SIGSEGV (signal 11, segmentation violation).
There are other Linux signals and exceptions that may happen; however, those are likely caused
by an OS program rather than a problem in the Oracle database.
Examples of other signals are SIGINT, SIGKILL, and SIGSYS.
An ORA-7445 is a generic error, and can occur from anywhere in the Oracle code. The precise
location of the error is identified by the trace file it produces.
Copy the call stack trace from the trace file into the service request filed with Oracle Support.
Doing so will assist Support in resolving the issue.