RAC Questions and Answers
What are the advantages of RAC (Real Application Clusters)?
Reliability - if one node fails, the database won't fail.
Availability - nodes can be added or replaced without having to shut
down the database.
Scalability - more nodes can be added to the cluster as the workload
increases.
What is Oracle RAC One Node?
Oracle RAC One Node is a single instance running on one node of the
cluster while the second node is in cold standby mode. If the instance
fails, RAC One Node detects the failure and restarts the instance on
the same node, or relocates it to the second node if the first node
itself has failed. The benefit of this feature is that it provides a
cold failover solution and automates instance relocation, with no
downtime for planned relocations and no manual intervention. Oracle
introduced this feature in 11gR2 (available with Enterprise Edition).
What is Cache Fusion?
Oracle RAC is composed of two or more instances. When a block of data
has been read from a datafile by one instance in the cluster and another
instance needs the same block, it is faster to ship the block image
from the SGA of the instance that holds it than to read it from disk.
This is cache fusion. Oracle RAC uses the private interconnect for this
inter-instance communication. The Global Cache Service (GCS) and the
Global Enqueue Service (GES) coordinate cache fusion.
What command would you use to check the availability of the RAC system?
crs_stat -t -v (the -t and -v flags are optional), up to 11.1
OR
crsctl check cluster -all, from 11.2
How do we verify that RAC instances are running?
SQL> select * from V$ACTIVE_INSTANCES;
The query returns the instance number in the INST_NUMBER column and
hostname:instance_name in the INST_NAME column.
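As a rough illustration of how that output might be consumed, the sketch
below splits each INST_NAME value into its host and instance parts. The
rows are hard-coded sample data, and the function name and the host and
instance names are hypothetical; a real script would fetch the rows
through a database driver.

```python
def parse_active_instances(rows):
    """Split (INST_NUMBER, INST_NAME) rows into host and instance parts.

    INST_NAME is assumed to have the form "hostname:instance_name".
    """
    parsed = []
    for inst_number, inst_name in rows:
        host, separator, instance = inst_name.partition(":")
        if not separator:
            raise ValueError(f"unexpected INST_NAME format: {inst_name!r}")
        parsed.append(
            {"inst_number": int(inst_number), "host": host, "instance": instance}
        )
    return parsed


# Hypothetical sample rows, shaped like the V$ACTIVE_INSTANCES output.
sample_rows = [(1, "node1-pub:orcl1"), (2, "node2-pub:orcl2")]
active = parse_active_instances(sample_rows)
```

If both instances of a two-node cluster are up, the parsed list should
contain one entry per instance.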
A node must be able to access more than half of the voting disks at any
time.
For example, if you have 3 voting disks configured, then a node must be
able to access at least two of the voting disks at any time. If a node
cannot access the minimum required number of voting disks it is
evicted, or removed, from the cluster.
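The majority rule above can be sketched as a small calculation. This is
only an illustration of the arithmetic, not how Clusterware implements
it, and the function names are made up for the example.

```python
def min_voting_disks_required(configured: int) -> int:
    """Smallest number of voting disks that is more than half of the total."""
    if configured < 1:
        raise ValueError("at least one voting disk must be configured")
    return configured // 2 + 1


def has_voting_quorum(accessible: int, configured: int) -> bool:
    """True if a node can access more than half of the voting disks."""
    if not 0 <= accessible <= configured:
        raise ValueError("accessible must be between 0 and configured")
    return accessible >= min_voting_disks_required(configured)


# With 3 voting disks a node needs at least 2 of them.
needed_for_three = min_voting_disks_required(3)
```

A node that sees only 1 of 3 voting disks fails the check and would be
evicted.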
Oracle Cluster Registry (OCR) NISA
The cluster registry holds all information about nodes, instances,
services and ASM storage (if used). It also contains state information,
i.e. whether each resource is available and up.
The OCR must reside on a shared disk that is accessible by all of the
nodes in your cluster.
What are the administrative tasks involved with voting disk?
The following administrative tasks are performed with the voting disk:
1) Backing up voting disks
2) Recovering Voting disks
3) Adding voting disks
4) Deleting voting disks
5) Moving voting disks
Can you add voting disk online? Do you need voting disk backup?
Yes. According to the documentation, if you have multiple voting disks
you can add one online. If you have only one voting disk and it is
lost, the cluster will be down; in that case start CRS in exclusive
mode and add the voting disk using:
crsctl add votedisk <path>
What is the Oracle Recommendation for backing up voting disk?
Oracle recommends using the dd command to back up the voting disk with
a minimum block size of 4 KB.
How do we back up voting disks?
1) Oracle recommends that you back up your voting disk after the
initial cluster creation and after you complete any node addition or
deletion procedures.
2) First, as the root user, stop Oracle Clusterware (with the crsctl
stop crs command) on all nodes. Then determine the current voting disk
by issuing the following command:
crsctl query css votedisk
3) Then issue the dd or ocopy command to back up a voting disk, as
appropriate.
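Give the syntax of backing up voting disks:
On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where voting_disk_name is the name of the active voting disk and
backup_file_name is the name of the file to which the voting disk
contents are backed up.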
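To combine the dd syntax with the 4 KB minimum block size mentioned
earlier, a script might assemble the command as below. This is a sketch
only: the function name and the two paths are hypothetical, and the
command is built but not executed.

```python
import shlex


def build_voting_disk_backup_command(voting_disk, backup_file, block_size="4k"):
    """Return the dd argument list for backing up one voting disk.

    block_size defaults to 4k, the minimum recommended block size.
    """
    if not voting_disk or not backup_file:
        raise ValueError("both voting_disk and backup_file are required")
    return ["dd", f"if={voting_disk}", f"of={backup_file}", f"bs={block_size}"]


# Hypothetical device and backup paths, for illustration only.
command = build_voting_disk_backup_command("/dev/raw/raw1", "/backup/votedisk1.bak")
command_line = shlex.join(command)  # readable form for logging or review
```

Keeping the command as an argument list avoids quoting problems if it
is later passed to subprocess.run.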
Srvctl cannot start the instance and reports errors PRKP-1001 and
CRS-0215, but sqlplus can start it on both nodes. How do you identify
the problem?
Set the environment variable SRVM_TRACE to true and start the instance
with srvctl. You will now get a detailed error stack.
What are Oracle database background processes specific to RAC?
Oracle RAC is composed of two or more database instances. Like a
single-instance database, each is composed of memory structures and
background processes. Oracle RAC instances use two services, GES
(Global Enqueue Service) and GCS (Global Cache Service), that enable
cache fusion. Oracle RAC instances have the following background
processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
To ensure that each Oracle RAC database instance obtains the block that
it needs to satisfy a query or transaction, Oracle RAC instances use
two processes, the Global Cache Service (GCS) and the Global Enqueue
Service (GES). The GCS and GES maintain records of the statuses of each
data file and each cached block using a Global Resource Directory
(GRD). The GRD contents are distributed across all of the active
instances.
What is GRD? Cash for Data
GRD stands for Global Resource Directory. The GES and GCS maintain
records of the status of each datafile and each cached block using the
Global Resource Directory. This process is referred to as cache fusion
and helps maintain data integrity.
What is ACMS?
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC
environment, ACMS is an agent that ensures distributed SGA memory
updates are atomic, i.e. SGA updates are globally committed on success
or globally aborted in the event of a failure.
What is SCAN listener?
A SCAN listener is an additional listener, alongside the node
listeners, that accepts incoming database connection requests that
clients send to the SCAN IP. It has the node listeners configured as
endpoints and routes each connection request to a particular node
listener.
When a node fails, its VIP address fails over to another node, on which
the VIP address can accept TCP connections but cannot accept Oracle
connections.
Why do we have a Virtual IP (VIP) in Oracle RAC?
Without using VIPs or FAN, clients connected to a node that died will
often wait for a TCP timeout period (which can be up to 10 min) before
getting an error. As a result, you don't really have a good HA solution
without using VIPs.
When a node fails, the VIP associated with it is automatically failed
over to another node, and the new node re-ARPs the network, announcing
a new MAC address for the IP. Subsequent packets sent to the VIP go to
the new node, which sends TCP RST packets back to the clients. As a
result, the clients get errors immediately.
Give situations under which VIP address failover happens?
VIP address failover happens when the node on which the VIP address
runs fails, when all interfaces for the VIP address fail, or when all
interfaces for the VIP address are disconnected from the network.
What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to
the VIP address receive a rapid "connection refused" error. They don't
have to wait for TCP connection timeout messages.
What is the use of a service in Oracle RAC environment?
Applications should use the services feature to connect to the Oracle
database. Services enable us to define rules and characteristics to
control how users and applications connect to database instances.
What are the characteristics controlled by Oracle services feature?
The characteristics include a unique name, workload balancing, failover
options, and high availability.
What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application
connections across all of the instances in an Oracle RAC database.
What are the types of connection load-balancing?
Connection workload management is a key concern with RAC, because you
want to distribute connections either to specific nodes/instances or to
those that have less load.
There are two types of connection load balancing:
1. Client-side load balancing (also called connect-time load
balancing)
2. Server-side load balancing (also called listener connection load
balancing)
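The difference between the two types can be shown with a toy model. It
assumes that client-side balancing picks a listener address at random
from the configured list, and that server-side balancing sends the
connection to the instance currently reporting the least load. The
function names, addresses and load figures are hypothetical; this is
not Oracle Net code.

```python
import random


def client_side_pick(addresses, rng=None):
    """Client side: choose one listener address at random from the list."""
    chooser = rng or random
    return chooser.choice(list(addresses))


def server_side_pick(load_by_instance):
    """Server side: route to the instance reporting the lowest load."""
    if not load_by_instance:
        raise ValueError("no instances registered")
    return min(load_by_instance, key=load_by_instance.get)


# Hypothetical listener addresses and per-instance load figures.
listeners = ["node1-vip:1521", "node2-vip:1521"]
chosen_listener = client_side_pick(listeners, random.Random(0))
least_loaded = server_side_pick({"orcl1": 40, "orcl2": 12})
```

The client-side choice ignores load entirely, which is why the two
mechanisms are usually used together.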
What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using the
tools below:
SQL*PLUS,
DBCA,
NETCA
Name some Oracle Clusterware tools and their uses?
OIFCFG - Allocates and deallocates network interfaces.
OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry.
OCRDUMP - Dumps the OCR contents, for example to identify the
interconnect being used.
CVU - Cluster Verification Utility, used to check the status of CRS
resources.
What is the difference between CRSCTL and SRVCTL?
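srvctl manages Oracle resource-related operations, such as starting
and stopping database instances and services.
crsctl manages clusterware-related operations:
Starting and stopping Oracle Clusterware
Enabling and disabling Oracle Clusterware daemons
Registering cluster resources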
Event Services
High Availability
Network Management (provides DNS/GNS/MDNSD services on behalf of
other traditional services), SCAN (Single Client Access Name) and
HAIP
Storage Management (with the help of ASM and the new ACFS filesystem)
Time synchronization (rather than depending on traditional NTP)
Removal of the OS-dependent hang checker, replaced by its own
additional monitor process
What is hangcheck timer?
The hangcheck timer regularly checks the health of the system. If the
system hangs or stops, the node is restarted automatically.
There are 2 key parameters for this module:
-> hangcheck_tick: defines the period of time between checks of system
health. The default value is 60 seconds; Oracle recommends setting it
to 30 seconds.
-> hangcheck_margin: defines the maximum hang delay that should be
tolerated before hangcheck-timer resets the RAC node.
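How the two parameters interact can be sketched as follows. The sketch
assumes the commonly described rule that the node is reset when the gap
between two checks exceeds hangcheck_tick plus hangcheck_margin. The
function name and the margin value used in the example are assumptions
made for illustration, not values taken from the text above.

```python
def should_reset_node(seconds_since_last_check, hangcheck_tick, hangcheck_margin):
    """Return True if the observed gap exceeds the tolerated hang delay.

    Assumed rule: reset when the gap is greater than
    hangcheck_tick + hangcheck_margin (all values in seconds).
    """
    if min(seconds_since_last_check, hangcheck_tick, hangcheck_margin) < 0:
        raise ValueError("durations must not be negative")
    return seconds_since_last_check > hangcheck_tick + hangcheck_margin


# tick=30 is the recommended value above; margin=180 is an assumed example.
healthy = should_reset_node(35, hangcheck_tick=30, hangcheck_margin=180)
hung = should_reset_node(240, hangcheck_tick=30, hangcheck_margin=180)
```

A lower tick means the module notices a hang sooner, while the margin
sets how much delay is forgiven before the reset.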
State the initialization parameters that must have the same value for
every instance in an Oracle RAC database?
Some initialization parameters are critical at database creation time
and must have the same values. Their value must be specified in the
SPFILE or PFILE for every instance. The parameters that must be
identical on every instance are given below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT
The voting disk files are used by Oracle Clusterware to determine which
nodes are currently members of the cluster. The voting disk files are
also used in concert with other cluster components such as CRS to
maintain the cluster's integrity.
Oracle Database 11g Release 2 provides the ability to store the voting
disks in ASM along with the OCR. Oracle Clusterware can access the OCR
and the voting disks present in ASM even if the ASM instance is down.
As a result CSS can continue to maintain the Oracle cluster even if the
ASM instance has failed.
How many voting disks are you maintaining?
By default Oracle will create 3 voting disk files in ASM.
Oracle expects that you will configure at least 3 voting disks for
redundancy purposes. You should always configure an odd number of
voting disks >= 3. This is because loss of more than half your voting
disks will cause the entire cluster to fail.
You should plan on allocating 280MB for each voting disk file. For
example, if you are using ASM and external redundancy then you will
need to allocate 280MB of disk for the voting disk. If you are using
ASM and normal redundancy you will need 560MB.
Why do we need to keep an odd number of voting disks?
Oracle expects that you will configure at least 3 voting disks for
redundancy purposes. You should always configure an odd number of
voting disks >= 3. This is because loss of more than half your voting
disks will cause the entire cluster to fail.
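A short calculation shows why an even count buys nothing. It assumes
the cluster needs a strict majority of voting disks to survive; the
function name is made up for the example.

```python
def tolerated_disk_failures(configured: int) -> int:
    """Number of voting disks that can be lost while a majority remains."""
    if configured < 1:
        raise ValueError("at least one voting disk must be configured")
    # A strict majority must survive, so at most (n - 1) // 2 may fail.
    return (configured - 1) // 2


# Failure tolerance for 1 to 5 voting disks.
tolerance_table = {n: tolerated_disk_failures(n) for n in range(1, 6)}
```

Under this rule 3 and 4 voting disks both tolerate a single failure, so
the fourth disk adds cost without adding resilience; the next real gain
comes at 5.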
SCAN must resolve to at least one address on the public network. For
high availability and scalability, Oracle recommends configuring the
SCAN to resolve to three addresses.
What are SCAN components in a cluster?
1.SCAN Name
2.SCAN IPs (3)
3.SCAN Listeners (3)
What is FAN?
Fast Application Notification (FAN) relates to events concerning
instances, services and nodes. It is a notification mechanism that
Oracle RAC uses to notify other processes about configuration and
service-level information, including service status changes such as UP
or DOWN events. Applications can respond to FAN events and take
immediate action.
What is TAF?
TAF (Transparent Application Failover) is a configuration that allows
session failover between different nodes of a RAC database cluster.
If a communication link failure occurs after a connection is
established, the connection fails over to another active node. Any
disrupted transactions are rolled back, and session properties and
server-side program variables are lost. In some cases, if the statement
executing at the time of the failover is a SELECT statement, that
statement may be automatically re-executed on the new connection with
the cursor positioned on the row on which it was positioned prior to
the failover.
After an Oracle RAC node crashes, usually from a hardware failure, all
new application transactions are automatically rerouted to a specified
backup node. The challenge in rerouting is to not lose transactions
that were "in flight" at the exact moment of the crash. One of the
requirements of continuous availability is the ability to restart
in-flight application transactions, allowing a failed node to resume
processing on another server without interruption. Oracle's answer to
application failover is an Oracle Net mechanism dubbed Transparent
Application Failover. TAF allows the DBA to configure the type and
method of failover for each Oracle Net client.
The TAF architecture offers the ability to restart transactions at
either the transaction (SELECT) or session level.
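The type and method of failover are set in the FAILOVER_MODE clause of
the client's connect descriptor. The sketch below only assembles that
clause as a string, to show where the two settings go. The function
name, the service name and the RETRIES and DELAY values are
hypothetical.

```python
VALID_TYPES = {"SESSION", "SELECT", "NONE"}
VALID_METHODS = {"BASIC", "PRECONNECT"}


def taf_connect_data(service_name, failover_type="SELECT", method="BASIC",
                     retries=20, delay=15):
    """Build a CONNECT_DATA clause that enables TAF for one service."""
    if failover_type not in VALID_TYPES:
        raise ValueError(f"unknown failover type: {failover_type}")
    if method not in VALID_METHODS:
        raise ValueError(f"unknown failover method: {method}")
    return (
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})"
        f"(FAILOVER_MODE=(TYPE={failover_type})(METHOD={method})"
        f"(RETRIES={retries})(DELAY={delay})))"
    )


# Hypothetical service name, for illustration only.
select_level = taf_connect_data("orcl_svc")
session_level = taf_connect_data("orcl_svc", failover_type="SESSION")
```

TYPE=SELECT corresponds to the SELECT-level restart described above,
and TYPE=SESSION re-establishes only the session.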
What are the requirements for Oracle Clusterware?
1. External shared disk to store the Oracle Clusterware files (Voting
Disk and Oracle Cluster Registry - OCR)
2. Two network cards on each clusterware node (and three sets of IP
addresses):
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for
inter-node communication between RAC nodes, used by clusterware and the
RAC database)
IP address set 3 for the Virtual IP (VIP) (used as the virtual IP
address for client connections and for connection failover)
3. Storage option for OCR and Voting Disk - RAW, OCFS2 (Oracle Cluster
File System), NFS, ..
How to find the location of the OCR file when CRS is down?
When CRS is down:
Look in the ocr.loc file. The location of this file depends on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc
When CRS is up:
Set the ASM environment or CRS environment, then run the command below:
ocrcheck
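When CRS is down, the ocr.loc lookup can be scripted. The sketch below
assumes the file holds simple key=value lines with the OCR location
under the ocrconfig_loc key. The sample content and the +DATA disk
group are hypothetical, and the function name is made up for the
example.

```python
def parse_ocr_loc(text):
    """Parse key=value lines from ocr.loc content into a dictionary."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file content; on Linux it would be read from /etc/oracle/ocr.loc.
sample_ocr_loc = "ocrconfig_loc=+DATA\nlocal_only=FALSE\n"
ocr_location = parse_ocr_loc(sample_ocr_loc).get("ocrconfig_loc")
```

Using .get means a missing key yields None instead of an exception, so
the caller can fall back to checking the file by hand.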
In a 2 node RAC, how many NICs are you using?
2 network cards on each clusterware node:
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for
inter-node communication between RAC nodes, used by clusterware and the
RAC database)
In a 2 node RAC, how many IPs are you using?
6 - 3 sets of IP addresses, one address of each kind per node:
## eth1-Public: 2
## eth0-Private: 2
## VIP: 2
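The count generalises to any number of nodes. The sketch below follows
the breakdown above (one public, one private and one VIP address per
node) and does not count any other addresses a cluster may use; the
names in it are made up for the example.

```python
IP_SETS_PER_NODE = ("public", "private", "vip")


def count_node_ips(node_count):
    """Return the number of addresses needed per IP set for the cluster."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return {ip_set: node_count for ip_set in IP_SETS_PER_NODE}


two_node_ips = count_node_ips(2)
total_ips = sum(two_node_ips.values())
```

For two nodes this gives 2 addresses in each set and 6 in total,
matching the breakdown above.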
How to find IP information in RAC?
Look in the /etc/hosts file, for example:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
192.168.0.22 node2-prv node2-prv
node1-nas
node2-nas
nas-server
node1-vip
node2-vip