HACMP Student Guide
HACMP System
Administration I: Planning and
Implementation
(Course code AU54)
Student Notebook
ERC 8.0
Trademarks
IBM is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:
AIX
BladeCenter
DS4000
Enterprise Storage Server
HACMP
POWER
Redbooks
System i5
System Storage
WebSphere
AIX 5L
Cross-Site
DS6000
General Parallel File System
NetView
POWER5
Requisite
System p
Tivoli
Approach
DB2
DS8000
GPFS
Notes
pSeries
SP
System p5
TotalStorage
Contents
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Course description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Unit 0. Course introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-1
Course objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-2
Course agenda (1 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-3
Course agenda (2 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-4
Course agenda (3 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-5
Course agenda (4 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-6
Course agenda (5 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-7
Lab exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-8
Student Guide font conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-9
Course overview summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0-10
Unit 1. Introduction to HACMP for AIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
1.1 High Availability concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
High availability and HACMP concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
So, what is High Availability? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
Eliminating single points of failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
High availability clusters (HACMP base) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
So, what about site failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
IBM's HA solution for AIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11
Fundamental HACMP concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12
A highly available cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13
HACMP's topology components (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-15
HACMP's topology components (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17
HACMP's resource components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-20
What is HACMP? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-22
Additional features of HACMP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-24
Some Assembly Required . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-26
Let's review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-28
1.2 What does HACMP do? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-29
Topic 2 objectives: What does HACMP do? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-30
Just What Does HACMP Do? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31
What happens when something fails? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-32
What happens when a problem is fixed? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-33
Standby (active/passive) with fallback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-34
Standby (active/passive) without fallback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-36
Mutual takeover: Active/Active . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-37
Concurrent: multiple active nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-39
Copyright IBM Corp. 1998, 2008
Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
Course description
HACMP System Administration I: Planning and Implementation
Duration: 5 days
Purpose
This course is designed to prepare students to install and configure a
highly available cluster using HACMP for AIX.
Audience
The audience for this course is students who are experienced AIX
system administrators with TCP/IP networking and AIX LVM
experience who are responsible for the planning and installation of an
HACMP 5.4.1 cluster on an IBM System p server running AIX 5L V5.3
or later (the lab exercises are conducted on AIX 6.1).
Prerequisites
Students should ideally be qualified as IBM Certified Specialists - p5
and pSeries Administration and Support AIX 5L, and in addition have
TCP/IP, LVM storage, and disk hardware implementation skills. These
skills are addressed in the following courses (or can be obtained
through equivalent education and experience):
AU16: AIX 5L System Administration II: Problem Determination
AU07: AIX V4 Configuring TCP/IP
Objectives
After completing this course, you should be able to:
Curriculum relationship
This course should be taken before AU61:
AU61: HACMP System Administration II: Administration and
Problem Determination
Agenda
Day 1
Welcome
Unit 1 - Introduction to HACMP for AIX 5L
Unit 2 - Networking considerations for high availability
Exercise 1
Exercise 2
Day 2
Unit 3 - Shared storage considerations for high availability
Unit 4 - Planning for applications and resource groups
Unit 5 - HACMP installation
Exercise 3
Exercise 4
Exercise 5
Day 3
Unit 6 - Initial cluster configuration
Exercise 6
Day 4
Unit 7 - Basic HACMP administration
Unit 8 - Events
Exercise 7
Exercise 8
Day 5
Unit 9 - Integrating NFS into HACMP
Unit 10 - Problem determination and recovery
Exercise 9
Exercise 10
Text highlighting
The following text highlighting conventions are used throughout this book:
Bold
Italics
Monospace
Monospace bold
Course objectives
After completing this unit, you should be able to:
Define high availability
Outline the capabilities of HACMP for AIX
Design and plan a highly available cluster
Install and configure HACMP in the following modes of
operation:
Single resource group on a primary node with a standby node
Two resource groups in a mutual takeover configuration
Lab exercises
Points to note:
Work as a team and split the workload.
Manuals are available online.
HACMP software has been loaded and might have already been
installed.
TCP/IP and LVM have not been configured.
Each lab must be completed successfully before continuing to the next,
as each lab is a prerequisite for the one that follows.
If you have any questions, ask your instructor.
References
SC23-4864-10, HACMP for AIX, Version 5.4.1: Concepts and Facilities Guide
HACMP manuals: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
Unit objectives
After completing this unit, you should be able to:
Define High Availability and explain why it is needed
List the key considerations when designing and implementing
a high availability cluster
Outline the features and benefits of HACMP for AIX
Describe the components of an HACMP for AIX cluster
Explain how HACMP for AIX operates in typical cases
Notes:
Objectives
In this unit, we introduce the concept of High Availability, examine why you might want
to implement a High Availability solution, and compare High Availability with some
alternative availability technologies.
HACMP terminology
This course uses the following terminology:
- HACMP means any version and release of the HACMP product.
- HACMP x means version x and any release of that version.
- HACMP x.y means a specific version and release.
[Diagram: a production node/LPAR and a standby node/LPAR, serving clients over a WAN]
Notes:
High Availability characteristics
A High Availability solution ensures that the failure of any component of the solution, be
it hardware, software, or system management, does not cause the application and its
data to be inaccessible to the user community. This is achieved through the elimination
or masking of both planned and unplanned downtime. High availability solutions should
eliminate single points of failure (SPOF) through appropriate design, planning, selection
of hardware, configuration of software, and carefully controlled change management
discipline. High Availability does not mean that the application is never interrupted;
this is why we say fault resilient rather than fault tolerant.
Node
Power source
Network adapter
Network
TCP/IP subsystem
Disk adapter
Disk
Application
VIO Server
Site
Notes:
Eliminating single points of failure
Each of the items listed in the visual is a physical or logical component which, if it
fails, renders the HA cluster's application unavailable.
Remember that generally some SPOFs are not eliminated. For example, most clusters
are not designed to deal with the server room being flooded with water, or with the
entire city being without electrical power for two weeks. Site recovery would be a
possible solution here using HACMP/XD.
Focus on the art of the possible. In other words, spend your efforts dealing with SPOFs
that can be reasonably handled.
Document the SPOFs that you have decided not to deal with; then you can review
them from time to time to consider whether some of them now need to be dealt with
(for example, site failure, if the cluster becomes very important).
Notes:
High availability clustering
The High Availability solution addresses the fundamental weakness of both the
stand-alone and stand-alone enhanced storage systems; that is, it has two of
everything. If any component of the solution should fail, a redundant back-up
component is waiting to take over the workload. The systems that are clustered can be
stand-alone systems or Logical Partitions (LPARs). Virtualization is supported as well,
providing all of the requirements are met. Those will be pointed out later.
Do feel free to examine the high-availability solutions offered by our competitors. IBM's
HACMP product has been ranked (and continues to be ranked) the leading
high-availability solution for UNIX servers by D.H. Brown Associates
(www.dhbrown.com) for many years. We are confident that by the end of this course,
you'll also agree that HACMP 5 is a mature, robust, and feature-rich product that
delivers significantly improved availability on the IBM System p platform.
Drawback
The base HACMP 5 product only partially addresses the site SPOF: it covers only the
case where data does not have to be replicated, which can be done with LVM mirroring
using SAN technology.
Distance unlimited
Application, disk, and network independent
Automated site failover and reintegration
A single cluster across two sites
Get more details in HACMP System Administration III AU620
[Diagram: data replication between two sites (Toronto and Brussels), using Metro Mirror/PPRC, GLVM, or GeoRM]
Notes:
What about Site Failure and data replication?
Limited distance
The base product HACMP 5.2 and later allows you to create sites as long as you can
use LVM mirroring for redundancy. Using SAN technology, you can get limited distance
support for site failures.
Extended distance
The HACMP/XD (Extended Distance) priced feature provides three distinct software
solutions for disaster recovery. These solutions enable an HACMP cluster to operate
over extended distances at two sites. For more information, see the HACMP System
Administration III: Virtualization and Disaster Recovery course, AU620.
a. HACMP/XD for Metro Mirror/PPRC increases data availability for IBM TotalStorage
ESS/DS/SVC volumes that use Peer-to-Peer Remote Copy (PPRC) to copy data to
a remote site for disaster recovery purposes. HACMP/XD for Metro Mirror/PPRC
Notes:
HACMP characteristics
IBM's HACMP product is a mature and robust technology for building a high-availability
solution. A high-availability solution based upon HACMP provides automated failure
detection, diagnosis, recovery, and reintegration. With an appropriate application,
HACMP can also work in a concurrent access or parallel processing environment, thus
offering excellent horizontal scalability.
Customization
The process of augmenting HACMP, typically via implementing scripts
Minimum: application start and stop scripts
Optional:
Application monitoring scripts (highly recommended!)
Event customization
Notification, pre- and post-event scripts, recovery scripts, user-defined events, time until warning
(config_too_long timeout)
Notes:
Terminology
A clear understanding of the above concepts and terms is important, as they appear
over and over again, both in the remainder of the course and throughout the HACMP
documentation, log files, and SMIT screens.
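As a concrete illustration of the minimum customization, the sketch below shows the shape of an application server start script. All names here are hypothetical, and the application command is a stand-in so the sketch can run anywhere; a real start script would launch your application daemon and verify that it started.

```shell
# Sketch of an HACMP application server start script (names are hypothetical).
# HACMP runs this script when the resource group comes online; a nonzero
# exit status is treated as a start failure.
APP_CMD=${APP_CMD:-"true"}              # stand-in; e.g. /opt/myapp/bin/myappd
LOGFILE=${LOGFILE:-/tmp/app_start.log}

echo "`date`: starting application" >> "$LOGFILE"
$APP_CMD >> "$LOGFILE" 2>&1             # a real daemon would be started with
rc=$?                                   # nohup ... & and then polled until ready
echo "`date`: start script finished, rc=$rc" >> "$LOGFILE"
```

A long-running daemon would normally be started in the background and polled until it answers requests, because the cluster manager does not consider the resource group online until the start script returns.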
[Diagram: a two-node cluster: each node runs clstrmgr, the nodes attach to shared storage, and a resource group falls over from one node to the other]
Notes:
Fundamental concepts
HACMP is based on the fundamental concepts of cluster, resource group, and cluster
manager (clstrmgr).
Cluster
A cluster basically consists of nodes, networks, and network adapters. These
objects are referred to as Topology objects.
Resource group
A resource group typically consists of an application, network address, and volume
group using shared disks. These objects are referred to as Resource objects.
1-13
Student Notebook
clstrmgr
The cluster manager daemons are the software components that communicate with
each other to control the node on which a resource group is activated, and where the
resource group is moved on a fallover, based on parameters set up by the
administrator. The clstrmgr runs on all the nodes of the cluster.
The visual shows a simple two-node cluster, using shared disk and providing
fallover for a single application.
[Diagram: cluster topology: a cluster of nodes joined by IP and non-IP networks through communication interfaces and communication devices]
Notes:
Topology components
A cluster's topology consists of the cluster, nodes (pSeries servers), networks
(connections between the nodes), the communication interfaces (for example, Ethernet
or token-ring network adapters), and the communication devices (for example,
/dev/rhdisk for heartbeat on disk, or /dev/tty for RS232).
Nodes
In the context of HACMP, the term node means any IBM pSeries system that is a
member of a high-availability cluster running HACMP.
Networks
Networks consist of IP and non-IP networks. The non-IP networks ensure that cluster
monitoring can continue if there is a total loss of IP communication. It is strongly
recommended to configure non-IP networks in an HACMP cluster.
Networks can also be logical or physical. Logical networks have been used with the IBM
SP environments when different frames were in different subnets but needed to be
treated as if they were in the same network for HACMP purposes.
[Diagram: supported cluster building blocks: server nodes (RS/6000 through System p), IP networking (Ethernet, PC clients), non-IP networks (RS232/422, heartbeat on disk), and shared storage (DS4000, DS8000, SAN, Fibre Channel, virtual SCSI)]
Notes:
Supported nodes
As you can see, the range of systems that supports HACMP is, well, everything. The
only requirement is that the system should have at least four adapter slots spare (two
for network adapters and two for disk adapters). Any other adapters (for example,
graphics adapters) occupy additional slots. The internal Ethernet adapter fitted to most
entry-level pSeries servers cannot be included in the calculations. It should be noted
that even with four adapter slots free, there is still a single point of failure, as the
cluster is able to accommodate only a single TCP/IP local area network between the
nodes.
HACMP 5 works with pSeries servers in a no-single-point-of-failure server
configuration. HACMP for AIX supports the System p models that are designed for
server applications and that meet the minimum requirements for internal memory,
internal disk, and I/O slots. For a current list of systems that are supported with the
version of HACMP that you want to use, see the Sales manual.
LPAR support
There is also support for dynamically adding LPAR resources in AIX V5.2 or later LPAR
environments to take advantage of Capacity Upgrade on Demand (CUoD).
HACMP 5.2 (and later) supports Virtual SCSI (VSCSI) and Virtual LAN (VLAN) on
POWER5 (IBM System p5 and IBM System i5). See
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10390 for
more details.
Supported networks
HACMP 5 supports client users on a LAN using TCP/IP. HACMP monitors and performs
IP address switching for the following TCP/IP-based communications adapters on
cluster nodes:
- Ethernet
- EtherChannel
- Token ring
- FDDI
- SP Switches
- ATM
- ATM LAN Emulation
HACMP also supports non-IP networks, such as RS232/442, Target Mode SCSI
(TMSCSI), Target Mode SSA (TMSSA), and Heartbeat on Disk (using Enhanced
Concurrent Mode Volume Groups).
It is highly recommended to have both IP and non-IP networks defined to HACMP. For a
list of specific adapters, you can consult the Sales Manual.
Unsupported networks
The following networks are not supported:
[Diagram: a resource group: an application server, service IP address, volume group, and file system, together with its nodes, run-time policies, and resources]
Notes:
Resource group
A resource group is a collection of resources treated as a unit, along with the nodes on
which they can potentially be activated and the policies the cluster manager should use
to decide which node to choose during startup, fallover, and fallback. A cluster can have
more than one resource group (usually one for each application), thus allowing for very
flexible configurations. Resource groups will be covered in more detail in Unit 4.
Resources
Resources are logical components that can be put into a resource group. Because they
are logical components, they can be moved without human intervention.
The resources shown in the visual are a typical set of resources used in resource
groups, such as:
What is HACMP?
An application which:
[Diagram: HACMP software components: the cluster manager with its topology, resource, event, and SNMP managers; clcomdES; RSCT (topsvcs, grpsvcs, and RMC subsystems); snmpd; clinfoES; and clstat]
Notes:
HACMP core components
HACMP comprises a number of software components:
- The cluster manager, clstrmgr, is the core process that monitors cluster membership.
The cluster manager includes a topology manager to manage the topology
components, a resource manager to manage resource groups, and an event manager
with event scripts that works through the RMC facility and RSCT to react to failures.
- In HACMP 5.3 and later, the cluster manager contains the SNMP SMUX Peer
function (previously provided by the clsmuxpd) for the cluster manager MIBs, which
allows for SNMP-based monitoring to be done manually or by using an SNMP
manager, such as Tivoli, BMC, or OpenView.
- The clinfo process provides an API for communicating between cluster manager
and your application. Clinfo also provides remote monitoring capabilities and can
run a script in response to a status change in the cluster. Clinfo is an optional
process that can run on both servers and clients (the source code is provided). The
clstat command uses clinfo to display status via ASCII, X Window, or Web browser
interfaces.
- In HACMP 5, clcomdES allows the cluster managers to communicate in a secure
manner without using rsh and the /.rhosts file.
[Diagram: additional facilities around clstrmgrES: Online Planning Worksheets (OLPW), SMIT via Web, verification, automated tests, C-SPOC, DARE, SNMP, Tivoli integration, and application monitoring]
Notes:
Additional features
HACMP also has additional software to provide facilities for administration, testing,
remote monitoring, and verification:
- Application monitoring should be used to monitor the cluster's applications and
restart them should they fail. Multiple monitors can be defined for an application,
including monitoring the startup.
- Configuration changes can be made to the cluster while the cluster is running. This
facility is known as Dynamic Automatic Reconfiguration Event (or DARE for short).
- C-SPOC is a series of SMIT menus that allow AIX-related cluster tasks to be
propagated across all nodes in the cluster. It includes an RG_Move facility, which
allows a resource group to be placed offline or on another node without stopping the
cluster manager.
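The application monitoring mentioned above is ultimately just a script whose exit status tells HACMP whether the application is healthy. The following is a minimal sketch under assumptions: the process name is a stand-in (ps always lists itself, so the check succeeds even outside a cluster), and a real monitor would check your actual daemon or, better, perform an application-level probe.

```shell
# Sketch of a custom application monitor script (process name is a stand-in).
# Exit status 0 reports the application healthy; nonzero reports a failure,
# which triggers a restart or fallover according to the monitor configuration.
DAEMON=${DAEMON:-ps}        # stand-in: ps always appears in its own listing
if ps -e | grep -w "$DAEMON" > /dev/null 2>&1; then
    monitor_rc=0            # process found: report healthy
else
    monitor_rc=1            # process missing: report failure
fi
echo "monitor exit status would be $monitor_rc"
```

A process check like this only proves the daemon exists; where possible, a monitor should also confirm that the application answers requests.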
Minimum:
Application Start/Stop/Monitor scripts
Optional:
Customized pre/post event scripts
Reaction to events
Error notification Methods
User Defined Events (UDEs)
Cluster State Change
Notes:
Not just HACMP
The final high-availability solution is more than just HACMP. A high-availability solution
comprises a reliable operating system (AIX), applications that are tested to work in a
high-availability cluster, storage devices, appropriate selection of hardware, trained
administrators, and thorough design and planning.
Customization required
HACMP is shipped with event scripts (Korn Shell scripts) which handle the failure
scenarios.
Application Server start/stop scripts are written to control the application(s) based on
the status of the cluster nodes. Most often, all the script writing that is required to
integrate an application into the cluster is done in the Application Server start/stop
scripts.
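The matching stop script has the same shape. This sketch is hypothetical: the PID file location is invented, and a real stop script would use the application's own shutdown command where one exists and wait for the processes to exit before returning, because the shared disks and service IP address are released only after the stop script completes.

```shell
# Sketch of an application server stop script (PID file path is hypothetical).
# HACMP calls this when the resource group goes offline; it should not return
# until the application has stopped using the shared disks and IP addresses.
PIDFILE=${PIDFILE:-/tmp/myapp.pid}

if [ -f "$PIDFILE" ]; then
    kill "`cat "$PIDFILE"`" 2>/dev/null     # polite shutdown; a real script would
    rm -f "$PIDFILE"                        # wait, then escalate to kill -9
    stop_msg="stop: signalled process from $PIDFILE"
else
    stop_msg="stop: nothing to do, $PIDFILE not found"
fi
echo "$stop_msg"
```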
Smart Assists are provided in HACMP (since HACMP 5.2) to help ease the
customization for the applications that they address. In HACMP 5.4 and later, an API is
provided that allows third-party application vendors to write Smart Assists.
In the rare circumstance where you have a requirement to customize some special
fallover behavior, this is done with pre- and post-event scripts.
Let's review
1. Which of the following items are examples of topology components in HACMP?
   (Select all that apply.)
   a. Node
   b. Network
   c. Service IP label
   d. Hard disk drive
2. True or False?
   All nodes in an HACMP cluster must have roughly equivalent performance
   characteristics.
3. Which of the following is a characteristic of high availability?
   a. High availability always requires specially designed hardware components.
   b. High availability solutions always require manual intervention to ensure
      recovery following fallover.
   c. High availability solutions never require customization.
   d. High availability solutions use redundant standard equipment (no specialized
      hardware).
4. True or False?
   A thorough design and detailed planning is required for all high availability solutions.
1.2 What does HACMP do?
Notes:
In this topic, we take a look at what HACMP does.
HACMP functions:
Notes:
HACMP basic functions
HACMP detects three kinds of network-related failures:
a. A communications adapter or device failure
b. A node failure (all communication adapters/devices on a given node)
c. A network failure (all communication adapters/devices on a given network)
HACMP also interfaces to the AIX error log to respond to the loss of quorum for a
volume group when the loss is detected by the LVM. Most other failures are handled
outside of HACMP, either by AIX or the LVM, but they can be handled in HACMP
through customization.
Notes:
How HACMP responds to a failure
HACMP generally responds to a failure by using a still available component to take over
the duties of the failed component. For example, if a node fails, then HACMP initiates a
fallover, an action that consists of moving the resource groups that were previously on
the failed node to a surviving node. If a Network Interface Card (NIC) fails, HACMP
usually moves any IP addresses being used by clients to another available NIC. If there
are no remaining available NICs, HACMP initiates a fallover. If only one resource group
is affected, then only the one resource group is moved to another node.
How the cluster responds to the recovery of a failed component depends on what has
recovered, what the resource group's fallback policy is, and the resource group
dependencies.
Notes:
How HACMP responds to a recovery
When a previously failed component recovers, it must be reintegrated back into the
cluster (reintegration is the process of HACMP recognizing that the component is
available for use again). Some components, such as NICs, are automatically
reintegrated when they recover. Other components, such as nodes, cannot be
reintegrated until the cluster administrator explicitly requests the reintegration (by
starting cluster services, that is, the HACMP daemons, on the recovered node, and
possibly moving the resource group or bringing it online).
[Diagram: standby (active/passive) with fallback: when node UK fails, its resource group comes online on standby node USA; when UK returns, the resource group falls back. The RG can be configured to come online on the primary node or on any node]
Notes:
Standby
Standby configurations are configurations where one (or more) nodes have no
workload.
Drawbacks
- One node is not used (this is ideal for availability but not from a utilization
perspective).
- A second outage on the fallback is possible.
i. All nodes, except one, have applications, and the one node is a standby node.
This could lead to performance problems if more than one application must be
moved to the standby node.
ii. The resource group could be configured to have multiple layers of back-up
nodes. The resource group would usually be configured to run on the highest
priority (most preferred) available node.
Multiple layers of back-up nodes are possible; the fallover policy determines which
node is chosen. For example: primary -> secondary -> tertiary -> quaternary ->
quinary -> senary -> septenary -> octonary -> nonary -> denary...
A tidbit for the wordsmiths in the audience: The sequence, which starts primary,
secondary, and tertiary, continues with quaternary, quinary, senary, septenary,
octonary, nonary, and denary. There is no generally accepted word for eleventh
order although duodenary means twelfth order. The word for twentieth order is
vigenary.
[Diagram: standby (active/passive) without fallback: when the failed node returns, the resource group stays where it is, eliminating another outage and reducing downtime]
Notes:
Minimize downtime
A resource group can be configured to not fall back to the primary node (or any other
higher priority node) when it recovers. This avoids the second outage, which results
when the fallback occurs.
The cluster administrator can request that HACMP move the resource group back to
the higher priority node at an appropriate time or it can simply be left on its current node
indefinitely (an approach that calls into question the terms primary and secondary, but
which is actually quite a reasonable approach in many situations).
[Diagram: mutual takeover (active/active) with fallback: each node runs its own resource group and takes over the other's on failure; a very common configuration, with no node/LPAR left idle]
Notes:
Takeover
Takeover configurations imply that there is workload on all nodes which might or might
not be under the control of HACMP, but that a node can take over the work of another
node in the cluster.
Mutual takeover
An extension of the primary node with a secondary node configuration is to have two
resource groups, one failing from right to left and the other failing from left to right. This
is referred to as mutual takeover.
Mutual takeover configurations are very popular for HACMP because they support two
highly available applications at a cost that is not much more than running the two
applications in separate stand-alone configurations.
Additional costs
Note that there are at least a few additional costs:
- Each cluster node probably needs to be somewhat larger than the stand-alone
nodes because they must each be capable of running both applications, possibly in
a slightly degraded mode, should one of the nodes fail.
- Additional software licenses might be required for the applications when they run on
their respective back-up nodes (this is a potentially significant cost item, which is
often forgotten in the early cluster planning stages).
- HACMP for AIX license fees.
- This is not intended to be an all inclusive list of additional costs.
Notes:
Concurrent mode
HACMP also supports resource groups in which the application is active on multiple
nodes simultaneously. In such a resource group, all nodes run a copy of the application
and share simultaneous access to the disk. This style of cluster is often referred to as a
concurrent access cluster or concurrent access environment.
Service labels
Since the application is active on multiple nodes, each node has its own service IP
label. The client systems must be configured to randomly (or otherwise) select which
service IP address to communicate with, and be prepared to switch to another service
IP address should the one that they're dealing with stop functioning (presumably because the node with the service IP address has failed). It is also possible to configure an IP multiplexer between the clients and the cluster which redistributes the client sessions to the cluster nodes, although care must be taken to ensure that the IP
multiplexer does not itself become a single point of failure.
How to choose
Whether this mode of operation can be used for your application is a function of the
application, not of HACMP.
Points to ponder
Resource groups:
Must be serviced by at least two nodes
Can have different policies
Can be migrated (manually or automatically) to rebalance loads
Clusters:
Applications:
Can be restarted via monitoring
Must be manageable via scripts (start/restart and stop)
* Application performance requirements and other operational issues
almost certainly impose practical constraints on the size and
complexity of a given cluster.
Notes:
Importance of planning
Planning, designing, configuring, testing, and operating a successful HACMP cluster
requires considerable attention to detail. In fact, a careful, methodical approach to all the phases of the cluster's life cycle is probably the most important factor in determining the ultimate success of the cluster.
Methodical approach
A careful methodical approach considers the relevant points above, and many other
issues that are discussed this week or that are discussed in the HACMP
documentation.
(Figure: high availability, continuous availability, and continuous operation all rest on the same foundations: people, data, networking, hardware, software, and the environment.)
Notes:
Design, planning, testing
Design, planning, and testing are all critical steps that cannot be skipped when implementing a high-availability solution. As you'll learn this week, there should be no shortage of time spent designing, planning, and documenting your proposed cluster solution. Time well spent in these areas of the project reduces the amount of unneeded administration time required to manage your cluster solution.
Unfortunately, it's too often the case that there isn't enough time to do it right the first time, but always time enough to do it over when things go wrong.
Remember, the reason we worry about node failures, disk failures, and the like is not that we are particularly concerned with the failures themselves, but that we are concerned with the impact their failure might have.
Notes:
Things HACMP does not do
HACMP does not automate your back-ups, nor does it keep time in sync between the cluster nodes. These tasks require further configuration and software; for example, Tivoli Storage Manager for back-up and a time protocol daemon, such as xntpd, for time synchronization.
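As an illustration, a minimal xntpd setup on an AIX cluster node might look like the following. This is a sketch, not part of HACMP configuration: ntp.example.com is a placeholder server, and your site's NTP topology will differ.

```
# /etc/ntp.conf (sketch; ntp.example.com is a placeholder)
server ntp.example.com
driftfile /etc/ntp.drift

# Start the daemon now, and enable it at boot, on every cluster node:
#   startsrc -s xntpd
#   chrctcp -S -a xntpd
```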
Security issues
Too little security
Many people can change the environment.
Unstable environments
HACMP cannot make an unstable and poorly managed environment
stable.
HACMP tends to reduce the availability of poorly managed systems.
Notes:
Zero downtime
An example of an environment that demands zero downtime is an intensive care unit. Note also that HACMP is not designed to handle many failures at once.
Security issues
One security issue that is now addressed is the need to eliminate .rhosts files. Better encryption is also possible for inter-node communications, but this might not be enough for some security environments.
Unstable environments
The prime cause of problems with HACMP is poor design, planning, implementation, and administration. If you have an unstable environment, with poorly trained administrators, easy access to the root password, and a lack of change control, HACMP is not the solution for you.
With HACMP, the only thing more expensive than employing a professional to plan,
design, install, configure, customize, and administer the cluster is employing an
amateur.
Notes:
Goals
During this week you will design, plan, configure, customize, and administer a two-node
high-availability cluster running HACMP 5.4.1 on an AIX system.
You will learn how to build a standby environment for one application as well as a
mutual takeover environment for two applications. In the mutual takeover environment,
each system will eventually be running its own highly available application, and
providing fallover back-up for the other system.
Some classroom environments will involve creating the cluster on a single pSeries
system between two LPARs. Although this is not a recommended configuration for
production, it provides the necessary components for a fruitful HACMP configuration
experience.
Notes:
Implementation process
The process should include at least the following:
Work as a team. It cannot be stressed enough that it will be necessary to work with
others when you build your HACMP cluster in your own environment. Practice here will
be useful.
Look at the AIX environment.
- For storage, plan for the adapters and LVM components required by the application.
- For networks, plan for communication interfaces, devices, name resolution via /etc/hosts, and the service address for the application.
- For the application, build start and stop scripts and test them outside of the control of HACMP.
Install the HACMP for AIX software and reboot.
Configure the topology and resource groups (and resources).
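The start and stop scripts themselves can be very simple. Below is a minimal, hypothetical skeleton (the pid-file path and the sleep placeholder stand in for a real application binary) illustrating two properties HACMP relies on: the start script tolerates being run when the application is already up, and the stop script succeeds even when it is already down.

```shell
# Hypothetical application start/stop skeleton; "sleep 300" stands in
# for the real application, and the pid-file path is an example.
PIDFILE="${PIDFILE:-/tmp/app.pid}"

start_app() {
    # Tolerate a retry: fallover processing may run the start script
    # when the application is already up.
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        return 0
    fi
    sleep 300 &                 # launch the (placeholder) application
    echo $! > "$PIDFILE"
    return 0
}

stop_app() {
    # Must succeed even if the application is already down, or HACMP
    # event processing can hang during fallover.
    if [ -f "$PIDFILE" ]; then
        kill "$(cat "$PIDFILE")" 2>/dev/null
        rm -f "$PIDFILE"
    fi
    return 0
}
```

Running these by hand (start, verify, stop, verify) before HACMP ever calls them is exactly the "test outside of the control of HACMP" step above.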
Hints
Draw a diagram.
Use (online) planning sheets.
Focus on eliminating SPOFs.
Always factor in a non-IP network.
Ensure that you have multipath access to shared storage devices.
Document a test plan.
Test the cluster carefully.
Be methodical.
(Figure: example planning diagram for a two-node cluster on a public network. Node nodea hosts resource group dbrg with application database: service IP label database 192.168.9.3, boot label nodeaboot 192.168.9.4, standby label nodeastand 192.168.254.3, cascading policy, priority 1,2, CWOF = yes, and shared VG dbvg on 100 GB RAID5. Node nodeb hosts resource group httprg with application http: service IP label webserv 192.168.9.5, boot label nodebboot 192.168.9.6, standby label nodebstand, cascading policy, priority 2,1, CWOF = yes, and shared VG httpvg on 9 GB RAID1. The nodes are also joined by non-IP tmssa (/dev/tmssa1, /dev/tmssa2) and tty (/dev/tty1) serial networks, and each node has a mirrored 9.1 GB rootvg. All netmasks are 255.255.255.0.)
Notes:
Hint
Create a cluster diagram--a picture is worth 10 thousand words (because of inflation, a
thousand is not enough!).
Use the Online Planning Worksheets. They can be used without installing HACMP and
can be used to generate AND save HACMP configurations.
Try to reduce SPOFs.
Always include a non-IP network.
Access storage over multiple paths or mirror across power and buses.
Document a test plan. HACMP also provides automated cluster testing (the Cluster Test Tool).
Be methodical.
Execute the test plan prior to placing the cluster into production!
cluster.doc.en_US.es.html
cluster.doc.en_US.es.pdf
http://www.ibm.com/servers/eserver/pseries/library/hacmp_docs.html
/usr/es/sbin/cluster/release_notes
http://www-03.ibm.com/systems/p/ha/
http://lpar.co.uk
http://portal.explico.de/
http://www.matilda.com/hacmp/
http://groups.yahoo.com/group/hacmp/
Notes:
Manuals on CD
The HACMP 5.4.1 manuals are:
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
SC23-4864-10 HACMP for AIX, Version 5.4.1: Concepts and Facilities Guide
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
Checkpoint
1. True or False?
Resource Groups can be moved from node to node.
2. True or False?
HACMP/XD is a complete solution for building
geographically distributed clusters.
3. Which of the following capabilities does HACMP not
provide? (Select all that apply.)
a. Time synchronization
b. Automatic recovery from node and network adapter failure
c. System Administration tasks unique to each node; back-up and
restoration
d. Fallover of just a single resource group
4. True or False?
All nodes in a resource group must have equivalent
performance characteristics.
Notes:
Unit summary
Having completed this unit, you should be able to:
Define high availability and explain why it is needed
Outline the various options for implementing high availability
List the key considerations when designing and implementing
a high availability cluster
Outline the features and benefits of HACMP for AIX
Describe the components of an HACMP for AIX cluster
Explain how HACMP for AIX operates in typical cases
Notes:
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1:
Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
http://www-03.ibm.com/systems/p/library/hacmp_docs.html
HACMP manuals
Unit objectives
After completing this unit, you should be able to:
Discuss how HACMP uses networks
Describe the HACMP networking terminology
Explain and set up IP Address Takeover (IPAT)
Configure an IP network for HACMP
Configure a non-IP network
Explain how client systems are likely to be affected by failure
recovery
Minimize the impact of failure recovery on client systems
Notes:
Unit objectives
This unit discusses networking in the context of HACMP.
Notes:
Topic 1 objectives
This topic explores how HACMP uses networks. The HACMP concept of IP Address Takeover (IPAT), where application addresses are relocated when failures occur, is looked at in more detail in a later section.
(Figure: two cluster nodes, each with en0 and en1 network interfaces; the RSCT and clcomd subsystems on each node communicate with their peers across the network.)
Notes:
Network design for availability
To design a network that supports high availability using HACMP, we must understand
how HACMP uses networks.
Just being able to detect node, network, and NIC failures imposes several requirements on how the networks are designed. Being able to distinguish between certain failures (for example, the failure of a network and the failure of a node) imposes yet more requirements on the network design.
Reliable Scalable Cluster Technology (RSCT) provides facilities for monitoring node
membership; network interface and communication interface health; and event
notification, synchronization, and coordination via reliable messaging.
Notes:
Network interface card and single point of failure
When using physical networking and not EtherChannel (more on these topics in a few visuals), the goal is to avoid the NIC being a single point of failure. To achieve that, each cluster node requires at least two NICs per network. The alternative is that the loss of a single NIC would cause a significant outage while the application (that is, the resource group) is moved to another node.
For EtherChannel or virtual Ethernet configurations, the norm is to have only a single interface in the network. In that case, a special file, netmon.cf, provides additional addresses for diagnosis processing. We will see more on that in a few visuals.
Network as SPOF
The network itself is, of course, a single point of failure because the failure of the network will disrupt the users' ability to communicate with the cluster. The probability of this SPOF being an issue can be reduced by careful network design, an approach that is often considered sufficient.
(Figure: nodes usa and uk, each with en0 and en1 interfaces on an IP network, and also connected by a non-IP network.)
Notes:
Failures that HACMP handles directly
HACMP uses RSCT to detect failures. Actually, the only thing that RSCT can detect is
the loss of heartbeat packets. RSCT sends heartbeats over IP and non-IP networks. By
gathering heartbeat information from multiple NICs and non-IP devices on multiple
nodes, HACMP makes a determination of what type of failure this is and takes
appropriate action. Using the information from RSCT, HACMP handles only three
different types of failures:
- NIC failures
- Node failures
- Network failures
Other failures
HACMP uses AIX features to respond to other failures (for example, the loss of a
volume group can trigger a fallover), but HACMP is not directly involved in detecting
these other types of failures.
Heartbeat packets
HACMP sends heartbeat packets across networks.
Heartbeat packets are sent and received by every NIC.
This is sufficient to detect all NIC, node, and network failures.
Heartbeat packets are not acknowledged.
(Figure: heartbeat packets flowing pair-wise between the en0 interfaces and between the en1 interfaces of nodes usa and uk, which share the application and its data.)
Notes:
Heartbeat packets
HACMP's primary monitoring mechanism is sending heartbeat packets. The cluster sends heartbeat packets from every NIC to every NIC, and to and from non-IP devices.
Heartbeating pattern
In a typical two-node cluster with two NICs on the network, the heartbeat packets are
sent in the pair-wise fashion shown above. The pattern gets more complicated when the
cluster gets larger as HACMP uses a pattern that is intended to satisfy three
requirements:
- That each NIC be used to send heartbeat packets (to verify that the NIC is capable
of sending packets)
2-11
Student Notebook
- That heartbeat packets be sent to each NIC (to verify that the NIC is capable of
receiving heartbeat packets)
- That no more heartbeat packets are sent than are necessary to achieve the first two
requirements (to minimize the load on the network)
The details of how HACMP satisfies the third requirement are discussed in a later unit.
Detecting failures
Heartbeat packets are not acknowledged. Instead, each node knows what the
heartbeat pattern is and simply expects to receive appropriate heartbeat packets on
appropriate network interfaces. Noticing that the expected heartbeat packets have
stopped arriving is sufficient to detect failures.
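The detection logic can be pictured as a simple counter per expected heartbeat path. This toy shell sketch is not RSCT code, and the threshold value is made up (real values derive from the network module's failure detection rate):

```shell
# Toy missed-heartbeat detector: a peer is suspected once heartbeats
# have been absent for HB_THRESHOLD consecutive intervals. No
# acknowledgements are involved; silence alone triggers suspicion.
HB_THRESHOLD=10    # made-up value for illustration

missed=0
note_heartbeat()        { missed=0; }                  # expected packet arrived
note_interval_elapsed() { missed=$((missed + 1)); }    # timer tick, no packet
peer_suspect()          { [ "$missed" -ge "$HB_THRESHOLD" ]; }
```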
(Figure: the en1 interface on node usa fails; heartbeat packets between the two en1 interfaces stop arriving on both nodes.)
Notes:
Diagnosis
The heartbeat patterns just discussed are sufficient to detect a failure in the sense of
realizing that something is wrong. They are not sufficient to diagnose a failure in the
sense of figuring out exactly what is broken.
For example, if the en1 interface on the usa node fails as in the visual above, usa stops
receiving heartbeat packets via its en1 interface, and uk stops receiving heartbeat
packets via its en1 interface. Both usa and uk realize that something has failed, but neither of them has enough information to determine what has failed.
Failure diagnosis
When a failure is detected, HACMP (RSCT topology services)
uses specially crafted packet transmission patterns to
determine (that is, diagnose) the actual failure by ruling out
other alternatives.
Example:
1. RSCT on usa notices that heartbeat packets are no longer arriving via en1 and notifies uk (which has also noticed that heartbeat packets are no longer arriving via its en1).
2. RSCT on both nodes sends diagnostic packets between various combinations of NICs (including out via one NIC and back in via another NIC on the same node).
3. The nodes soon realize that all packets involving usa's en1 are vanishing but packets involving uk's en1 are being received.
4. Diagnosis: usa's en1 has failed.
Notes:
Diagnostic heartbeat patterns
When one or more cluster nodes detect a failure, they share information and plan a
diagnostic packet pattern or series of patterns, which will diagnose the failure.
These diagnostic packet patterns can be considerably more network-intensive than the
normal heartbeat traffic; although, they usually only take a few seconds to complete the
diagnosis of the problem.
(Figure: all heartbeat paths between nodes usa and uk are lost while both nodes remain running; the result is a partitioned cluster and likely data divergence.)
Notes:
Total loss of heartbeat traffic
If a node in a two-node cluster realizes that it is no longer receiving any heartbeat
packets from the other node, then it starts to suspect that the other node has gone
down. When it determines that it is totally unable to communicate with the other node, it
concludes that the other node has failed.
Partitioned cluster
Because each node is, in fact, still very alive, the result is that the applications are now
running simultaneously on both nodes. If the shared disks are also online to both nodes,
then the result could be a massive data corruption problem. This situation is called a
partitioned cluster. It is, clearly, a situation that must be avoided.
Note that essentially equivalent situations can occur in larger clusters. For example, a
five-node cluster might become split into a group of two nodes and a group of three
nodes. Each group concludes that the other group has failed entirely and takes what it
believes to be appropriate action. The result is almost certainly very unpleasant.
(Figure: nodes usa and uk connected by both an IP network and a non-IP network, with the application and its data on shared storage.)
Notes:
Required?
To be completely accurate, you do not have to configure a non-IP network. But for the reasons outlined below, you will want to implement at least one non-IP network and possibly more. So while it is not technically accurate that a non-IP network is required, it is definitely practically accurate that one is required. That is why the title says "required" while the content of the visual says "should."
(Figure: nodes usa and uk joined by a non-IP network and by an IP network in which each NIC on a node is on a different logical subnet: usa has en0 192.168.1.1 and en1 192.168.2.1; uk has en0 192.168.1.2 and en1 192.168.2.2. Note: this does not apply to single-adapter networks, such as EtherChannel or virtual Ethernet.)
Notes:
Requirements for HACMP to monitor every NIC
If a node has two NICs on the same logical IP subnet and a network packet is sent to an
IP address on the same logical subnet, then the AIX kernel is allowed to use either NIC
on the sending node to send the packet.
Because this is incompatible with HACMP's requirement that it be able to dictate which NIC is used to send heartbeat packets, HACMP requires that each NIC on each node be on a different logical IP subnet.
We will give some examples of valid and invalid configurations later in this unit, after we
have covered the other subnetting rules.
Note: There is an exception to the requirement that each NIC be on a different logical IP
subnet. We will discuss that shortly.
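The subnet rule is easy to check mechanically. This small helper (plain POSIX shell, not an HACMP utility) masks two addresses and compares the resulting network addresses:

```shell
# Compute the IPv4 network address for an address/netmask pair, and
# use it to test whether two NICs would land on the same logical subnet.
network_of() {    # network_of <dotted-ip> <dotted-netmask>
    # Split both arguments into octets: $1-$4 are the IP, $5-$8 the mask.
    set -- $(echo "$1" | tr '.' ' ') $(echo "$2" | tr '.' ' ')
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

same_subnet() {   # same_subnet <ip1> <ip2> <netmask>; true if subnets match
    [ "$(network_of "$1" "$3")" = "$(network_of "$2" "$3")" ]
}
```

For example, 192.168.1.1 and 192.168.2.1 with netmask 255.255.255.0 fall on different subnets, so they satisfy the rule for two NICs on one node.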
(Figure: nodes usa and uk, each with en0 and en1, shown before and after a NIC failure; recovered NICs and networks are reintegrated automatically.)
Notes:
NIC and network recovery
NICs and networks are automatically reintegrated into the cluster when they recover.
Node recovery
In contrast, a node is not considered to have recovered until Cluster Services has been started on the node. This allows the node to be rebooted and otherwise exercised as part of the repair process, without HACMP declaring failures or performing reintegration while the repair action is occurring.
The reintegration of a component might trigger quite significant actions. For example, if
a node is reintegrated, which has a high priority within a resource group, then,
depending on how the resource group is configured, the resource group might fall back.
Notes:
Topic 2 objectives
This section will explore HACMP networking concepts, terms and configuration rules in
more detail.
FDDI
Token-Ring
ATM and ATM LAN Emulation
SP Switch 1 and SP Switch 2
RS232/RS422 (rs232)
Target Mode SSA (tmssa)
Target Mode SCSI (tmscsi)
Notes:
Supported IP networks
HACMP supports all of the popular IP networking technologies (and a few that are
possibly not quite as popular). Note that the IEEE 802.3 Ethernet frame type is not
supported.
The loss of communication from a node or nodes to the rest of the cluster nodes via all routes, IP and non-IP, is treated as a loss of quorum. This in turn causes the node or nodes to stop accessing the data, to prevent (or minimize) data corruption in the event of a domain merge (split brain). This behavior applies only to resource groups that have a startup policy of Online on All Nodes (OOAN, also known as concurrent resource groups).
Network types
HACMP categorizes all networks:
IP networks:
Network type: ether, token, fddi, atm, hps (SP Switch or High Performance Switch)
Network attribute: public or private
Non-IP networks:
Network type:
Notes:
IP networks
As mentioned before, IP networks are used by HACMP for:
- HACMP heartbeat (failure detection and diagnosis)
- Communications between HACMP daemons on different nodes
- Client network traffic
IP network attribute
The default for this attribute is public. Oracle uses the private network attribute
setting to select networks for Oracle inter-node communications. This attribute is not
used by HACMP itself. See the HACMP for AIX: Planning Guide for more information.
Non-IP networks
HACMP uses non-IP networks for:
- An alternative non-IP path for HACMP heartbeat and messaging
- Differentiating between node and network failure
- Eliminating IP as a single point of failure
(Figure: networking terminology illustrated on a two-node cluster, usa and uk: nodes with node names; network interface cards carrying communication interfaces, each with an IP label, such as vancouver-service, and IP address, such as 192.168.5.2; serial ports acting as communication devices; and non-IP networks of types rs232, diskhb, and mndhb alongside the IP networks.)
Notes:
Terminology
HACMP has quite a few special terms that are used repeatedly throughout the
documentation and the HACMP smit screens. Over the next few visuals we will discuss
some of the network related terminology in detail.
- node
An IBM system p server operating within an HACMP cluster
- node name
The name of a node from HACMP's perspective
- IP label
For TCP/IP networks, the name specified in the /etc/hosts file or by the Domain
Name Service for a specific IP address
In many configurations, HACMP nodes will have multiple NICs, and thus multiple IP
labels, but only one hostname. We will look at the relationship between hostname,
node name, and IP labels in the next visual.
In HACMP, IP labels are either service IP labels or non-service IP labels. We will
discuss this distinction in the next few visuals.
- IP network
A network that uses the TCP/IP family of protocols
- non-IP network or serial network
A point-to-point network, which does not rely on the TCP/IP family of protocols
- communication interface
A network connection onto an IP network (slightly better definition coming shortly)
- communication device
A port or device connecting a node to a non-IP network (slightly better definition
coming shortly)
Naming nodes
A node can have several names, including the AIX hostname, the
HACMP node name, and one of the IP labels. These concepts
should not be confused.
AIX hostname
# hostname
gastown
# uname -n
gastown
# /usr/es/sbin/cluster/utilities/get_local_nodename
vancouver
IP labels
# netstat -i
Name Mtu   Network    Address          Ipkts  Ierrs  Opkts  Oerrs  Coll
lo0  16896 link#1                      5338   0      5345   0      0
lo0  16896 127        localhost        5338   0      5345   0      0
lo0  16896 ::1                         5338   0      5345   0      0
tr0  1500  link#2     0.4.ac.49.35.58  76884  0      61951  0      0
tr0  1500  192.168.1  vancouverboot1   76884  0      61951  0      0
tr1  1492  link#3     0.4.ac.48.22.f4  476    0      451    13     0
tr1  1492  192.168.2  vancouverboot2   476    0      451    13     0
tr2  1492  link#4     0.4.ac.4d.37.4e  5667   0      4500   0      0
tr2  1492  195.16.20  db-app-svc       5667   0      4500   0      0
Notes:
Hostname
Each node within an HACMP cluster has a hostname associated with it that was
assigned when the machine was first installed onto the network. For example, a
hypothetical machine might have been given the name gastown.
IP labels
Each IP address used by an HACMP cluster almost certainly has an IP label associated with it. In non-HACMP systems, it is not unusual for the system's only IP label to be the same as the system's hostname. This is rarely a good naming convention within an HACMP cluster because there are just so many IP labels to deal with, and having to pick which one gets a name that is the same as a node's hostname is a pointless exercise.
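A naming convention that encodes node and role tends to work better. A hypothetical /etc/hosts fragment for the subnets used in this unit might look like this (the names and the service address are examples, not a required convention):

```
192.168.1.1   usa-boot1   # non-service label, node usa, en0
192.168.2.1   usa-boot2   # non-service label, node usa, en1
192.168.1.2   uk-boot1    # non-service label, node uk, en0
192.168.2.2   uk-boot2    # non-service label, node uk, en1
192.168.5.10  app-svc     # service label; moves with the resource group
```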
Communication Device:
A communication device refers to one end of a point-to-point
non-IP network connection, such as /dev/tty1, /dev/hdisk1 or
/dev/tmssa1.
Communication Adapter:
A communication adapter is an X.25 adapter used to support a
Highly Available Communication Link.
Notes:
HACMP network terminology
When using HACMP SMIT, it is important to understand the difference between
communication interfaces, devices and adapters:
- Communication interfaces:
Interfaces for IP-based networks
Note: The term communication interface in HACMP refers to more than just the
physical NIC. From HACMP's point of view, a communication interface is an object
defined to HACMP, which includes:
The logical interface (the name for the physical NIC), such as en0
The IP label / address
- Communication devices:
Devices for non-IP networks
- Communication adapters:
X.25 adapters
Notes:
More HACMP terminology
Another set of important terms are service, non-service, and persistent:
Note: In earlier versions of HACMP, the terms boot IP label and boot IP address were
used to refer to what is now being called non-service IP label / address. The older terms
still appear in a few places in the HACMP 5.x documentation.
Service interface
A communications interface configured with a service IP label / address (either by alias
or by replacement).
Non-service interface
A communications interface not configured with a service IP label / address. Used as a
backup for a service IP label / address.
Notes:
Network configuration rules for heartbeating
The visual shows some of the rules for configuring HACMP IP-based networks. These
are not quite the complete set of rules as we have not had a close enough look at IPAT
yet, and there are a few other issues still to be discussed. In particular, we will discuss
the rules for the service IP addresses later in the unit, when we discuss IPAT.
General rules
The primary purpose of these rules is to ensure that cluster heartbeating can reliably
monitor NICs, networks and nodes.
The two basic approaches are:
- Heartbeating over IP interfaces (the default)
- Heartbeating over IP aliases
In either case:
- HACMP requires that each node in the cluster have at least one direct, non-routed
network connection with every other node.
- Between cluster nodes, do not place intelligent switches, routers, or other network
equipment that do not transparently pass through UDP broadcasts and other
packets to all cluster nodes. Bridges, hubs, and other passive devices that do not
modify the packet flow may be safely placed between cluster nodes.
netmon
netmon, the network monitor portion of RSCT Topology Services, enables you to create
a configuration file that specifies additional network addresses to which ICMP ECHO
requests can be sent as an additional way to monitor interfaces. netmon is outside the
scope of this class. See the HACMP for AIX: Planning Guide for information on using
netmon.
Unmonitorable NICs
One final point: If no other mechanism has been configured into the cluster, HACMP
attempts to monitor an otherwise unmonitorable NIC by checking to see if packets are
arriving and being sent via the interface. This approach is not sufficiently robust to be relied upon; use heartbeating via IP aliases or netmon to get the job done right.
(Table: non-service IP address examples for node1 and node2. The first configuration, node1 with 192.168.5.1 and 192.168.6.1 and node2 with 192.168.5.2 and 192.168.6.2, puts each NIC on a node on a different subnet and is marked Yes (valid). The remaining rows of the visual show further address combinations, including the additional pair 192.168.7.1 and 192.168.8.1.)
Notes:
Examples
The visual shows some non-service IP address examples. We'll see the service IP address examples later, when we discuss IPAT.
(Figure: four nodes connected in a ring of RS232 networks: node1 to node2 via net_rs232_01, node2 to node3 via net_rs232_02, node3 to node4 via net_rs232_03, and node4 back to node1 via a fourth rs232 network.)
Notes:
Non-IP networks
Non-IP networks are point-to-point; that is, each connection between two nodes is considered a separate network and is given its own non-IP network label when it is created in HACMP.
For example, the visual shows four RS232 networks, in a ring configuration, connecting four nodes to provide full cluster non-IP connectivity.
Rules
The rules for non-IP networks are considerably simpler than the rules for IP networks
although they are just as important.
The basic rule is that you must configure enough non-IP networks to provide a non-IP
communication path, possibly via intermediate nodes, between every pair of nodes in
the cluster. In other words, every node must have a non-IP network connection to at
least one other node. Additional communication paths, such as the ring or mesh
topologies discussed in the visual, provide more robustness.
In addition, there are some considerations based on the type of non-IP network you are
using.
Disks that are RAID arrays, or subsets of RAID arrays, might have lower limits.
Check with the disk or disk subsystem manufacturer to determine the number of
seeks per second that a disk or disk subsystem can support. However, if you choose
to use a disk that has significant I/O load, increase the value for the timeout
parameter for the disk heartbeat network.
- When SDD is installed and the enhanced concurrent volume group is associated
with an active vpath device, ensure that the disk heartbeating communication device
is defined to use the /dev/vpath device (rather than the associated /dev/hdisk
device).
- If a shared volume group is mirrored, at least one disk in each mirror should be used
for disk heartbeating.
This is particularly important if you plan to set the forced varyon option for a resource
group.
[Figure: nodes n1, n2, and n3 sharing three multi-node disk heartbeat networks (MNDHB 1, MNDHB 2, MNDHB 3) over logical volumes lv1, lv2, and lv3]
Notes:
Fencing
When a cluster partition occurs, HACMP determines the losing side and fences those
nodes away from the shared storage.
Fencing uses the same mechanism that LVM uses when quorum is lost on a mirrored volume
group with quorum enabled: access to the disks is blocked and any further I/O attempts fail.
Note that the VG does not have to be defined with quorum and mirroring.
The losing side is determined by a simple quorum calculation: a node must have access
to at least one more than half of the disks.
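The quorum calculation can be sketched as follows (a hypothetical helper for illustration, not an HACMP command): with N heartbeat disks, a node keeps quorum only if it can access more than half of them.

```shell
#!/bin/sh
# "At least one more than half": accessible * 2 > total is the same test.
has_quorum() {
  # $1 = heartbeat disks this node can access, $2 = total heartbeat disks
  if [ $(( $1 * 2 )) -gt "$2" ]; then echo yes; else echo no; fi
}
has_quorum 2 3   # sees 2 of 3 disks -> yes
has_quorum 1 3   # sees 1 of 3 disks -> no
```

This is why an odd number of heartbeat disks (three in the visual) gives a clean decision: exactly one side of a partition can see two of the three.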
Notes:
Rationale
In earlier releases of HACMP, the only way to guarantee that a known IP address would
always be available on each node for administrative purposes was to configure a
separate network, which was never used for IPAT. Such a configuration limits the
usefulness of the administrative network because the loss of that network adapter
would result in an inability to reach the node for administrative purposes.
Additionally, there are applications and functions that require a reliable address that is
used to reach a specific node, one that does not move from one node to another. GLVM
is one AIX function that requires an address to be bound to a node but kept highly
available amongst adapters on that network. Applications such as Tivoli Management
Region (TMR) require that a static IP address be assigned to each node that they manage.
This is accomplished through a persistent address.
Persistent IP labels
As an optional network component, users can configure persistent node IP labels.
These are IP aliases that are configured on a node and kept available as long as at
least one communication interface remains active on the associated network.
Persistent IP labels can be used with IPAT. Persistent IP labels do not move from node
to node as part of IPAT, but they will move to another interface on the same node in
the event of an adapter failure.
3. True or False?
Persistent node IP labels are not supported for IPAT via IP
replacement.
4. True or False?
There are no exceptions to the rule that, on each node, each NIC on
the same LAN must have an IP address in a different subnet.
Notes:
Notes:
Topic 3 objectives
This section explains how to configure both variants of IP Address Takeover.
IP Address Takeover
Each highly available application is likely to require its own IP
address (called a service IP address).
This service IP address is placed in the application's resource
group.
[Figure: a resource group containing a volume group, file system, NFS exports/mounts, service IP label, and application server]
HACMP is responsible for ensuring that the service IP address is available on the node
currently responsible for the resource group.
Notes:
Service IP address
Most highly available applications work best, from the user's perspective, if the
application's IP address never changes. This capability is provided by HACMP using a
feature called IP Address Takeover. An IP address is selected that is associated with
the application. This IP address is called a service IP address because it is used to
deliver a service to the user. It is placed in the application's resource group. HACMP
then ensures that the service IP address is kept available on whichever node the
resource group is currently on. The process of moving an IP address to another NIC, or
to a NIC on another node, is called IP address takeover (IPAT).
for which the client software can be configured to check multiple IP addresses when it is
looking for the server.
Also, IPAT is not supported for resource groups configured with a Startup Policy of
Online on All Available Nodes (concurrent access), because the application in such a
resource group is active on all the nodes that are currently up. Consequently, clients of a
concurrent access resource group must be capable of finding their server by checking
multiple IP addresses.
Notes:
IPAT via IP aliasing
IPAT via IP aliasing takes advantage of AIX's ability to have multiple IP addresses
associated with a single NIC. This ability, called IP aliasing, allows HACMP to move
service IP addresses between NICs (or between nodes) without having to change
existing IP addresses on NICs or worry about whether there is already a service
IP label on the NIC.
Which is better?
We will examine the advantages and disadvantages of each method in the next few
pages. Remember that the question is not which is better but rather which is better
suited to a particular context.
[Figure: non-service addresses defined in the ODM — 192.168.11.1, 192.168.10.2, and 192.168.11.2]
* Refer to earlier discussion of heartbeating and failure diagnosis for explanation of why
Copyright IBM Corporation 2008
Notes:
Requirements
Before configuring an HACMP network to use IPAT via IP aliasing, ensure that:
- The network is a type that supports IPAT via IP aliasing:
Ethernet
Token-ring
FDDI
SP switch
subnet          NIC   IP label    IP address
192.168.10/24   en0   n1boot1     192.168.10.1
192.168.11/24   en1   n1boot2     192.168.11.1
192.168.10/24   en0   n2boot1     192.168.10.2
192.168.11/24   en1   n2boot2     192.168.11.2
9.47.87/24      --    appA-svc    9.47.87.22
9.47.88/24      --    appB-svc    9.47.88.22

subnet          IP labels
192.168.10/24   n1boot1, n2boot1
192.168.11/24   n1boot2, n2boot2
9.47.87/24      appA-svc
9.47.88/24      appB-svc
Planning considerations
A node on a network that uses IPAT via aliasing can be the primary node for multiple
resource groups on the same network, regardless of the number of actual boot
interfaces on the node. Still, users should plan their networks carefully to balance the
RG load across the cluster.
Consequently, any load balancing is the responsibility of the cluster administrator (and
will require customization, which is beyond the scope of this course).
[Figure: the service IP label 9.47.87.22 aliased onto an interface alongside its ODM-defined non-service address; the interfaces keep their ODM addresses 192.168.10.1, 192.168.11.1, 192.168.10.2, and 192.168.11.2]
Notes:
Operation
HACMP uses AIX's IP aliasing capability to alias service IP labels included in resource
groups onto interfaces (NICs) on the node that runs the resource group. With aliasing,
the non-service IP address (stored in the ODM) remains present on the interface.
Note that an advantage (of sorts) of IPAT via IP aliasing is that the non-service IP
addresses do not need to be routable from the client/user systems.
[Figure: after an interface failure, the 9.47.87.22 alias has been moved to the surviving interface on the same node; the ODM addresses 192.168.10.1, 192.168.11.1, 192.168.10.2, and 192.168.11.2 are unchanged]
Notes:
Interface failure
If a communication interface fails, HACMP moves the service IP addresses to another
still-available communication interface on the same network. If no available NICs
remain on the node for that network, then HACMP initiates a fallover of that
resource group.
Users perspective
Because existing TCP/IP sessions generally recover cleanly from this sort of
failure/move-IP-address operation, users might not even notice the outage if they are
not interacting with the application at the time of the failure.
[Figure: after a node failure, the 9.47.87.22 alias is now on a takeover-node interface, alongside that node's ODM addresses 192.168.10.2 and 192.168.11.2]
Notes:
Node failure with IPAT
When a node that is running an IPAT-enabled resource group fails, HACMP moves the
resource group to an alternative node. Because the service IP address is in the
resource group, it moves with the rest of the resources to the new node. The service IP
address is aliased onto an available (currently functional) communication interface on
the takeover node.
Four choices:
Anti-Collocation
Collocation
Collocation with Persistent Label
Anti-Collocation with Persistent Label
Figure 2-34. IPAT via IP aliasing: Distribution preference for service IP label aliases
Notes:
Distribution preference for service IP label aliases
You can configure a distribution preference for the placement of service IP labels that
are configured in HACMP. Starting with HACMP 5.1, HACMP lets you specify the
distribution preference for the service IP label aliases.
A distribution preference for service IP label aliases is a network-wide attribute used to
control the placement of the service IP label aliases on the communication interfaces on
the nodes in the cluster. Configuring a distribution preference for service IP label aliases
provides:
- Load balancing:
Enables you to customize the load balancing for service IP labels in the cluster,
taking into account the persistent IP labels previously assigned on the nodes.
- VPN requirements:
Enables you to configure the type of the distribution preference suitable for the VPN
firewall external connectivity requirements.
HACMP will try to meet preferences, but will always keep service labels active:
The distribution preference is exercised as long as there are acceptable network
interfaces available. However, HACMP always keeps service IP labels active, even if
the preference cannot be satisfied.
How to configure
You use the extended path to configure distribution preferences. Follow this path:
smitty hacmp -> Extended Configuration -> Extended Resource Configuration ->
HACMP Extended Resources Configuration -> Configure Resource Distribution
Preferences -> Configure Service IP Labels/Address Distribution Preference -> pick
your network -> toggle through the Distribution Preference menu options.
Notes:
Summary
The visual summarizes IPAT via IP aliasing. Some additional considerations are
discussed as follows.
Advantages
Probably the most significant advantage of IPAT via IP aliasing is that it supports
multiple service IP labels per network per resource group on the same communication
interface, allowing a node to easily support quite a few resource groups. In other
words, IPAT via IP aliasing enables you to share one interface among several service
labels. Thus, it can require fewer adapters and interfaces than IPAT via replacement.
Disadvantages
Probably the most significant disadvantage is that IPAT via IP aliasing does not support
hardware address takeover; you must rely on gratuitous ARP as the means of updating
the ARP entries when IPAT occurs.
In addition, because you must have a subnet for each interface and a subnet for each
service IP label, IPAT via IP aliasing can require a lot of subnets.
Advantages
Supports hardware address takeover
Requires fewer subnets
Disadvantages
Requires more interfaces to support multiple service IP labels
Is less flexible
Notes:
History
In the beginning, IPAT via IP replacement was the only form of IPAT available. IPAT via
IP aliasing became available when AIX gained support for multiple IP addresses
associated with a single NIC via IP aliasing. Because IPAT via IP aliasing is more
flexible and usually requires fewer network interface cards, IPAT via IP replacement
is no longer the recommended method. Many existing cluster implementations still use
IPAT via Replacement. When upgrading to versions of HACMP that support IPAT via
Aliasing, consider converting an IPAT via Replacement configuration to Aliasing only
if there is a compelling reason to do so; otherwise, leave the IPAT via Replacement
configuration as it is. Any new implementations should strongly consider using IPAT
via Aliasing.
This visual gives a brief overview of IPAT via IP replacement. A detailed discussion can
be found in Appendix C.
Configuration rules
The visual summarizes the configuration rules. Notice that they are almost the opposite
to the rules for IPAT via IP aliasing.
Advantages
Probably the most significant advantage of IPAT via IP replacement is that it supports
hardware address takeover (HWAT). HWAT may be needed if your local clients or
routers do not support gratuitous ARP. This will be discussed in a few pages.
Another advantage is that it requires fewer subnets. If you are limited in the number of
subnets available for your cluster, this may be important.
Note: If reducing the number of subnets needed is important, another alternative may
be to use heartbeating via aliasing, see Heartbeating over IP aliases on page 2-37.
Disadvantages
Probably the most significant disadvantage is that IPAT via IP replacement limits you
to one service IP label per subnet per resource group on a communications interface,
which makes it rather expensive (and complex) to support many resource groups in a
small cluster. In other words, you need more network adapters to support more
applications.
[Table: service IP address examples. For each row of non-service addresses on the two nodes (for example, 192.168.5.1 and 192.168.6.1 on the first node, 192.168.5.2 and 192.168.6.2 on the second), the valid IPAT via Replacement service addresses share one of those subnets (for example, 192.168.5.98, 192.168.6.171, 192.168.5.3, 192.168.5.97), while the valid IPAT via Aliasing service addresses lie in different subnets (for example, 192.168.7.1, 192.168.8.1, 192.168.183.57, 198.161.22.1).]
Notes:
Service IP address rules and examples
The rules for service IP addresses are straightforward. It comes down to which subnet
the service IP address can be in.
For IPAT via Replacement, the service IP addresses must be in a subnet that is the
same as one of the non-service IP address subnets.
For IPAT via Aliasing, the service IP addresses must be in a subnet that is different
than the non-service IP address subnets.
The table above provides some examples. Notice that for a given set of IP addresses
on the interfaces (AIX ODM), service IP labels which are acceptable for IPAT via IP
aliasing are not acceptable for IPAT via replacement and vice-versa. Also notice that the
IPAT via Replacement column only contains subnets that are the same as the subnets
in the first two columns, while the IPAT via Aliasing column contains only subnets that
are different than the subnets in the first two columns.
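The subnet rules can be illustrated with a small sketch. It assumes /24 netmasks, as in the examples in this unit, and the boot subnets and addresses below are taken from the earlier examples; it is not an HACMP tool:

```shell
#!/bin/sh
# /24 prefixes of the non-service (boot) IP addresses on the nodes.
boot_subnets="192.168.5 192.168.6"

classify() {
  # Compare the candidate's /24 prefix against every boot subnet.
  prefix=$(echo "$1" | cut -d. -f1-3)
  for s in $boot_subnets; do
    if [ "$prefix" = "$s" ]; then
      echo "valid for IPAT via replacement"
      return
    fi
  done
  echo "valid for IPAT via aliasing"
}

classify 192.168.5.98   # same subnet as a boot IP -> replacement
classify 192.168.7.1    # different subnet from all boot IPs -> aliasing
```

The same address can never be valid for both methods, which matches the table: the Replacement column reuses the boot subnets, the Aliasing column never does.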
Persistent IP labels should include the node name (because they will not
be moved to another node) and should identify that they are persistent:
for example: usa-per, uk-per, node1adm, usaadmin
Why?
Conventions prevent mistakes
Preventing mistakes improves availability!
Notes:
Using IP labeling and naming conventions
Again, the purpose of HACMP is to create a highly available environment for your
applications. A naming convention can make it easier for humans to understand the
configuration. This can reduce mistakes, leading to better availability.
Never underestimate the value of a consistent labeling or naming convention. It can
prevent mistakes which can, in turn, prevent outages.
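A convention like this can even be checked mechanically. In the sketch below, the suffix rules are an example of one possible site convention (matching the labels shown in the visual), not an HACMP requirement:

```shell
#!/bin/sh
# Classify an IP label by the naming convention it follows (if any).
check_label() {
  case "$1" in
    *-per)      echo "persistent" ;;
    *-svc)      echo "service" ;;
    *boot[0-9]) echo "boot" ;;
    *)          echo "unknown: rename to match the convention" ;;
  esac
}
check_label usa-per    # -> persistent
check_label xweb-svc   # -> service
check_label n1boot1    # -> boot
```

Running such a check before synchronizing the cluster catches typos in labels while they are still cheap to fix.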
Hostname resolution
All of the cluster's IP labels must be defined in every cluster
node's /etc/hosts file:
127.0.0.1       loopback localhost
# cluster explorers
# netmask 255.255.255.0
# uk boot addresses
192.168.15.31   ukboot1
192.168.16.31   ukboot2
# persistent IP labels
192.168.5.29    usa-per
192.168.5.31    uk-per
# Service IP labels
192.168.5.92    xweb-svc
192.168.5.70    yweb-svc
Notes:
/etc/hosts
Make sure that the /etc/hosts file on each cluster node contains all of the IP labels used
by the cluster (you do not want HACMP to be in a position where it must rely on an
external DNS server to do IP label-to-address mappings).
NSORDER = local
As a result, the /etc/hosts file of each cluster node must contain all HACMP-defined IP
labels for all cluster nodes.
Maintaining /etc/hosts
The easiest way to ensure that all of the /etc/hosts files contain all of the required
addresses is to get one /etc/hosts file set up correctly and then copy it to all of the other
nodes, or to use the file collections facility of HACMP 5.x.
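The copy-and-compare approach can be sketched as follows. The files here are local stand-ins for the /etc/hosts copies on two nodes (in practice you would fetch each node's file, or let HACMP file collections do the distribution for you), and the entries are examples:

```shell
#!/bin/sh
# Build a reference hosts file, then verify a second node's copy matches it.
cat > hosts.node1 <<'EOF'
192.168.15.31 ukboot1
192.168.16.31 ukboot2
192.168.5.92  xweb-svc
EOF
cp hosts.node1 hosts.node2          # simulate a correctly copied file

if diff hosts.node1 hosts.node2 >/dev/null; then
  echo "/etc/hosts copies are consistent"
else
  echo "WARNING: /etc/hosts copies differ"
fi
```

A non-empty diff before starting Cluster Services is a cheap way to catch a node that was missed during an address change.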
[Figure: an EtherChannel configuration — physical interfaces en0 through en3 connected through switches sw1 and sw2 and grouped into channel interfaces en4/en5 on each node; shared storage with heartbeat on disk; service addresses appA 192.168.3.10 and appB 192.168.3.20]
Notes:
Etherchannel details
EtherChannel is a trunking technology that allows several Ethernet links to be grouped
together. Traffic is distributed across the links, providing higher performance and
redundant parallel paths. When a link fails, traffic is redirected to the remaining links
within the channel without user intervention and with minimal packet loss.
EtherChannel was invented by Kalpana in the early 1990s; Kalpana was acquired by Cisco
in 1994. Other popular trunking technologies exist, such as Adaptec's Duralink trunking
and Nortel's MultiLink Trunking (MLT).
Interoperability between technologies is a problem.
The IEEE 802.3ad standard was finalized in 2000.
Notes:
Very useful information can be found in the following documents (although dated, the
information is still very relevant).
A Techdoc regarding experiences and configuration:
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101785
[Figure: Frame1 and Frame2, each with dual VIO Servers. On each VIOS, physical adapters ent0 and ent1 (phy) are link-aggregated into ent3 (LA), which backs the Shared Ethernet Adapter ent4 (SEA); ent2 (virt) bridges through the Hypervisor to the client LPAR's en0, and ent5 (virt) carries the SEA control channel. The frames connect through two Ethernet switches.]
Notes:
Where to get more information
To get more information on this configuration, consult the following resources:
Redbooks:
Implementing HACMP Cookbook
http://publib-b.boulder.ibm.com/abstracts/sg246769.html?Open
Advanced POWER Virtualization on IBM System p5: Introduction and Configuration
http://www.redbooks.ibm.com/abstracts/sg247940.html?Open
Other education:
AU620, HACMP System Administration III: Virtualization and Disaster Recovery
[Figure: HACMP Node 1 in FRAME 1 and HACMP Node 2 in FRAME 2. Each node's en0 has a base address (192.168.100.1 and 192.168.100.2), a persistent IP (9.19.51.10 and 9.19.51.11), and a service IP (9.19.51.20 and 9.19.51.21). Topsvcs heartbeating runs over the IP network plus a non-IP network, serial_net1. Each frame (FRAME X) provides Ethernet through dual VIO Servers: ent0/ent1 (phy) aggregated into ent3 (LA) backing ent4 (SEA), with virtual adapters ent2/ent5 and a control channel.]
Notes:
Additional information
Single-adapter Ethernet networks in HACMP require the use of a netmon.cf file.
Note that there does not have to be link aggregation at the VIO Server level:
you could configure a single NIC and rely on the other VIO Server for redundancy.
Notes:
Single IP adapter nodes
It is not unusual for a customer to try to implement an HACMP cluster in which one or
more of the cluster nodes has only a single network adapter (the motivation is usually
the cost of the adapter, but the additional cost of a backup system with enough PCI slots
for the second adapter can also be the issue).
The situation is actually quite simple: with the exception of virtual Ethernet
implementations and certain Cluster 1600 clusters that use the SP Switch facility, any
cluster with only one NIC on a node for a given network has a single point of failure
(the solitary NIC) and is not supported.
Nodes with only a single NIC on an IP network are, at best, a false economy. At worst,
they are a fiasco waiting to happen, as the lack of a second NIC on one or more of the
nodes can lead to extended cluster outages and generally strange behavior
(including HACMP failing to detect failures that would have been detected had all
nodes had at least two NICs per IP network).
Notes:
Getting IP addresses and subnets
Unless you happen to be the network administrator (in which case, you can feel free to
spend time talking to yourself), you need to get the network administrator to provide you
with IP addresses for your cluster. The requirements imposed by HACMP on IP
addresses are rather unusual and might surprise your network administrator; so be
prepared to explain both what you want and why you want it. Also, ask for what you
want well in advance of the date that you need it because it might take some time for
the network administrator to find addresses and subnets for you that meet your needs.
Do not accept IP addresses that do not meet the HACMP configuration rules. Even if
you can get them to appear to work, they almost certainly will not work at a point in time
when you can least afford a problem.
/etc/inittab
    /sbin/rc.boot
        cfgmgr
        /etc/rc.net (modified for IPAT; exits 0)
    /etc/rc
        mount all
    /usr/sbin/cluster/etc/harc.net
        /etc/rc.net -boot
            cfgif
    <Cluster Services startup> clstrmgr
        event node_up
            node_up_local
                get_disk_vg_fs
                acquire_service_addr
                    telinit -a
                        /etc/rc.tcpip (daemons start)
                        /etc/rc.nfs (daemons start, exportfs)
Notes:
/etc/inittab changes
A node with a network configured for IPAT must not start inetd until HACMP has had a
chance to assign the appropriate IP addresses to the node's interfaces. Consequently,
the AIX start sequence is modified slightly if a node has a resource group that uses
either form of IPAT.
Changes to /etc/inittab
init:2:initdefault:
brc::sysinit:/sbin/rc.boot 3 >/dev/console 2>&1 # Phase 3 of system boot
. . .
srcmstr:23456789:respawn:/usr/sbin/srcmstr # System Resource Controller
harc:2:wait:/usr/es/sbin/cluster/etc/harc.net # HACMP for AIX network startup
rctcpip:a:wait:/etc/rc.tcpip > /dev/console 2>&1 # Start TCP/IP daemons
rcnfs:a:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons
. . .
qdaemon:a:wait:/usr/bin/startsrc -sqdaemon
writesrv:a:wait:/usr/bin/startsrc -swritesrv
. . .
ctrmc:2:once:/usr/bin/startsrc -s ctrmc > /dev/console 2>&1
ha_star:h2:once:/etc/rc.ha_star >/dev/console 2>&1
dt:2:wait:/etc/rc.dt
cons:0123456789:respawn:/usr/sbin/getty /dev/console
xfs:0123456789:once:/usr/lpp/X11/bin/xfs
hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1
clinit:a:wait:/bin/touch /usr/es/sbin/cluster/.telinit # HACMP for AIX: must be one of the last run-level "a" entries in inittab
pst_clinit:a:wait:/bin/echo Created /usr/es/sbin/cluster/.telinit > /dev/console # HACMP for AIX: must be one of the last run-level "a" entries in inittab
Notes:
HACMP 5.x changes to /etc/inittab
The visual shows excerpts from /etc/inittab on a system running AIX 6.1 and HACMP
5.4.1.
HACMP 5.1 added the harc entry to the /etc/inittab file, which runs harc.net to
configure the network interfaces. Also, starting in HACMP 5.1, some of the other inittab
entries were changed to run in run level a. These are invoked by HACMP when it
is ready for the TCP/IP daemons to run. The final two lines use the touch command to
create a marker file when all of the run-level a items have been run. HACMP waits for
this marker file to exist so that it knows when the run-level a items have been
completed.
HACMP 5.3 made some additional changes to the inittab file. In HACMP 5.3 and later,
the HACMP daemons run all the time, even before you start the cluster. These
daemons are started by the ha_star and hacmp entries in the inittab file.
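The marker-file handshake can be sketched as follows. The marker path is simulated locally here; the real file is /usr/es/sbin/cluster/.telinit, created by the clinit inittab entry:

```shell
#!/bin/sh
# Simulate the run-level "a" handshake between inittab and HACMP.
marker=./.telinit
rm -f "$marker"

# Stand-in for the clinit entry: the last run-level "a" action touches the marker.
( /bin/touch "$marker" ) &

# Stand-in for HACMP: wait for the marker so we know the run-level "a"
# entries (rc.tcpip, rc.nfs, and so on) have all completed.
while [ ! -e "$marker" ]; do
  sleep 1
done
echo "run-level a entries complete"
```

This is why the clinit and pst_clinit lines must be the last run-level "a" entries: anything listed after them would start after HACMP has already concluded the handshake is done.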
Notes:
Configuration problems
The visual shows some common IP configuration errors to watch out for.
2. True or False?
All networking technologies supported by HACMP support IPAT via IP aliasing.
3. True or False?
All networking technologies supported by HACMP support IPAT via IP replacement.
4. If the left node has NICs with the IP addresses 192.168.20.1 and 192.168.21.1 and the
right hand node has NICs with the IP addresses 192.168.20.2 and 192.168.21.2, then
which of the following options are valid service IP addresses if IPAT via IP aliasing is
being used? (Select all that apply.)
a. (192.168.20.3 and 192.168.20.4) or (192.168.21.3 and 192.168.21.4)
b. 192.168.20.3 and 192.168.20.4 and 192.168.21.3 and 192.168.21.4
c. 192.168.22.3 and 192.168.22.4
d. 192.168.23.3 and 192.168.24.3
5. If the left node has NICs with the IP addresses 192.168.20.1 and 192.168.21.1 and the
right hand node has NICs with the IP addresses 192.168.20.2 and 192.168.21.2, then
which of the following options are valid service IP addresses if IPAT via IP replacement is
being used? (Select all that apply.)
a. (192.168.20.3 and 192.168.20.4) or (192.168.21.3 and 192.168.21.4)
b. 192.168.20.3, 192.168.20.4, 192.168.21.3 and 192.168.21.4
c. 192.168.22.3 and 192.168.22.4
d. 192.168.23.3 and 192.168.24.3
Notes:
Notes:
Topic 4 objectives
This section looks at the impact of IPAT on client systems.
Notes:
What users see
Users who are actively using the cluster's services at the time of a failure will notice an
outage while HACMP detects, diagnoses, and recovers from the failure. The length of the
outage depends on:
i. How long it takes HACMP to detect the failure
ii. How long it takes HACMP to diagnose the failure (determine what failed)
iii. How long it takes HACMP to recover from the failure
The first two of these generally take between about five and about thirty seconds,
depending on the exact failure involved. The third component can take another dozen
or so seconds when moving an IP address within a node, or it can take a few minutes or
more in the case of a fallover.
[Figure: two NICs on one node — 192.168.11.1 (ODM) with MAC 00:04:ac:48:22:f4 and 192.168.10.1 (ODM) with MAC 00:04:ac:62:72:49]
Notes:
ARP cache issues
Client systems that are located on the same physical network as the cluster might find
that their ARP cache entries are obsolete after an IP address moves to another NIC (on
the same node or on a different node).
The ARP cache is a table of IP addresses and the network hardware addresses (MAC
addresses) of the physical network cards to which the IP addresses are assigned. When
an IP address moves to a different physical network card, the client's ARP cache might
still have the old MAC address. It could take the client system a few minutes to realize
that its ARP cache is out-of-date and ask for an updated MAC address for the server's
IP address.
[Figure: ARP caches before and after IPAT. The client (192.168.8.3, MAC 00:04:ac:27:18:09) sits behind a router (192.168.8.1, MAC 00:04:ac:42:9c:e2) and keeps an ARP entry only for the router. The router's ARP cache maps xweb (192.168.5.1) to the MAC of the NIC currently holding the alias; when the alias moves from the NIC with MAC 00:04:ac:62:72:49 to the NIC with MAC 00:04:ac:48:22:f4, the router's entry for xweb is stale until it is updated.]
Notes:
ARP cache entries are always local
ARP cache entries are maintained by a system only for the physical network cards with
which it communicates directly. If there is a router between the client system and the
cluster, then the client system's ARP cache has an entry for the IP address and MAC
address of the router's network interface located on the client's side of the router. No
amount of IP address moves or node fallovers has any impact (positive or negative) on
what needs to be in the client's ARP cache.
Rather, it is the ARP cache entries of the router on the cluster's network that must be
kept up-to-date.
Most clusters have either a small handful of client systems on the same physical
network as the cluster, or none at all. Consequently, whatever ARP cache issues might
exist in a particular configuration, they do not usually affect very many systems. It is
the ARP cache entries of the routers that must be considered.
Gratuitous ARP
AIX supports a feature called gratuitous ARP.
AIX sends out a gratuitous (that is, unrequested) ARP update whenever an IP address is
set or changed on a NIC.
Notes:
Gratuitous ARP
AIX supports a feature called gratuitous ARP. Whenever an IP address associated with
a NIC changes, AIX broadcasts a gratuitous (in other words, unsolicited) ARP
update. This gratuitous ARP packet is generally received and used by all systems on
the cluster's local physical network to update their ARP cache entries.
The result is that all relevant ARP caches are updated almost immediately after the IP
address is assigned to the NIC.
The problem is that not all systems respond to, or even necessarily receive, these
gratuitous ARP cache update packets. If a local system either does not receive or
ignores the gratuitous ARP packet, then its ARP cache remains out-of-date.
Note that unless the network is very overloaded, local systems generally either always
or never act upon the gratuitous ARP update packet.
Notes:
Gratuitous ARP issues
Not all network technologies provide the appropriate capabilities to implement
gratuitous ARP. In addition, operating systems that implement TCP/IP are not required
to respect gratuitous ARP packets (although practically all modern operating systems
do).
Finally, support issues aside, an extremely overloaded network or a network that is
suffering intermittent failures might result in gratuitous ARP packets being lost. (A
network that is sufficiently overloaded to be losing gratuitous ARP packets, or that is
suffering intermittent failures with the same result, is likely to be causing the cluster
and the cluster administrator far more serious problems than the ARP cache issue.)
Suggestion:
Do not get involved with using either clinfo or HWAT to deal with
ARP cache issues until you have verified that there actually are
ARP issues that need to be dealt with.
Notes:
If gratuitous ARP is not supported
HACMP supports three alternatives to gratuitous ARP. We will discuss these in the next
few pages.
[Figure: a client with boot address 192.168.10.1 (MAC 00:04:ac:62:72:49) running clinfo; on the cluster, clstrmgr feeds snmpd, and clinfo on the client receives the SNMP information and invokes clinfo.rc]
Notes:
clinfo on the client
The cluster information service (clinfo) may be run on any client system. clinfo can
execute a script that flushes the local ARP cache and pings the servers following a
failure. clinfo can detect the failure either by polling or by receiving SNMP traps from
within the cluster.
The clinfo source code is provided with HACMP so that it can, at least in theory, be
ported to non-AIX client operating systems.
[Figure: clinfo running on a cluster node — clstrmgr and snmpd notify clinfo, whose clinfo.rc script pings the client systems to update their ARP caches]
Notes:
clinfo on the cluster nodes
clinfo is already compiled and ready to run on the cluster's servers. Once again,
clinfo can execute a script on the servers that flushes the local ARP cache and pings
the local clients. These in-bound ping packets contain the new IP address-to-MAC
address relationship and are used by the client operating system to update its ARP
cache. Unfortunately, this is not a mandatory feature of TCP/IP, so it is possible
(although rather unusual) that a client operating system might fail to update its ARP
cache when the ping packet arrives.
Notes:
clinfo.rc
The clinfo.rc script must be edited manually on the cluster nodes that run clinfo.
There is no reason why clinfo cannot also be run on the client systems; however,
these changes are required only on the cluster nodes that are running clinfo.
Remember: all the cluster nodes should be running clinfo if clinfo is being used
within the cluster to deal with ARP cache issues (because you never know which
cluster nodes will survive whatever has gone wrong).
Edit the /usr/es/sbin/cluster/etc/clinfo.rc file on each server node. Add the IP
label or IP address of each system that accesses service IP addresses managed by
HACMP to the PING_CLIENT_LIST list. Then start the clinfo daemon (clinfo can be
started as part of starting Cluster Services on the cluster nodes).
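A minimal sketch of the PING_CLIENT_LIST convention (the addresses below are invented; the real variable is set inside /usr/es/sbin/cluster/etc/clinfo.rc):

```shell
# Hypothetical client list in the style of clinfo.rc's PING_CLIENT_LIST
PING_CLIENT_LIST="192.168.10.20 client2.example.com"

# Count and report the entries the script would ping after a recovery event
count=0
for client in $PING_CLIENT_LIST; do
  count=$((count + 1))
  echo "would ping $client"
done
echo "$count clients listed"
```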
/etc/cluster/ping_client_list
You can also provide the list of clients to be pinged in the file
/etc/cluster/ping_client_list. This is probably the best method, as it ensures that the
list of clients to ping is not overwritten by future updates to clinfo.rc.
More details
This script is invoked by HACMP as follows:
clinfo.rc {join,fail,swap} interface_name
The next set of details likely will not make sense until we are further into the course.
When clinfo is notified that the cluster is stable after undergoing a failure recovery of
some sort, or when clinfo first connects to clsmuxpd (the SNMP part of HACMP), it
receives a new map (a description of the cluster's state). It checks for changed states of
interfaces:
- If a new state is UP, clinfo calls clinfo.rc join interface_name.
- If a new state is DOWN, clinfo calls clinfo.rc fail interface_name.
- If clinfo receives a node_down_complete event, it calls clinfo.rc with the fail
parameter for each interface currently UP.
- If clinfo receives a fail_network_complete event, it calls clinfo.rc with the
fail parameter for all associated interfaces.
- If clinfo receives a swap_complete event, it calls clinfo.rc swap
interface_name.
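The mapping above can be condensed into a small sketch (this is not HACMP's code; it only mirrors the documented call convention):

```shell
# Translate a changed interface state into the corresponding clinfo.rc invocation
dispatch() {
  state="$1"; interface="$2"
  case "$state" in
    UP)   echo "clinfo.rc join $interface" ;;
    DOWN) echo "clinfo.rc fail $interface" ;;
    SWAP) echo "clinfo.rc swap $interface" ;;
    *)    echo "no clinfo.rc call for $state" ;;
  esac
}

dispatch UP en0     # prints: clinfo.rc join en0
dispatch DOWN en1   # prints: clinfo.rc fail en1
```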
Notes:
Hardware address takeover
Hardware Address Takeover is the most robust method of dealing with the ARP cache
issue, as it ensures that the hardware address associated with the service IP address
does not change (which avoids the whole issue of whether the client system's ARP
cache is out of date).
The essence of HWAT is that the cluster configurator designates a hardware address
that is to be associated with a particular service IP address. HACMP then ensures that
whichever NIC the service IP address is on also has the designated hardware address.
HWAT is discussed in detail in Appendix C.
HWAT considerations
Remember the following points when contemplating HWAT:
- The hardware address that is associated with the service IP address must be unique
within the physical network that the service IP address is configured for.
- HWAT is not supported with IPAT via IP aliasing, because each NIC can have more
than one IP address but only one hardware address.
- HWAT is only supported for Ethernet, token ring, and FDDI networks (MCA FDDI
network cards do not support HWAT). ATM networks do not support HWAT.
- HWAT increases the takeover time (usually by just a few seconds).
- HWAT is an optional capability that must be configured into the HACMP cluster. (We
will see how to do that in detail in a later unit.)
- Cluster nodes using HWAT on token ring networks must be configured to reboot
after a system crash as the token ring card will continue to intercept packets for its
hardware address until the node starts to reboot.
Checkpoint
1. True or False?
Clients are required to exit and restart their application after a
fallover.
2. True or False?
All client systems are potentially directly affected by the ARP cache
issue.
3. True or False?
clinfo must not be run both on the cluster nodes and on the
client systems.
Unit summary (1 of 2)
Key points from this unit:
HACMP uses networks to:
Provide highly available client access to applications in the cluster
Detect and diagnose NIC, node, and network failures using RSCT heartbeats
Communicate with HACMP daemons on other nodes
Unit summary (2 of 2)
Key points from this unit (continued):
HACMP has very specific requirements for subnets.
IPAT via aliasing
NICs on a node must be on different subnets, which must use the same subnet
mask.
There must be at least one subnet in common with all nodes.
Service addresses must be on a different subnet than any non-service address.
A service address can be on the same subnet as another service address.
IPAT via replacement
NICs on a node must be on different subnets, which must use the same subnet
mask.
Each service address must be in the same subnet as one of the non-service addresses
on the highest priority node.
Multiple service addresses must be in the same subnet.
Heartbeating over IP alias (any form of IPAT)
Service and non-service addresses can coexist on the same subnet, or be on
separate subnets.
One subnet required for heartbeating; does not need to be routed.
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1:
Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
http://www-03.ibm.com/systems/p/library/hacmp_docs.html
HACMP manuals
http://www-03.ibm.com/servers/storage
http://www.redbooks.ibm.com
Unit objectives
After completing this unit, you should be able to:
Discuss the shared storage concepts that apply within an
HACMP cluster
Describe the capabilities of various disk technologies as they
relate to HACMP clusters
Describe the shared storage related facilities of AIX and how
to use them in an HACMP cluster
(Figure: two nodes, each with private rootvg disks, sharing SAN storage presented as virtual SCSI disks via a VIO Server.)
Notes:
Application storage requirements
A computer application always requires at least a certain amount of disk storage space.
For example, even the most minimal application requires disk space to store the
application's binaries. Most applications also require storage space for configuration
files and for whatever application data the application is responsible for.
When such an application is placed into a high-availability cluster, any of the
application's data that changes must be stored in a location that is accessible to
whichever node the application is currently running on. This storage that is accessible
to multiple nodes is called shared storage.
Also keep in mind that HACMP does not provide data redundancy. Data must be striped
or mirrored across multiple physical drives (generally presented to AIX as a LUN), and
access to those LUNs from each node should be over multiple paths (generally referred
to as multi-pathing). This most likely will result in the use of a shared storage device that
provides the striping or mirroring and the multi-pathing software. These components must
be checked for compatibility with HACMP at the level you intend to implement, both the
AIX and HACMP levels.
Copyright IBM Corp. 1998, 2008
Non-concurrent access
In a non-concurrent access environment, the disks are owned by only one node at a
time. If the owning node fails, the cluster node with the next highest priority in the
resource group node list acquires ownership of the shared disks as part of fallover
processing. This ensures that the data stored on the disks remains accessible to client
applications.
In a non-concurrent access environment, a highly available application potentially runs
on only one node for extended periods of time. Only one disk connection is active at a
time, and the shared storage is not shared in any real-time sense. Rather, it is storage
that can be associated automatically (without human intervention) with the node where
the application is currently running. Non-concurrent access mode is sometimes called
serial access mode, because only one node has access to the shared storage at a time.
We will focus on non-concurrent shared storage in this unit.
Concurrent access
In concurrent access environments, the shared disks are activated on more than one
node simultaneously. Therefore, when a node fails, disk takeover is not required. In this
case, access to the shared storage must be controlled by some locking mechanism in
the application.
(Figure: each node's rootvg disks as private storage, alongside SAN storage accessed as virtual SCSI disks via a VIO Server.)
Notes:
Private storage
Private storage is, of course, accessible to only a single cluster node. It might be
physically located within each system's box, externally in a rack, or even in an external
storage subsystem. The key point is that private storage is not physically accessible
from more than one cluster node.
Notes:
Why?
The shared storage is physically connected to each node that the application might run
on. In a non-concurrent access environment, the application actually runs on only one
node at a time and modification or even access to the data from any other node during
this time could be catastrophic (the data could be corrupted in ways which take days or
even weeks to notice).
(Figure: two cluster nodes, each holding its own ODM copy of the shared volume group definitions.)
Notes:
Introduction
There are two mechanisms to control ownership of shared storage. Although these two
mechanisms do not seem to have formal names, in this unit, we refer to them as the:
- Reserve/release-based shared storage protection mechanism and the
- RSCT-based shared storage protection mechanism
We use the term protection rather than access control both because it is a bit shorter
and because it reminds us that the purpose of the mechanism is to protect the shared
storage.
Reserve/release-based protection

(Figure: Node 1 has volume groups A and B varied on with varyonvg, holding the disk reserves; Node 2's varyonvg attempt is refused while the reserves are held.)
Notes:
Disk reservation
Reserve/release-based shared storage protection relies on the disk technology
supporting a mechanism called disk reservation. Disks which support this mechanism
can, in effect, be told to refuse to accept almost all commands from any node other than
the one which issued the reservation. AIX's LVM automatically issues a reservation
request for each disk in a volume group when the volume group is varied online by the
varyonvg command. The varyonvg command fails for any disks that are currently
reserved by other nodes. If it fails for enough disks (and it almost certainly does, since if
one disk is reserved by another node, the others presumably are also), then the varyon
of the volume group fails.
There must be some mechanism to ensure that any meta-data (VGDA) changes made to the
volume group on the active node are also updated in the ODM on the inactive nodes in
the cluster. For example, if you change the size of a logical volume on the active node,
the other nodes' ODMs will still list the logical volume at the original size. If an
inactive node is then made active and the volume group is varied on without updating
the ODM, the information in the ODM on that node and the VGDA on the disks will
disagree. This causes problems.
When using reserve/release-based shared storage protection, HACMP provides a
last-chance mechanism called lazy update to update the ODM on the takeover node at
the time of fallover. This is meant to be a final attempt at synchronizing the VGDA
content with a takeover node's ODM at fallover time. For obvious reasons (like the fact
that it can't overcome some VGDA/ODM mismatches), relying on lazy update should be
avoided.
Lazy update
Lazy update works by using the volume group timestamp in the ODM. When HACMP
needs to varyon a volume group, it compares the ODM timestamp to the timestamp in
the VGDA. If the timestamps disagree, lazy update does an exportvg/importvg to
recreate the ODM on the node. If the timestamps agree, no extra steps are required.
It is, of course, possible to update the ODM on inactive nodes when the change to the
VGDA meta-data is made. In this way, extra time at fallover is avoided. The ODM can
be updated manually or you can use Cluster Single Point of Control (C-SPOC), which
can automate this task. Lazy update and the various options for updating ODM
information on inactive nodes are discussed in detail in a later unit in this course.
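The lazy update decision boils down to a timestamp comparison, sketched here with invented values (HACMP's real check reads the timestamps from the ODM and the VGDA):

```shell
# Sketch of the lazy update decision on a takeover node (timestamp values are made up)
odm_timestamp="4f2a1c00"    # hypothetical VG timestamp recorded in this node's ODM
vgda_timestamp="4f2a9d10"   # hypothetical VG timestamp read from the disks' VGDA

if [ "$odm_timestamp" != "$vgda_timestamp" ]; then
  decision="exportvg/importvg to rebuild the ODM, then varyonvg"
else
  decision="plain varyonvg"
fi
echo "$decision"
```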
(Figure: manual movement of volume group httpvg between nodes with reserve/release-based protection — Node2: varyoffvg httpvg, then Node1: varyonvg httpvg — while dbvg remains varied on where it is.)
Notes:
Manual takeover
With reserve/release-based shared storage protection, HACMP passes volume groups
between nodes by issuing a varyoffvg command on one node and a varyonvg
command on the other node. The coordination of these commands (ensuring that the
varyoffvg is performed before the varyonvg) is the responsibility of HACMP.
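The coordination requirement is simply a strict ordering, sketched here as a dry run (echo stands in for the real commands; httpvg and the node names are the example names used above):

```shell
# Ordered hand-off of httpvg: release on the old owner before acquire on the new one
steps=""
run() { steps="$steps$1; "; echo "$1"; }

run "node2: varyoffvg httpvg"   # releases the VG (and, with it, the disk reserves)
run "node1: varyonvg httpvg"    # only after the release may the other node acquire it
```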
(Figure: disk takeover after a node failure — the surviving node breaks the failed node's disk reserves and runs varyonvg, so the disks become reserved to it.)
Notes:
Disk takeover due to a failure
The right node has failed with the shared disks still reserved to it. When
HACMP encounters a reserved disk in this context, it uses a special utility program to
break the disk reservation. It then varies on the volume group, which causes the disks to
be reserved to the takeover node.
Implications
Note that if the right node had not really failed, then it would lose its reserves on the
shared disks (rather abruptly) when the left node varied them on. This will be seen in
the left node's error log and should be acted on immediately, because it indicates that you
are in a situation where both nodes can access and update the data on the disks (each
believing that it is the only node accessing and updating the data). Such a takeover of a
node that has not actually failed isn't possible unless all paths used by HACMP to
communicate between the two nodes have been severed.
(Figure: ghost disks — one node has the volume group varied on with disks hdisk0 through hdisk4; on the other node, whose varyonvg is blocked by the reserves, the same disks also appear under temporary ghost names hdisk5 through hdisk9.)
Notes:
What is a ghost disk?
During the AIX boot sequence, the configuration manager (cfgmgr) accesses all the
shared disks (and all other disks and devices). Each time it accesses a physical volume
at a particular hardware address, it tries to determine if the physical volume is the same
actual physical volume that was last seen at that hardware address. It does
this by attempting to read the physical volume's ID (PVID) from the disk. This operation
fails if the disk is currently reserved to another node. Consequently, the configuration
manager is not sure if the physical volume is the one it expects or is a different physical
volume. In order to be safe, it assumes that it is a different physical volume and assigns
it a temporary hdisk name. This temporary hdisk name is called a ghost disk.
When the volume group is eventually brought online by Cluster Services, the question
of whether each physical volume is the expected physical volume is resolved. If it is,
then the ghost disk is deleted. If it isn't, then the ghost disk remains. Whether or not the
online of the volume group ultimately succeeds depends on whether or not the LVM is
able to find enough of the volume group's physical volumes (and on other factors, such as
whether quorum checking is enabled on the volume group).
(Figure: RSCT-based protection — Node 1 holds the volume group in active varyon while Node 2 holds it in passive varyon; after a takeover, the states are reversed.)
Notes:
Introduction
HACMP 5.x supports a newer style of shared storage protection, which relies on AIX's
RSCT component to coordinate the ownership of shared storage when using enhanced
concurrent volume groups in non-concurrent mode.
the active state). The LVM on each node prohibits updates to the volume group's data
unless the node has the volume group varied on in the active state.
It is the responsibility of the RSCT component to ensure that each volume group is
varied online in the active state on not more than one node. Because this mechanism
does not rely on any disk reservation mechanism, it is compatible with all disk
technologies supported by HACMP.
Notes:
Introduction
Defining an enhanced concurrent volume group allows the LVM to use RSCT to
manage varyonvg and varyoffvg processing.
Concurrent access
In a concurrent access environment, all the nodes will varyon the volume group.
Notes:
Active varyon
If using enhanced concurrent volume groups in a non-concurrent access environment,
only one node will varyon the VG in active mode, allowing full access.
Passive varyon
Other nodes will varyon the VG in passive mode. In passive mode, only very limited
operations are allowed on the volume group:
- Reading volume group configuration information (for example, lsvg)
- Reading logical volume configuration information (for example, lslv)
Most operations are prohibited, including:
- Any operations on filesystems and logical volumes (for example, mount, open,
create, modify, delete, and so forth)
On active node:
# lsvg ecmvg
VOLUME GROUP:   ecmvg              VG IDENTIFIER:   0009314700004c00000000fe2eaa2d6d
VG STATE:       active             PP SIZE:         8 MB
VG PERMISSION:  read/write         TOTAL PPs:       537 (4296 MB)
...
Concurrent:     Enhanced-Capable   Auto-Concurrent: Disabled

On passive node:
toronto # lsvg ecmvg
VOLUME GROUP:   ecmvg              VG IDENTIFIER:   0009314700004c00000000fe2eaa2d6d
VG STATE:       active             PP SIZE:         8 MB
VG PERMISSION:  passive-only       TOTAL PPs:       537 (4296 MB)
...
Concurrent:     Enhanced-Capable   Auto-Concurrent: Disabled
Notes:
Introduction
The VG PERMISSION field in the output of lsvg shows if a volume group is varied on in
active or passive mode.
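To check the mode from a script, the VG PERMISSION field can be pulled out of lsvg output; in this sketch a captured sample line stands in for a live cluster:

```shell
# Sample lsvg output line (static text, since no cluster is assumed here)
lsvg_line="VG PERMISSION:      passive-only             TOTAL PPs:      537 (4296 MB)"

# Extract the permission value: 'read/write' on the active node, 'passive-only' elsewhere
perm=$(echo "$lsvg_line" | awk '{print $3}')
echo "$perm"
```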
Notes:
Although the details of the processing of enhanced concurrent mode volume groups are
largely beyond the scope of this class, it is very useful to understand the basics. The Group
Services component of RSCT is used to control the ownership of the volume group, hence
the name we're using, RSCT-based.
Group Services
Group Services is a component that allows nodes to participate in groups to control
resources of common interest, where each node has a vote in how the resource is
controlled. HACMP belongs to two Group Services groups for the control of the cluster-related
resources amongst all the nodes in the cluster (but that's another story for another
class, AU600 - HACMP Internals). When a node would like to effect a change on a
resource, it proposes that change to the group via a protocol (communications with the
other Group Services daemons). The Group Services daemon for HACMP is grpsvcs.
gsclvmd
The daemon that controls this group membership is gsclvmd. It is important to understand
that this daemon depends on Group Services being active, and that Group Services is
activated when Cluster Services is started. That should reinforce the point that ECMVGs
are to be used with HACMP only!
Warning
Filesystem changes are not handled by this process. This is where C-SPOC is necessary.
rt1s1vlp2 # ps -ef | grep $(lsvg appAvg | grep IDENTIFIER | cut -d":" -f3)
    root 294954 405668   0 14:03:15      -  0:00 /usr/sbin/gsclvmd -r 30 -i 300 -t 50 -c 00c0288e00004c0000000116b0b5cf7a -v 0

(Annotations from the visual: PID 405668 is the parent gsclvmd, with one active VG; pid 294954 is the child process for the VG whose vgid, 00c0288e00004c0000000116b0b5cf7a, matches the VG identifier reported by lsvg, with status active; the VGSA Group and VGDA Group labels identify the Group Services groups involved.)
Notes:
Now that you have an idea how ECMVGs work, you need to know how to see what's going
on.
(Figure: manual movement of an RSCT-based volume group. 1. A decision is made to move httpvg from the right node to the left. httpvg's active varyon is released on the right node, which drops to passive varyon, and the left node's passive varyon of httpvg is promoted to active; dbvg stays active on its own node throughout.)
Notes:
Manual movement of RSCT-based volume groups
The fast disk takeover mechanism handles a manual VG takeover by first releasing the
active varyon state of the volume group on the node that is giving up the volume group.
It then sets the active varyon state on the node that is taking over the volume group.
The coordination of these operations is managed by HACMP 5.x and AIX RSCT.
(Figure: fast disk takeover in a failure scenario — when a node fails, the surviving node promotes its passive varyon of the failed node's volume group, httpvg, to active varyon; dbvg is unaffected.)
Notes:
Fast disk takeover in a failure scenario
A node has failed. When the remaining node (or nodes) realize that the node has failed,
the takeover node sets the volume groups varyon state to be active.
There is no need to break disk reservations as no disk reservations are in place. The
only action required is that the takeover node ask its local LVM to mark the volume
groups varyon state as active.
If Topology Services fails (that is, there is no communication between the nodes), then
Group Services fails and it is not possible to activate the volume group. This makes the
mechanism very safe to use. It is recommended, however, to use enhanced concurrent
volume groups only on systems running HACMP 5.x.
Notes:
Considerations
As with any technology, the implications of using fast disk takeover must be properly
understood if the full benefits are to be experienced.
Note: If RSCT is not running, it is possible (although it takes some work) to manually
varyon an enhanced concurrent volume group in active mode while it is varied on in active
mode on another node. Although this is possible, it is an unlikely occurrence. This small
risk can easily be avoided by never varying on your shared volume groups manually.
Requirements
Fast disk takeover is used only if all of the requirements listed previously have been
met.
Because RSCT is independent of disk technology, all disks supported by HACMP can
be used in an enhanced concurrent mode volume group.
(Figure fragment: RAID 5 — in the disk subsystem)
Notes:
Compatibility
Flashes can be found at
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/Web/Flashes
Hints, Tips, and Technotes can be found at
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/Web/Technotes
HACMP Release Notes
Shipped with the product
Redundancy
Your goal is to eliminate single points of failure. When considering this for storage, it
involves defining more than one disk drive for every piece of data on the storage
subsystem and multiple paths to get to the data from the server. This is referred to as
data redundancy. HACMP does not provide data redundancy. In all likelihood, you will
be choosing a storage subsystem to provide the data redundancy. You might choose a
JBOD (Just a Bunch Of Disks) storage device, in which case you will have to provide
the redundancy in AIX. Multiple paths to get to the data from the server is accomplished
through multi-pathing software. That software must be checked for compatibility with
HACMP.
HACMP is oblivious to the storage device and redundancy method chosen. Although
not in the scope of this class, the selected storage subsystem will be affected by the
factors listed as follows (among others). The selected storage subsystem will then
determine what you will look for in terms of compatibility with the chosen HACMP
version and features.
- Data access performance requirements
- Capacity
- Support for multi-pathing
- Price
(Figure: HACMP Node1 and HACMP Node2, each a client LPAR (FRAME 1 shown) served by two VIO Servers, VIOS 1 and VIOS 2. Each VIOS reaches the storage device (Stg Dev) through its own HBA with MPIO, sets no_reserve on the LUN's hdisk0, and exports it through a vhost0 adapter. Through the Hypervisor, each node's vscsi0 and vscsi1 adapters present the LUN as a single MPIO hdisk0 belonging to sharedvg.)
Notes:
Overview
This type of configuration is becoming prevalent with the adoption of the virtualization
capabilities of the POWER5 and later architecture. A full discussion of the
implementation of this configuration is beyond the scope of the class. The intent is to
indicate that this is a supported configuration and to introduce some of the terms, the
requirements, and a configuration overview. Consult the IBM Sales Manual and IBM
Support (and anyone else you can find who will talk to you about this from an experienced
standpoint) for the latest requirements and considerations.
Legend
Stg Dev - Storage Subsystem providing access to disks, like a DS8300, DS4000, EMC,
HDS, SSA, and so on.
VIOS - Virtual I/O Server, the special LPAR on POWER5/6 systems that provides
virtualized storage (and networking) devices for use by client LPARs.
HBA - Host Bus Adapter, also known as a Fibre Channel adapter; this is the connection
to the SAN, giving the VIOS access to storage in the SAN (LUNs).
MPIO - Multipath I/O, built into AIX since V5.1; creates path devices for each instance of
a disk/LUN that is recognized by AIX, presenting only a single hdisk device from these
multiple paths.
vhost0 - Virtual SCSI (server) adapter on the Virtual I/O Server that provides the client
LPARs with access to virtual SCSI disks.
vscsi0 - Virtual SCSI (client) adapter on the client LPAR that provides the client access
to the VIOS's Virtual SCSI (server) adapter and therefore access to the virtual SCSI
disks.
Hypervisor - The POWER5/6 component that manages access between the vhost and
vscsi adapters.
Minimum requirements
As of the writing of this version of the course, the minimum requirements for HACMP
with Virtual SCSI (VSCSI) and Virtual LAN (VLAN) on POWER5/6 models were:
HACMP supports the IBM VIO Server V1.4
August 10, 2007
IBM* High Availability Cluster Multiprocessing (HACMP*) for AIX 5L*, Versions 5.2, 5.3,
and 5.4 extends support to include IBM Virtual I/O Server (VIO Server) Version 1.4
virtual SCSI and virtual Ethernet devices on all HACMP supported IBM POWER5* and
POWER6* servers along with IBM BladeCenter JS21. This includes HACMP nodes
running in LPARs on supported IBM System i5* processors.
AIX TL3
___________________________________________________________________
HACMP for AIX 5L V5.4    HACMP APAR #IY87247    AIX TL3
___________________________________________________________________
HACMP supports the IBM VIO Server Versions 1.2 and 1.3
March 8, 2007
IBM* High Availability Cluster Multiprocessing (HACMP*) for AIX 5L*, Version 5.2, V5.3,
and V5.4 extends support to include IBM Virtual I/O Server (VIO Server) Version 1.2
and 1.3 on all HACMP supported IBM POWER5* servers. This includes HACMP nodes
running in LPARs on supported IBM System* i5 processors.
If the VIO Server has only a single physical interface on a network, then a failure of that
physical interface will be detected by HACMP. However, that failure will isolate the node
from the network.
Although some of these might be viewed as configuration restrictions, many are direct
consequences of I/O Virtualization.
Service can be obtained from the IBM Electronic Fix Distribution site at:
http://www-03.ibm.com/servers/eserver/support/unixservers/aixfixes.html
All the details on requirements and specifications are in this Flash:
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10390
Configuration overview
Configuration is mostly performed on the VIOS and the Hardware Management Console.
The use of MPIO at the AIX level is also essential to ensuring data availability if access
to a VIOS is lost. Ensure that you reactivate any MPIO path that was lost after it is
recovered, to avoid total loss of access to data on a subsequent path failure. The
HACMP consideration, in addition to the correct software levels as outlined previously,
is that enhanced concurrent volume groups are used in this configuration. Otherwise, to
the cluster manager, this is just another volume group to be managed in a resource
group.
On Storage device
Map LUNs to the two corresponding VIO servers
On Hardware Management Console
Define Mappings (vhost & vscsi)
On VIO Server 1
Set no_reserve attribute
$ chdev -dev <hdisk#> -attr reserve_policy=no_reserve
Export the LUNs out to each client
$ mkvdev -vdev hdisk# -vadapter vhost0
On VIO Server 2
Set no_reserve attribute
$ chdev -dev <hdisk#> -attr reserve_policy=no_reserve
Export the LUNs out to each client
$ mkvdev -vdev hdisk# -vadapter vhost0
On Clients
- Configure the MPIO Default PCM to conduct health checks down all paths and
recover when a path is restored. This requires a reboot to take effect.
# chdev -l <hdisk#> -a hcheck_interval=20 -a hcheck_mode=nonactive -P
- Create the shared volume group as Enhanced Concurrent VG on first Client
(bos.clvm.enh required).
- Varyoffvg on Client 1.
- Import VG onto Client 2.
- Define to HACMP as a shared resource in a resource group.
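The steps above can be condensed into a dry-run checklist (echo only; hdisk4, vhost0, and sharedvg are placeholder names, so nothing here touches real devices):

```shell
# Dry run of the VIOS and client steps listed above (placeholder device names)
dry_run() {
  for vios in "VIOS1" "VIOS2"; do
    echo "$vios: chdev -dev hdisk4 -attr reserve_policy=no_reserve"
    echo "$vios: mkvdev -vdev hdisk4 -vadapter vhost0"
  done
  echo "client1: create sharedvg as an enhanced concurrent VG (bos.clvm.enh installed)"
  echo "client1: varyoffvg sharedvg"
  echo "client2: importvg sharedvg"
  echo "HACMP: add sharedvg to a resource group"
}

dry_run
```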
References
Courses that address this configuration:
- AU620, HACMP System Administration III: Virtualization and Disaster Recovery
- AU730, System p LPAR and Virtualization I: Planning and Configuration
- AU780, System p LPAR and Virtualization II: Implementing Advanced
Configurations
Redbooks (www.redbooks.ibm.com):
- REDP-4027-00: HACMP 5.3, Dynamic LPAR and Virtualization
Provides details later in the document on HACMP and Virtualization along with
failure scenarios in the VIO infrastructure and performance considerations
- SG24-7940-02: Advanced POWER Virtualization on IBM System p5 Servers:
Introduction and Configuration - Chapter 4
- REDP-4194: IBM System p Advanced POWER Virtualization Best Practices
Notes:
Overview
Use the pointers already provided to access IBM Flashes to determine if the IBM
hardware that you've chosen is supported with HACMP, and to determine the HACMP
requirements. Also read the Release Notes provided with the HACMP product for the
latest information on requirements.
SDD
With most IBM SAN Storage devices, the multi-pathing software will be the Subsystem
Device Driver (SDD). It is supported with HACMP (with appropriate PTFs).
To use C-SPOC with VPATH disks, SDD 1.3.1.3, or later, is required.
For levels and maintenance, check:
http://www-1.ibm.com/support/docview.wss?rs=540&context=ST52G7&uid=ssg1S4000065&loc=en_US&cs=utf-8&lang=en
Notes:
IBM's statement on non-IBM storage requirements with HACMP
This FAQ states the HACMP position with respect to non-IBM storage devices.
Question: Does HACMP support EMC or Hitachi storage subsystems when connected
to pSeries servers?
Answer: The storage subsystems supported by HACMP are those documented in the
Sales Manual. New additions are announced via Flash. Current information can be
retrieved from the online Sales Manual. HACMP supports only those IBM devices that
have passed IBM qualification efforts, and for which IBM development and service are
prepared to provide support. There is a group, associated with development, that tests
non-IBM storage subsystems for attachment to AIX systems and HACMP. Also,
cooperative service agreements are in place with certain non-IBM storage vendors.
HACMP also provides a supported interface, documented in the HACMP Planning and
Installation Guide, which allows any storage subsystem to be described in terms of a
standard set of operations. This allows for the invocation of user-provided methods to
accommodate device specific behaviors and operations that might not be automatically
supported by HACMP. If a client has an HACMP cluster containing storage hardware
other than that supported by HACMP, and they report a problem, IBM Service will
address the problem as follows:
If the problem is unrelated to that hardware, it will be addressed the same as any other
problem.
If the problem is related to that hardware, and the hardware is covered by a cooperative
service agreement with the storage vendor, the problem will be forwarded to the storage
vendor.
If the problem is related to hardware for which no cooperative service agreement is in
place, the client will be asked to refer the problem to the hardware manufacturer.
Determining compatibility
When contacting both IBM and non-IBM sources for information, indicate your intent to
configure the non-IBM storage device with HACMP and request driver, patch,
multi-pathing software and microcode requirements, and experiences with this
combination.
Also read the Release Notes provided with the HACMP product for the latest
information on requirements.
EMC
When using the EMC URL listed above to gather EMC information, here is the path to
take to find the HACMP compatibility information.
- Navigate to
http://www.emc.com/interoperability/matrices/EMCSupportMatrix.pdf
- Search for HACMP.
- You will get many hits; look in the sections that apply to your storage devices.
- Then look for the HACMP version that you are installing.
- Finally, look for the device driver, PowerPath, and AIX patch information for your
configuration.
(Figure: a shared SCSI bus, maximum length 25 m — two host systems, each with a SCSI controller, attached to four SCSI disk modules, each holding a disk drive.)
Notes:
SCSI termination
In HACMP environments, SCSI terminators must be external so that the bus is still
terminated after a failed system unit has been removed.
There are devices that can be inserted into the middle of SCSI buses which claim to
allow the bus to be severed at the point of insertion. Unless IBM specifically states
that it supports such a device, you should not use it.
(Figure: PVIDs known to Node 1's ODM, and their volume group membership:
000206238a9e74d7  rootvg
00020624ef3fafcc  None
00206983880a1580  None
00206983880a1ed7  None
00206983880a31a7  None)
Notes:
PVIDs and their use in AIX
For AIX to use a disk (LUN), the disk (LUN) must be assigned a unique
physical volume ID (PVID). This is stored in the ODM and on the disk (LUN), and linked
to a logical construct in AIX called an hdisk. hdisks are numbered sequentially as
discovered by the configuration manager (cfgmgr). Each AIX system that is sharing a
volume group will need to have access to the same disks (LUNs). This is either done
through zoning and masking in the SAN or via twin-tail cabling for non-SAN
implementations.
If the zoning, masking, and cabling is done correctly, each system will see the same
disks (LUNs).
If a disk (LUN) has no PVID, one is assigned when the disk (LUN) is defined to a volume
group, or manually by a user via the chdev command. If a disk (LUN) has a PVID
assigned, it will be recognized by AIX when a cfgmgr runs (manually or at system boot)
and stored in the ODM. Again, for systems to share access to a volume group, all the
disks (LUNs) that are in the volume group must be defined to each system with common
PVIDs.
Using the previous command on each system to determine which systems see which
PVIDs and the volume group affinity is the first step to ensuring that all systems that will
share a volume group have the necessary disks (LUNs) defined. The example shows
that the system sees four disks (LUNs) that have PVIDs assigned, but none of them are
in a volume group yet. The next logical step would be to check the other systems for
common PVIDs. All PVIDs that are found in common would be the PVIDs (and
therefore hdisks) that could be used to create shared volume groups. C-SPOC uses
this method to list the PVIDs that can be used to create a cluster-wide shared volume
group. If C-SPOC finds no common PVIDs across the selected systems for a shared
volume group, no PVIDs are listed. Knowing the PVID-to-hdisk relationship on all the
cluster nodes is therefore very important when creating a shared volume group. This is
true whether using C-SPOC or not.
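The PVID-matching step described above can be scripted. The sketch below simulates captured lspv output from two nodes (the file names, hdisk numbers, and PVID values are made up for illustration) and uses sort and comm to find the PVIDs visible on both nodes; on a real cluster you would capture the actual lspv output from each node instead.

```shell
# Simulated "lspv" captures from two nodes (hypothetical sample data).
cat > /tmp/node1.lspv <<'EOF'
hdisk0 000206238a9e74d7 rootvg
hdisk1 00206983880a1580 None
hdisk2 00206983880a1ed7 None
EOF
cat > /tmp/node2.lspv <<'EOF'
hdisk0 00020624ef3fafcc rootvg
hdisk3 00206983880a1580 None
hdisk4 00206983880a1ed7 None
EOF

# Column 2 of lspv output is the PVID; "comm -12" keeps only the lines
# common to both sorted lists.
awk '{print $2}' /tmp/node1.lspv | sort > /tmp/node1.pvids
awk '{print $2}' /tmp/node2.lspv | sort > /tmp/node2.pvids
comm -12 /tmp/node1.pvids /tmp/node2.pvids
```

Notice that the hdisk numbers differ between the nodes while two PVIDs match — exactly the situation the text describes: only the common PVIDs are candidates for a shared volume group.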
Notes:
Introduction
HACMP enables you to use either physical storage disks manufactured by IBM or by an
Original Equipment Manufacturer (OEM) as part of a highly available infrastructure.
Depending on the type of OEM disk, custom methods enable you (or an OEM disk
vendor) to either:
- Tell HACMP that an unknown disk should be treated the same way as a known and
supported disk type, or
- Specify the custom methods that provide the low-level disk processing functions
supported by HACMP for that particular disk type
The following three files can be edited to perform this configuration. (There is no SMIT menu
to edit these files.)
/etc/cluster/disktype.lst
This file is referenced by HACMP during disk takeover.
You can use this file to tell HACMP that it can process a particular type of disk the same
way it processes a disk type that it supports. The file contains a series of lines of the
following form:
<PdDvLn field of the hdisk><tab><supported disk type>
To determine the value of the PdDvLn field for a particular hdisk, enter the following
command:
# lsdev -Cc disk -l <hdisk name> -F PdDvLn
The known and supported disk types are:
Disk Name in HACMP    Disk Type
SCSIDISK              SCSI-2 Disk
SSA                   IBM Serial Storage Architecture
FCPARRAY              Fibre Attached Disk Array
ARRAY                 SCSI Disk Array
FSCSI                 Fibre Attached SCSI Disk
For example, to have a disk whose PdDvLn field was disk/fcal/HAL9000 be treated
the same as IBM fibre SCSI disks, a line would be added that read:
disk/fcal/HAL9000	FSCSI
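Building such an entry can be done with printf so that the separator really is a tab character. This is a sketch only: it writes to a scratch file rather than the live /etc/cluster/disktype.lst, and the PdDvLn value is the hypothetical one from the example — on a real node it would come from the lsdev command shown above.

```shell
# Hypothetical PdDvLn value; on a real node it would come from:
#   lsdev -Cc disk -l hdiskN -F PdDvLn
PDDVLN="disk/fcal/HAL9000"

# Entry format: <PdDvLn field of the hdisk><tab><supported disk type>
printf '%s\t%s\n' "$PDDVLN" "FSCSI" >> /tmp/disktype.lst
cat /tmp/disktype.lst
```

Using printf (rather than echo) guarantees a literal tab between the two fields, which is what the file format requires.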
depending on whether vendor or vendor plus product match was desired. Note the
use of padding of Vendor ID to 8 characters.
A sample /etc/cluster/lunreset.lst file, which contains comments, is provided.
/etc/cluster/conraid.dat
This file is referenced by HACMP during varyon of a concurrent volume group.
You can use this file to tell HACMP that a particular disk is a RAID disk that can be used
in classical concurrent mode. The file contains a list of disk types, one disk type per line.
The value of the Disk Type field for a particular hdisk is returned by the following
command:
# lsdev -Cc disk -l <hdisk name> -F type
Note: This file only applies to classical concurrent volume groups. Thus this file has no
effect in AIX V5.3 or greater, which does not support classical concurrent VGs.
HACMP does not include a sample conraid.dat file. The file is referenced by the
/usr/sbin/cluster/events/utils/cl_raid_vg script, which does include some
comments.
Additional considerations
The previously described files in /etc/cluster are not modified by HACMP after they
have been configured and are not removed if the product is uninstalled. This ensures
that customized modifications are unaffected by changes to HACMP. By default, the
files initially contain comments explaining their format and usage.
Remember that the entries in these files are classified by disk type, not by the number
of disks of the same type. If several disks of the same type are attached to a cluster,
there should be only one file entry for that disk type.
Finally, unlike other configuration information, HACMP does not automatically
propagate these files across nodes in a cluster. It is your responsibility to ensure that
these files contain the appropriate content on all cluster nodes. You can use the
HACMP File Collections facility to propagate this information to all cluster nodes.
HACMP enables you to specify any of its own methods for each step in disk processing,
or to use a customized method, which you define.
Using SMIT, you can perform the following functions for OEM disks:
- Add Custom Disk Methods
- Change/Show Custom Disk Methods
- Remove Custom Disk Methods
More information
For detailed information about configuring OEM disks for use with HACMP, see:
SC23-5209-01
Checkpoint
1. Answer choices: SCSI / SSA / FC / All of the above
2. True or False?
3. True or False?
4. True or False?
No special considerations are required when using SAN-based storage units
(DS8000, ESS, EMC, HDS, and so forth).
5. True or False?
hdisk numbers must map to the same PVIDs across an entire HACMP cluster.
Notes:
Figure 3-30. Topic 3 objectives: Shared storage from the AIX perspective
Notes:
This topic discusses shared storage from the AIX perspective.
(Figure: LVM components — physical volumes hdisk0 and hdisk1, each with a PVID, grouped into a volume group; the physical partitions on the physical volumes back the logical partitions of a logical volume.)
Notes:
LVM review
The set of operating system commands, library subroutines, and other tools that allow
the user to establish and control logical volume storage is called the logical volume
manager.
LVM controls disk resources by mapping data between a simple and flexible logical
view of storage space and the actual physical disks. The logical volume manager does
this by using a layer of device driver code that runs above the traditional physical device
drivers.
Logical volumes
This logical view of the disk storage, which is called a logical volume (LV), is provided to
applications and is independent of the underlying physical disk structure. The LV is
made up of logical partitions.
Physical volumes
Each individual disk drive is called a physical volume (PV). It has a physical volume ID
(PVID) associated with it and an AIX name, usually /dev/hdiskx (where x is a
unique integer on the system). Every physical volume in use belongs to a volume group
(VG) unless it is being used as a raw storage device or a readily available spare (often
called a hot spare). Each physical volume is divided into physical partitions (PPs) of a
fixed size for that physical volume. A logical partition is mapped to one or more physical
partitions.
Volume groups
Physical volumes and their associated logical volumes are grouped into volume groups.
Operating system files are stored in the rootvg volume group. Application data are
usually stored in one or more additional volume groups.
LVM relationships
LVM manages the components of the disk subsystem. Applications talk to the
disks through LVM.
This example shows an application writing to a filesystem, which has its LVs
mirrored in a volume group physically residing on separate hdisks.
(Figure: an application's write to /filesystem passes through the LVM to a mirrored logical volume, whose logical partitions map to physical partitions on separate hdisks in the volume group.)
Notes:
LVM relationships
An application writes to a file system. A file system provides the directory structure and
is used to map the application data to logical partitions of a logical volume. Because the
LVM sits in between, the application is isolated from the physical disks. The LVM can be
configured to map a logical partition to up to three physical partitions and have each
physical partition (copy) reside on a different disk.
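As a toy illustration of that mapping — not AIX code, just ordinary files and directories standing in for physical partitions on separate disks — a single logical partition mirrored twice looks like this:

```shell
# Two directories stand in for two physical disks; each holds one
# "physical partition" backing the same logical partition.
mkdir -p /tmp/lvmdemo/disk1 /tmp/lvmdemo/disk2
echo "application data" > /tmp/lvmdemo/disk1/pp_0001   # copy 1
echo "application data" > /tmp/lvmdemo/disk2/pp_0001   # copy 2 (mirror)

# A read can be satisfied from either copy; both must stay identical.
cmp -s /tmp/lvmdemo/disk1/pp_0001 /tmp/lvmdemo/disk2/pp_0001 \
  && echo "mirror copies consistent"
```

The point of the model: if one "disk" disappears, the data is still present on the other — which is exactly what LVM mirroring provides at the physical partition level.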
ODM-LVM relationships
LVM information is kept in two places:
ODM (Object Data Manager)
VGDA (Volume Group Descriptor Area)
NOTE: This applies to changes made to the LVM constructs, not to the data within them.
Notes:
Before going too far, it's important to understand that the LVM data we've discussed is kept
both in the VGDA of all the disks in the volume group and in the ODM of the system
making changes to the volume group (or creating it). This creates a rather obvious
problem: how do you keep the ODM up to date on every system other than the one that
is making a change to the volume group?
Understand that this is only a consideration when changes are made to the LVM constructs
themselves; for example, adding a filesystem/logical volume, increasing the size of a
logical volume, adding/removing a disk, and so forth.
(Figure: adding a shared volume group. On Node 1: #1 mkvg, mklv (log), logform, mklv (data), crfs; #2 unmount, chvg, varyoffvg. On Node 2: #3 cfgmgr, importvg, chvg; #4 varyoffvg. #5 Start Cluster Services. The VGDA on the shared disk and each node's ODM hold the LVM information.)
Notes:
Introduction
The steps to add a shared volume group are:
1. Ensure consistent PVIDs on all nodes where the VG is to be defined
2. Create a new VG and its contents
3. Varyoff VG on Node1
4. Import VG on Node2 and set VG characteristics correctly
5. Varyoff VG on Node2
6. Start Cluster Services
Note that the slide presents only a high-level view of the commands required to perform
these steps. More details are provided as follows.
0. Ensure common PVIDs across all nodes that will share volume group
As discussed earlier, HACMP has no requirement that hdisk names be consistent across
nodes, but it does require that all the nodes have access to the same disks and have
discovered the PVIDs.
a. Ensure disks are cabled/zoned/masked so that the disks will be seen by both nodes.
b. Add the shared disk(s) to AIX on the primary node (Node1 in the example):
cfgmgr
c. Assign a PVID to the disk(s)
chdev -a pv=yes -l disk_name
where disk_name is hdisk#, hdiskpower# or vpath#.
d. Add the disks to AIX on the secondary node (Node2)
cfgmgr
e. Using the PVIDs, verify that the necessary PVIDs are seen on both nodes. If not,
correct them.
lspv
C-SPOC
Fortunately, there is an easier way.
These steps will be done automatically if the cluster is active and C-SPOC is used.
Otherwise, you can use the commands listed here in the notes.
Unfortunately, we are not looking at the easier way until we get to the C-SPOC unit.
LVM mirroring
As mentioned in an earlier topic, HACMP does not provide data redundancy
AIX LVM mirroring is a method that can be used to provide data redundancy
LVM mirroring has some key advantages over other types of mirroring:
Up to three-way mirroring of all logical volume types, including concurrent logical volumes,
sysdumpdev, paging space, and raw logical volumes
Disk type and disk bus independence
Optional parameters for maximizing speed or reliability
Changes to most LVM parameters can be done while the affected components are in use
The splitlvcopy command can be used to perform online backups
(Figure: an application's write to /filesystem flows through the LVM to a mirrored logical volume whose logical partitions map to physical partitions in the volume group.)
Notes:
Introduction
Reliable storage is essential for a highly available cluster. LVM mirroring is one option to
achieve this. Other options are a hardware RAID disk array configured in RAID-5 mode
or some other solution which provides sufficient redundancy such as an external
storage subsystem like the ESS (DS6000/DS8000), EMC, and so forth.
LVM mirroring
Some of the features of LVM mirroring are:
- Data can be mirrored on three disks rather than having just two copies of data. This
provides higher availability in the case of multiple failures, but does require more
disks for the three copies.
- The disks used in the physical volumes could be of mixed attachment types.
- Instead of entire disks, individual logical volumes are mirrored. This provides
somewhat more flexibility in how the mirrors are organized. It also allows for an odd
number of disks to be used and provides protection for disk failures when more than
one disk is used.
- The disks can be configured so that mirrored pairs are in separate sites or in
different power domains. In this case, after a total power failure on one site,
operations can continue using the disks on the other site that still has power. No
information is displayed on the physical location of each disk when mirrored logical
volumes are being created, unlike when creating RAID 1 or RAID 0+1 arrays; so
allocating disks on different sites requires considerable care and attention.
- Mirrored pairs can be on different adapters.
- Read performance is good for short transfers because data can be read from
either of two disks, so the one with the shortest queue of commands can be used.
Writes, however, must go to both disks.
- Extra mirrored copies can be created and then split off for backup purposes.
- Data can be striped across several mirrored disks, an approach that avoids hot
spots caused by excessive activity on a few disks by distributing the I/O operations
across all the member disks.
- There are parameters, such as Mirror Write Consistency, Scheduling Policy, and
Enable Write Verify, which can help maximize performance and reliability.
(Figure: SMIT options for the procedure — create a jfs2log logical volume named "sharedlvlog", then create a filesystem on a previously created logical volume.)
Notes:
Introduction
This visual describes a procedure for creating a shared volume group and a mirrored
file system. There is an easier-to-use method provided by an HACMP facility called
C-SPOC, which is discussed later in the course. The C-SPOC method cannot be used
until the HACMP cluster's topology and at least one resource group have been
configured.
The procedure described in the visual permits the creation of shared file systems before
performing any HACMP related configuration (an approach favored by some cluster
configurators).
It is also valuable to notice that unique names are being used for all of the LVM
components, including JFS Log logical volumes. Pay very close attention to that when
creating LVM components manually. If a JFS log is not specified when creating a
filesystem, one will be created (if one doesn't already exist) with a system-generated
name. This could conflict with one that already exists on a system that will be sharing
this volume group.
Detailed procedure
Here are the steps in somewhat more detail:
a. Use the smit mkvg fastpath to create the volume group.
b. Make sure that the volume group is created with the Activate volume group
AUTOMATICALLY at system restart parameter set to no (or use smit chvg to
set it to no). This gives HACMP control over when the volume group is brought
online. It is also necessary to prevent, for example, a backup node from attempting
to online the volume group at a point in time when it is already online on a primary
node.
c. Use the smit mklv fastpath to create a logical volume for the jfs2log with the
parameters indicated in the figure above (make sure that you specify a type of
jfs2log, or AIX ignores the logical volume and creates a new one with a
system-generated name when you create the file system below).
d. Use the logform -V jfs2log <lvname> command to initialize a logical volume for
use as a JFS2 log device.
e. Use the smit mklv fastpath again to create a logical volume for the file system with
the parameters indicated in the figure above.
f. Use the smit crjfs2lvstd fastpath to create a JFS2 file system in the now existing
logical volume.
Verify by mounting the file system and using the lsvg command. Notice that if copies
were set to 2, then the number for PPs should be twice the number for LPs and that if
you specified separate physical volumes then the values for PVs should be 2 (the
number of copies).
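For reference, the smit fastpaths above correspond roughly to the command-line sequence below. This is an AIX-only outline under assumed names (shvg, sharedlvlog, sharedlv, hdisk2, and /sharedfs are all made up for illustration), so treat it as a sketch rather than something to paste verbatim:

```
# --- Node 1: create the VG and its contents ---
mkvg -y shvg -n hdisk2               # -n: do not activate at system restart
varyonvg shvg
mklv -y sharedlvlog -t jfs2log shvg 1
logform -V jfs2log /dev/sharedlvlog
mklv -y sharedlv -t jfs2 shvg 10
crfs -v jfs2 -d sharedlv -m /sharedfs -a log=/dev/sharedlvlog
                                     # log attribute name may vary by AIX level
mount /sharedfs                      # verify with lsvg -l shvg, then...
umount /sharedfs
varyoffvg shvg

# --- Node 2: import and match the VG characteristics ---
cfgmgr
importvg -y shvg hdisk2              # hdisk name may differ on this node
chvg -a n shvg                       # no automatic varyon at system restart
varyoffvg shvg
```

The chvg -a n on Node 2 matters: importvg alone does not guarantee the auto-varyon setting matches, and both nodes must leave activation of the shared VG to HACMP.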
When mirroring in AIX, quorum checking is an issue because losing access to 50% of the
disks in a volume group takes the volume group offline
How can you lose access to 50% of the disks?
Logical Volumes are mirrored across two things
The two things can be two disk enclosures or two sites
One of the two things goes away
Number of VGDAs required, by VG status and quorum-checking setting:

Quorum checking    Running (to stay running)    To bring online (varyonvg)
Enabled            >50% of VGDAs                >50% of VGDAs, or Forced Varyon set
Disabled           at least 1 VGDA              100% of VGDAs, or Forced Varyon set
Notes:
Introduction
If you plan to mirror your data at the AIX level to provide redundancy, you will need to
consider AIX quorum checking on a volume group. If you aren't mirroring your data at
the AIX level, quorum isn't an issue.
Quorum
Quorum is the check used by the LVM at the volume group level to resolve possible
data conflicts and to prevent data corruption. Quorum is a method by which >50% of
VGDAs must be available in a volume group before any LVM actions can continue.
Note: For a VG with three or more disks, there is one copy of the VGDA on each disk.
For a one disk VG, there are two copies of the VGDA. For a 2-disk VG, the first disk has
two copies and the second has one copy of the VGDA. The VGDA is identical for all
disks in the VG.
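The VGDA counts in that note can be captured in a few lines of shell. This is a sketch of the arithmetic only (the function names are made up), and it shows why a two-disk VG is the awkward case: the disk holding two of the three VGDAs takes quorum with it when it fails.

```shell
# Total VGDAs in a VG of N disks, per the note above:
# 1 disk -> 2 VGDAs, 2 disks -> 3 VGDAs (2 + 1), 3+ disks -> 1 per disk.
vgda_count() {
    case "$1" in
        1) echo 2 ;;
        2) echo 3 ;;
        *) echo "$1" ;;
    esac
}

# Minimum VGDAs that must remain accessible to keep quorum (>50%).
quorum_needed() {
    total=$(vgda_count "$1")
    echo $(( total / 2 + 1 ))
}

vgda_count 2       # a 2-disk VG has 3 VGDAs
quorum_needed 2    # 2 must stay accessible: losing the 2-VGDA disk loses quorum
```

Running the same arithmetic for a 4-disk VG gives 4 VGDAs with 3 required — losing a mirrored half (2 disks) still costs you quorum, which is the scenario the next visuals address.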
Quorum is especially important in an HA cluster. If LVM can varyon a volume group with
half or less of the disks, it might be possible for two nodes to varyon the same VG at the
same time, using different subsets of the disks in the VG. This is a very bad situation
which we will discuss in the next visual.
Normally, LVM verifies quorum when the VG is varied on and continuously while the VG
is varied on.
Label:        LVM_SA_QUORCLOSE
Type:         UNKN
Class:        H
Description:  QUORUM LOST, VOLUME GROUP CLOSING
group offline due to a loss of quorum for the volume group on the node, HACMP
selectively moves the resource group to another node.
You can change this default behavior by customizing resource recovery to use a notify
method instead of fallover. For more information, see Chapter 3: Configuring HACMP
Cluster Topology and Resources (Extended) in the Administration Guide.
Note: HACMP launches selective fallover and moves the affected resource group only
in the case of the LVM_SA_QUORCLOSE error. This error can occur if you use mirrored
volume groups with quorum enabled. However, other types of volume group failure
errors could occur. HACMP does not react to any other type of volume group errors
automatically. In these cases, you still need to configure customized error notification
methods, or use AIX Automatic Error Notification methods to react to volume group
failures.
Overall considerations
Distribute hard disks across more than one bus
Use different power sources
Notes:
Introduction
Eliminating quorum issues is done either by mirroring with quorum disabled, or by not
mirroring at the AIX level.
- If you are mirrored across two disk subsystems, consider a quorum-buster disk to
prevent loss of quorum if you lose access to one subsystem. This is discussed
later in the notes.
Distribute hard disks across more than one bus
Use multipathing software and two Fibre Channel adapters.
Use three adapters per node in SCSI.
Use two adapters per node, per loop in SSA.
Use different power sources
Connect each power supply in the storage device to a different power source.
Notes:
Introduction
If you decide to mirror at the AIX level and to leave quorum checking on, you will want to
have HACMP handle the loss of access to a volume group if half the disks are lost. Be
sure you understand what you're deciding to do, though. If you allow HACMP to handle
the loss of access to the volume group, this means that the loss of half the disks (only
one of your two copies of the data) will result in the user's loss of access to the
application until it can be taken over by another cluster node. You've purchased the
additional hardware and set up the mirroring precisely to avoid downtime if you lose
access to part of the hardware, but this strategy will result in downtime. You make the
call (see disabling quorum in the previous visual).
varyonvg -f
AIX provides the ability to varyon a volume group if a quorum of disks is not available.
This is called forced varyon. The varyonvg -f command allows a volume group to be
made active that does not currently have a quorum of available disks. All disks that
cannot be brought to an active state will be put in a removed state. At least one disk
must be available for use in the volume group.
Notes:
Be careful when using forced varyon
Failure to follow each and every one of these recommendations could result in either
data divergence or inconsistent VGDAs. Either problem can be very difficult if not
impossible to resolve in any sort of satisfactory way; so be careful!
More information
Refer to the HACMP for AIX Administration Guide Version 5.4.1 (Chapter 15) and the
HACMP for AIX Planning Guide Version 5.4.1 (Chapter 5) for more information about
forced varyon and quorum issues.
Notes:
Unique names
Because your LVM definitions are used on multiple nodes in the cluster, you must make
sure that the names created on one node are not in use on another node. The safest
way to do this is to use C-SPOC. If creating the LVM components outside C-SPOC, you
must explicitly create and name each entity [do not forget to explicitly create, name and
format (using logform) the jfslog logical volumes] with a name known to be unique
across the nodes in the cluster.
Notes:
Introduction
You can configure OEM volume groups in AIX and use HACMP as an IBM High
Availability solution to manage such volume groups.
Note: Different OEMs can use different terminology to refer to similar constructs. For
example, the Veritas Volume Manager (VxVM) term Disk Group is analogous to the AIX
LVM term Volume Group. We will use the term volume groups to refer to OEM and
Veritas volume groups.
automatically. After you add Veritas volume groups to HACMP resource groups, you
can select the methods for the volume groups from the pick lists in HACMP SMIT
menus for OEM volume groups support.
Note: Veritas Foundation Suite is also referred to as Veritas Storage Foundation (VSF).
Additional considerations
The custom volume group processing methods that you specify for a particular OEM
volume group are added to the local node only. This information is not propagated to
other nodes; you must copy this custom volume group processing method to each node
manually. Alternatively, you can use the HACMP File Collections facility to make the
disk, volume, and file system methods available on all nodes.
Notes:
Introduction
You can configure OEM file systems in AIX and use HACMP as an IBM High Availability
solution to manage such file systems.
Additional considerations
The custom file system processing methods that you specify for a particular OEM file
system are added to the local node only. This information is not propagated to other
nodes; you must copy this custom file system processing method to each node
manually. Alternatively, you can use the HACMP File Collections facility to make the
disk, volume, and filesystem methods available on all nodes.
Checkpoint
1. True or False?
Lazy update attempts to keep VGDA constructs in sync between
cluster nodes (reserve/release-based shared storage protection).
3. True or False?
Quorum should always be disabled on shared volume groups.
4. True or False?
Filesystem and logical volume attributes cannot be changed while
the cluster is operational.
5. True or False?
An enhanced concurrent volume group is required for the heartbeat
over disk feature.
Notes:
Unit summary
Key points from this unit:
Access to shared storage must be controlled
Non-concurrent (serial) access
Reserve/release-based protection:
Slower and may result in ghost disks
RSCT-based protection (fast disk takeover):
Faster, no ghost disks, and some risk of partitioned cluster in the event of
communication failure
Careful planning is needed for both methods of shared storage protection to
prevent fallover due to communication failures
Concurrent access
Access must be managed by the parallel application
Notes:
References
SC23-5209-01  HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10  HACMP for AIX, Version 5.4.1: Concepts and Facilities Guide
SC23-4861-10  HACMP for AIX, Version 5.4.1: Planning Guide
HACMP manuals: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
Unit objectives
After completing this unit, you should be able to:
List and explain the requirements for an application to be
supported in an HACMP environment
Describe the HACMP start and stop scripts
Describe the resource group behavior policies supported by
HACMP
Enter the configuration information into the Planning
Worksheets
Notes:
(Figure: a resource group — a list of nodes, policies for where to run, and resources (application server, service address, volume group) — defined across cluster nodes, for example Node 2 and Node 3, attached to a shared disk.)
Notes:
Two steps to define an application to HACMP
To have HACMP manage an application, you must do two things:
1. Create an HACMP resource called an application server. The
application server defines a start and a stop script for the
application.
2. Create an HACMP resource group. This in turn will require two
steps:
a. The basic resource group definition:
i.
ii. Names which policies to use that will control which node the
application actually runs on.
Application considerations
Automation
No intervention
Dependencies
Using names unique to one node
Other applications
Interference
Conflicts with HACMP
Robustness
Application can withstand problems
Implementation
Other aspects to plan for
Notes:
Introduction
Many applications can be put under the control of HACMP but there are some
considerations that should be taken into account.
Automation
One key requirement for an application to function successfully under HACMP is that
the application must be able to start and stop without any manual intervention. Because
the cluster daemons call the start and stop scripts, there is no option for interaction.
Additionally, upon an HACMP fallover, the recovery process calls the start script to bring
the application online on a standby node. This allows for a fully automated recovery.
Other requirements for start and stop scripts will be covered on the next visual.
Dependencies
Dependencies to be careful of when coding the scripts include:
Application dependencies:
Dependencies that you previously had to handle yourself, but may no longer need to:
One application must be up before another one.
Applications must both run on the same node.
These can now be handled by Runtime Dependency options. An overview of these
is given later in this unit.
Interference
An application can execute properly on both the primary and standby nodes. However,
when HACMP is started, a conflict with the application or environment might occur that
prevents HACMP from functioning successfully. Two areas to look out for are use of
the IPX/SPX protocol and manipulation of network routes.
Robustness
Beyond basic stability, an application under HACMP should meet other robustness
characteristics, such as successful start after hardware failure and survival of real
memory loss. It should also be able to survive the loss of the kernel or processor state.
Implementation
There are several aspects of an application to consider as you plan for implementing it
under HACMP. Consider characteristics, such as time to start, time to restart after
failure, and time to stop. Also consider:
Writing effective scripts.
Consider file storage locations.
Using inittab and cron: inittab is processed before HACMP is started. The cron
table is local to each node, so time and date should be synchronized across nodes.
We will look at writing scripts and data locations in the following visuals.
Use assists
AU548.0
Notes:
Introduction:
Application start scripts should not assume the state of the environment; defensive
programming can correct any irregular conditions that occur. Remember that the cluster
manager spawns these scripts as a separate job in the background and carries on
processing. The application start scripts must be able to handle an unknown previous
shutdown state.
Items to check
- Environment:
Verify the environment. Are the prerequisite conditions satisfied? These might
include access to a file system, adequate paging space, IP labels and free file
system space. The start script should exit and run a command to notify system
administrators if the requirements are not met.
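A defensive start script might begin with checks along these lines. The mount point, threshold, and messages are examples only; a real script would check the application's actual filesystems, paging space, and IP labels, and notify the administrator on failure:

```shell
#!/bin/sh
# Sketch of defensive environment checks for a start script. The
# mount point and free-space threshold are hypothetical examples.

check_env()
{
    FS=${REQ_FS:-/tmp}               # filesystem the application needs
    MIN_FREE_KB=${MIN_FREE_KB:-1024}

    # Is the required filesystem present and writable?
    if [ ! -d "$FS" ] || [ ! -w "$FS" ]; then
        echo "ERROR: $FS missing or not writable" >&2
        return 1
    fi

    # Enough free space? With df -P, column 4 is available KB.
    free=$(df -kP "$FS" | awk 'NR==2 {print $4}')
    if [ "$free" -lt "$MIN_FREE_KB" ]; then
        echo "ERROR: only ${free}KB free in $FS" >&2
        return 1
    fi

    # On AIX, you would also check paging space (lsps -s) and that
    # the needed IP labels resolve, and run a notification command
    # instead of starting the application when a check fails.
    return 0
}

check_env && echo "environment OK"
```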
Using assists
IBM provides a priced feature for HACMP that provides all the code and monitoring for
three applications: WebSphere, DB2, and Oracle Real Application Server (RAC). In
these cases you would not have to write the scripts yourself.
There are also plug-in filesets that provide help for integrating print server, DHCP, and
DNS. These filesets are part of the base HACMP product.
Notes:
Introduction
Deciding where data should go requires careful thought. For some data, the answer
is clear; for other data, it depends. Putting data on shared storage means only one
copy exists, but that copy might not be available when needed. Putting data on private
storage risks the copies diverging, but makes upgrades easier.
Private storage
Private storage must be used for the operating system components. It can also be used
for configuration files, license files, and application binaries subject to the trade-offs
mentioned in the introduction.
Shared storage
Shared storage must be used for dynamic data: Web server content, data that is
updated by the application, and application log files (be sure the time is the same on all
nodes). Again, configuration files, application binaries, and license files could go here,
subject to the trade-offs mentioned in the introduction above.
It depends
License files deserve a special mention. If the licenses are node-locked, then you should
use private storage. In any case, you must learn the license requirements of the application
to make a proper determination.
Notes:
Three initial policies
In HACMP, you specify in the resource group definition three policies that control which
node a resource group (application) runs on:
1. Startup (of Cluster Services)
When Cluster Services starts up on a node, each resource group
definition is read to determine if this node is listed, and if so,
whether that resource group has already been started on another
node. If the resource group hasn't been started elsewhere, then
the startup policy is examined to further determine if Cluster
Services should activate the resource group and start the
application.
2. Fallover
Startup policy
Online on home node only
Online on first available node
Run-time Settling Time may be set
Notes:
Online on home node only
When starting Cluster Services on the nodes, only the Cluster Services on the home
node (first node listed in the resource group definition) will activate the resource group
(and start the application). This policy requires the home node to be available.
Notes:
Application runs on all available nodes concurrently
If a node belongs to a resource group with this startup policy, when Cluster Services
start on the node, Cluster Services will start the application and make all the resources
mentioned available on this node. In this case, it does not matter if the resource group is
already active on another node, so the application ends up being started on all nodes
where Cluster Services are started.
This policy is also referred to as concurrent mode or access.
Fallover policy
Fallover to next priority node in the list
Fallover using dynamic node priority
Bring offline (on error node)
Notes:
Fallover to next priority node in the list
In the case of fallover, a resource group that is online on only one node at a time follows
the list in the resource group's definition to find the next highest priority node currently
available.
Fallback policy
Fallback to higher priority node in the list
Can use a run time Delayed Fallback Timer preference
Never fallback
Notes:
Fallback to higher priority node
When HACMP Cluster Services start on a node, HACMP looks to see if there is a
resource group that has this node in its node list and that is currently active on another
node. If this node is higher in the list than the node the resource group is currently running on
and this policy has been chosen, the resource group is moved and the application is
started on this node.
appropriate time. After starting the node, HACMP automatically starts the resource
group fallback at the specified time.
Runtime policies will be covered in more detail in the HACMP Administration II
Administration and Problem Determination course.
Notes:
Valid combinations
HACMP enables you to configure only valid combinations of startup, fallover, and
fallback behaviors for resource groups.
Parent/Child Dependency
One resource group can be the parent of another resource group
Location Dependency
A resource group may be on the same node/site or on a different node/site than
another resource group
Notes:
One resource group can be a parent of another resource group
In HACMP 5.2 and higher, you can have cluster-wide resource group online and offline
dependencies.
In HACMP 5.3 and higher, you can specify resource location dependencies:
Online on same node
Online on different nodes
Online on same site
Checkpoint
1. True or False
Applications are defined to HACMP in a configuration file that lists what
binary to use.
Notes:
Unit summary
Key points from this unit:
To define an application to HACMP, you must:
Create an application server resource (start and stop scripts)
Create a resource group (node list, policies, resources)
Automation
Dependencies
Interference
Robustness
Implementation details
Monitoring
Shared storage requirements
Environment
Multiple instances
Script location
Error handling
Coding issues
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1:
Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
http://www-03.ibm.com/systems/p/library/hacmp_docs.html
HACMP manuals
Unit objectives
After completing this unit, you should be able to:
State where installation fits in the implementation process
Describe how to install HACMP 5.4.1
List the prerequisites for HACMP 5.4.1
List and explain the purpose of the major HACMP 5.4.1
components
Notes:
What this unit covers
This unit discusses the installation and the code components of HACMP 5.4.1.
Notes:
This topic covers the installation of the HACMP 5.4.1 filesets.
Step Description and Comments:
1. Plan: Use planning worksheets and documentation.
2. Assemble hardware: Install adapters, connect shared disk and network.
3. Install AIX: Ensure you update to the latest maintenance level.
4. Configure networks: Requires detailed planning.
5. Configure shared storage: Set up shared volume groups and filesystems.
6. Install HACMP: Install on all nodes in the cluster (don't forget to install the latest fixes).
7. Define/discover the cluster topology: Review what you end up with to make sure that it is what you expected.
8. Configure application servers: You will need to write your start and stop scripts.
9. Configure cluster resources: Refer to your planning worksheets.
10. Synchronize the cluster: Ensure you "actually" do this.
11. Start Cluster Services: Watch the logs for messages.
12. Test the cluster: Document your tests and results.
Notes:
Steps to building a cluster
Here are the steps to building a successful cluster. Okay, so we could have included
more steps or combined a few steps, but the principle is that you should plan and follow
a methodical process, which includes eventual testing and documentation of the cluster.
It is often best to configure the cluster's resources iteratively. Get basic resource groups
working first, and then add the remaining resources gradually, testing as you go, until
the cluster does everything that it is supposed to do.
Different opinions
Different people have different ideas about the exact order in which a cluster should be
configured. For example, some people prefer to leave the configuration of the shared
storage (step 5 above) until after they've synchronized the cluster's topology (step 7) as
this allows them to take advantage of HACMP's C-SPOC facility to configure the shared
storage.
One other area where different views are common is exactly when to install and
configure the application. If the application is installed, configured, and tested
reasonably thoroughly prior to installing and configuring HACMP, then most issues
that arise during later cluster testing are probably HACMP issues rather than
application issues. The other common perspective is that HACMP should be installed
and configured prior to installing and configuring the applications as this allows the
applications to be installed into the exact context that they will ultimately run in. There is
no correct answer to this issue. When to install and configure the applications is just
one more point that will have to be resolved during the cluster planning process.
Resources:
Application Server
Service labels
Resource group:
Identify name, nodes, policies
Resources: Application Server, service label, VG, filesystem
Notes:
What we have done so far
In Units 2, 3, and 4, we planned and built the storage, network, and application
environments for our cluster. So we are now ready to install the HACMP filesets.
Release notes:
On the CD as release_notes
Installed as /usr/es/sbin/cluster/release_notes
Notes:
There are other references
Other HACMP manuals are available which might prove useful. Check out the
references at the start of this unit for a complete list.
pubs
in pdf only
Installp/ppc, usr/sys/inst.images
cluster.adt.es
cluster.doc.en_US.es
cluster.es
cluster.es.cfs
cluster.es.cspoc
cluster.es.nfs
cluster.es.plugins
cluster.es.worksheets
cluster.hativoli
cluster.haview
cluster.license
cluster.man.en_US.es.data
cluster.msg.<lang>.cspoc
cluster.msg.<lang>.es
cluster.msg.<lang>.hativoli
cluster.msg.<lang>.haview
rsct.basic.hacmp.2.4.5.2.bff
rsct.basic.rte.2.4.5.2.bff
rsct.core.errm.2.4.5.1.bff
rsct.core.gui.2.4.5.1.bff
rsct.core.hostrm.2.4.5.1.bff
rsct.core.rmc.2.4.5.2.bff
rsct.core.sec.2.4.5.1.bff
rsct.core.utils.2.4.5.2.bff
rsct.opt.storagerm.2.4.5.2.bff
Notes:
Files on the CD
This visual shows the files that are on the CD. They will be expanded to show the table
of contents when using SMIT to do the install. The AIX 5.2 and 6.1 directories contain
the required rsct filesets for implementing HACMP V5.4.1 with AIX V5.2 and 6.1,
respectively. The pubs directory contains the PDF and HTML versions of the HACMP
documentation at the time the CD was created.
cluster.es.cfs
+ 5.4.1.0 ES Cluster File System Support
cluster.es.cspoc
+ 5.4.1.0 ES CSPOC Commands
+ 5.4.1.0 ES CSPOC Runtime Commands
+ 5.4.1.0 ES CSPOC dsh
cluster.es.nfs
+ 5.4.1.0 ES NFS Support
cluster.es.plugins
+ 5.4.1.0 ES Plugins - Name Server
+ 5.4.1.0 ES Plugins - Print Server
+ 5.4.1.0 ES Plugins - dhcp
cluster.es.worksheets
+ 5.4.1.0 Online Planning Worksheets
cluster.hativoli
+ 5.4.1.0 HACMP Tivoli Client
+ 5.4.1.0 HACMP Tivoli Server
cluster.license
+ 5.4.1.0 HACMP Electronic License
cluster.man.en_US.es
+ 5.4.1.0 ES Man Pages - U.S. English
cluster.msg.en_US.cspoc
+ 5.4.1.0 HACMP CSPOC Messages - U.S.
English
Notes:
Fileset considerations
Listed are some of the filesets that you see when doing smit install_all in HACMP 5.4.1.
Using smit install_latest will not show the msg filesets, so you should use install_all
and select the filesets.
Notice that cluster.es contains both client and server components. You can install either
or both depending on what the system's HACMP function will be.
When you install cluster.es.server you will get cluster.es.cspoc as well.
The same filesets should be installed on all nodes or Verify will give warnings every
time it executes.
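One way to catch fileset mismatches before Verify complains is to compare per-node inventories. This sketch uses hypothetical inventories written to files so that it is self-contained; on real nodes you would collect the lists remotely (for example, with lslpp -Lc 'cluster.*') rather than hard-coding them:

```shell
#!/bin/sh
# Sketch: detect fileset differences between two nodes. The
# inventories below are hypothetical examples; on a real node you
# would collect them with something like:
#   ssh <node> "lslpp -Lc 'cluster.*'" | cut -d: -f2 | sort -u

printf 'cluster.es\ncluster.es.cspoc\ncluster.license\n' | sort -u > /tmp/nodeA.filesets
printf 'cluster.es\ncluster.license\n' | sort -u > /tmp/nodeB.filesets

if diff /tmp/nodeA.filesets /tmp/nodeB.filesets >/dev/null; then
    echo "filesets match"
else
    echo "fileset mismatch:"
    # comm -3 prints filesets unique to either node (inputs sorted)
    comm -3 /tmp/nodeA.filesets /tmp/nodeB.filesets
fi
```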
You should install the documentation filesets on at least one non-cluster node (ensuring
that the HACMP PDF-based documentation is available even if none of the cluster
nodes will boot could prove really useful someday).
Notice that some of the filesets require other products such as Tivoli or NetView. You
should not install these filesets unless you have these products. HAView is never
installed on a cluster node; it is installed on the NetView server.
The cluster.es.cfs fileset can only be used if GPFS is installed. You might not need the
plug-ins.
The Web-based smit is not to be confused with WebSM. Web-based smit is a Web
application that allows you to see the HACMP smit configuration screens and to see
status.
The cluster.es.clvm fileset, which was formerly required for Enhanced Concurrent Mode
volume group support and concurrent mode resource group support, has been
removed. The function required for Enhanced Concurrent Mode volume groups and
concurrent mode resource groups has been built into the HACMP base code. The
license key for concurrent mode resource groups is no longer required.
cluster.adt.es
cluster.doc.en_US.es
cluster.es
cluster.es.cspoc
cluster.license
cluster.man.en_US.es
cluster.msg.en_US.cspoc
cluster.msg.en_US.es
SDDPCM
V2.1.1.0 or later, see student notes for details
Copyright IBM Corporation 2008
Notes:
Installation suggestions
Listed above are the minimum prerequisites. As time goes by, these will almost
certainly be superseded by later levels. The point is that these are the components that
must be considered when preparing your environment for HACMP.
Before you try to install, look at the following for the latest prerequisites:
- HACMP for AIX 5L Installation Guide, Version 5.4.1
- Release notes / README on the HACMP for AIX 5L, Version 5.4.1 CDs
- The HACMP for AIX 5L, Version 5.4.1 Announcement Letter
Go to the HACMP web site
http://www-03.ibm.com/systems/p/advantages/ha/ and click the
Announcement Letter link under the heading Learn more on the right side.
Because you are unlikely to want to upgrade a new cluster anytime soon, it is generally
wisest to start with the latest available AIX and HACMP patches. The URL for checking
on the latest patches is:
http://www14.software.ibm.com/webapp/set2/sas/f/hacmp/home.html
Finally, it's always a good idea to call IBM support and ask if there are any known issues
with the versions of AIX and HACMP that you plan to install/upgrade. Indicate that you
intend to install the latest HACMP PTF (fix pack, or whatever it may be called at the time)
and ask if it's known to be stable. Depending on the timing of your installation, it might
be advisable to either stay one maintenance level behind on AIX and HACMP or both,
or it might be wise to wait for an imminent maintenance level for AIX and HACMP or
both.
bos.data
bos.net.tcp.client
bos.net.tcp.server
bos.rte.libcur
bos.rte.libpthreads
bos.rte.odm
bos.rte.SRC
Those listed in bold are the ones that need to be added to a base AIX image. Ensure
that when you install these on a system that has been updated to a Technology Level /
Service Pack (TL / SP), you update these newly installed HACMP prerequisites to
the same TL / SP.
The base levels needed for AIX 5.3/6.1 are:
bos.adt.libm 5.2.0.85
bos.adt.syscalls 5.2.0.50
bos.data 5.1.0.0
In addition, the following is needed for AIX 6.1 only:
bos.net.nfs.server 5.3.7.0
For the RSCT prerequisites on AIX 6.1, the three filesets that are on the CD are
required in addition to the base RSCT filesets with the AIX 6.1 installation. Place the
RSCT filesets that are on the CD in the same directory as the prerequisites listed on the
slide.
IBM High Availability Cluster Multiprocessing (HACMP*) for AIX 5L*, V5.3 and V5.4
updates support for the IBM System Storage SAN Volume Controller (SVC) Storage
Software V4.1.
Please refer to the following information for support details. Note: TL = Technology Level
Table 1: HACMP APARs
HACMP 5.3
HACMP APARs: IY94307, IZ00051+
HACMP 5.4
HACMP APARs: IY87247, IZ00050+
IY95174
<not supported>
SDD v1.6.2.1
IY98568++
IY98751++
IY91487
IY95080
IY98751++
IY98751++
+ These APARs are not yet generally available. Contact IBM Support to obtain fix
packages for these APARs.
++ These APARs will not be made generally available for AIX 5.2 TL 9 and AIX 5.3 TL 5.
Contact IBM Support to obtain Ifix packages for these APARs.
Although it is not required for correct operation with SDDPCM, the configuration of Fast I/O
Failure on Fibre Channel devices is highly recommended. For information on configuring
this feature, refer to Storage Multipath Subsystem Device Driver User's Guide, page 143
at:
http://www-1.ibm.com/support/docview.wss?uid=ssg1S7000303&aid=1
Additional requirement:
The AIX OS error daemon parameters should be tuned to avoid lost log entries (See
documentation APAR IY75323 at
http://www-1.ibm.com/support/docview.wss?uid=isg1IY75323) HACMP requires the buffer
size be set to at least 1 MB and the log size to 10 MB. Use the following command:
errdemon -B 1048576 -s 10485760
Restriction notes for Metro Mirror:
An HACMP/XD and SVC Metro Mirror configuration with VIO is not currently supported.
Although SVC Host Name Aliases are arbitrary, for HACMP's support of Metro Mirror
they must match the node names used in the defined HACMP sites.
Resource Groups to be managed by HACMP cannot contain volume groups with both
Metro Mirror-protected and non-Metro Mirror-protected disks.
HACMP does not support Global Mirror functions of SVC Copy Services.
HACMP V5.3 does not support moving resource groups across sites.
For specific HACMP C-SPOC restrictions, refer to the HACMP/XD for Metro Mirror:
Planning and Administration Guide.
SDDPCM requires the configuration of Enhanced Concurrent Mode volume groups.
Other notes:
SDD supports both Shared and Enhanced Concurrent Mode volume groups.
Ensure that your SVC is properly configured to support SDD/SDDPCM host
multipathing. This involves adding all the WWPNs for a host's ports into a single
Host object on the SVC. For example, for an HACMP node name Node_A with two
WWPNs of WWPN_1 and WWPN_2, run svctask mkhost -name Node_A -hbawwpn
WWPN_1 WWPN_2
AIX 6.1
SP2
APAR IZ07791
HACMP 5.4.1 w/
RSCT 2.4.5.4
TL7
RSCT 2.5.0.0
SP2
APAR IZ02602
RSCT 2.4.5.4
RSCT 2.5.0.0
Network setup
Notes:
Description of checklist
This is a checklist of items that you should verify before starting to configure an HACMP
cluster. It is not a complete list because each situation is different. It would probably be
wise to develop your own checklist during the cluster planning process, and then verify
it just before embarking on the actual HACMP configuration of the cluster.
Code installation
Correct filesets includes making sure that the same HACMP filesets are installed on
each node. Documentation can be installed before installing HACMP. The
documentation is delivered as PDF only for HACMP 5.4.1; previous versions
provided an HTML version too.
Network setup
The /etc/hosts file should have entries for all IP labels and all nodes. The file should be
the same on all nodes. Name resolution should be tested on all labels and nodes. To do
this you can use the host command. You should test address to name and name to
address and verify that they are the same on all nodes. You should ensure that a route
exists to all logical networks from all nodes. Finally, you should test connectivity by
pinging all nodes from all nodes on all interfaces.
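These name resolution and connectivity tests can be scripted along these lines. The label passed in is an example; a real check would cover every IP label from every node:

```shell
#!/bin/sh
# Sketch: check name resolution and basic connectivity for each IP
# label before configuring the cluster. The label list is an example.

check_labels()
{
    for label in "$@"; do
        # Forward resolution, using the host command mentioned above
        if ! host "$label" >/dev/null 2>&1; then
            echo "$label: WARNING, does not resolve"
            continue
        fi
        # Basic connectivity: a single ping
        if ping -c 1 "$label" >/dev/null 2>&1; then
            echo "$label: OK"
        else
            echo "$label: resolves but does not answer ping"
        fi
    done
}

check_labels localhost
```

A fuller version would also compare the address-to-name and name-to-address results across nodes to confirm that /etc/hosts is consistent everywhere.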
Shared storage
Check to see that the disks are configured and recognized the same way (if possible) and
can be accessed from all nodes that will share them.
Install prerequisites:
bos.adt.libm
bos.adt.syscalls
bos.data
rsct.compat.clients.hacmp, rsct.compat.basic.hacmp
Configure /usr/es/sbin/cluster/clhosts
Can copy /usr/es/sbin/cluster/etc/clhosts.client
Test connectivity
Notes:
Client machine properties
A client machine is a node running AIX and only the client filesets from HACMP. It can
be used to monitor the cluster nodes as well as to test connectivity to an application
during fallover, or to be a machine that is used to access a highly available application.
Let's review
1. What is the first step in implementing a cluster?
a.
b.
c.
d.
e.
2. True or False?
HACMP 5.4.1 is compatible with any version of AIX V5.x.
3. True or False?
Each cluster node must be rebooted after the HACMP software is
installed.
4. True or False?
You should take careful notes while you install and configure HACMP
so that you know what to test when you are done.
TCP/IP Layer
Manages communication
at the logical level
Notes:
The application layer
The topmost layer of the software stack is the application layer. Any application or
service that the cluster node is making highly available is considered to be running at
the application layer (in a sense, this includes rather low-level AIX facilities, such as
NFS, when the cluster is acting as a highly available NFS server).
Notes:
HACMP components
HACMP consists of the following components:
A cluster manager (recovery driver and resource manager)
RSCT
SNMP related facilities
The Cluster Information Program
A highly available NFS server
Shared external disk access
Cluster Secure Communication Subsystem
Cluster Manager
Is a subsystem/daemon that runs on each cluster node
Is primarily responsible for responding to unplanned events:
Recover from software and hardware failures
Respond to user-initiated events:
Request to online/offline a node
Request to move/online/offline a resource group
And so forth
Is a client to RSCT
Provides snmp retrievable status information
Is implemented by the subsystem clstrmgrES
Started in /etc/inittab and always running
Notes:
The cluster manager's role
The cluster manager is, in essence, the heart of the HACMP product. Its primary
responsibility is to respond to unplanned events. From this responsibility flows most of
the features and facilities of HACMP. For example, to respond to unexpected events, it
is necessary to know when they occur; monitoring for certain failures is the job of the
RSCT component.
In HACMP 5.3 and later, the clstrmgrES subsystem is always running.
Notes:
Introduction to the cluster communication subsystem
The cluster secure communication subsystem is part of HACMP 5.1 and later systems.
It provides connection level security for all HACMP-related communication, eliminating
the need for either /.rhosts files or a Kerberos configuration on each cluster node.
Although only necessary when the configuration of the cluster was being changed, the
need for these /.rhosts files was a source of concern for many customers.
This facility goes beyond eliminating the need for /.rhosts files by providing the ability to
send all cluster communication through a Virtual Private Network (VPN) using
persistent labels. Although unlikely to be necessary in most clusters, this capability will
allow HACMP to operate securely in hostile environments.
In addition, you can use Message-level authentication and Message Encryption or both
in HACMP 5.2 and later. You can have HACMP generate and distribute keys.
Notes:
clcomd basics
The most obvious part of the cluster secure communication facility is the cluster
communication daemon (clcomd). This daemon replaces a number of ad hoc
communication mechanisms with a single facility thus funneling all cluster
communication through one point. This funneling, in turn, makes it feasible to then use
a VPN to actually send the traffic between nodes and to be sure that all the traffic is
going through the VPN.
processes are very efficient. These processes might still take a matter of minutes to
complete as comparison processing and resource manipulation may be occurring.
Other aspects of clcomd's implementation which further improve performance include:
- Caching coherent copies of each node's ODMs, which reduces the amount of
information which must be transmitted across the cluster during a verification
operation
- Maintaining long-term socket connections between nodes avoids the necessity to
constantly create and destroy the short term sessions, which are a natural result of
using rsh and other similar mechanisms
Notes:
How clcomd authentication works
If the source node of the communication is not in the HACMPadapter and
HACMPnode ODM files on the target node, the target clcomd daemon authenticates
the inbound session by checking the session's source IP address against a list of
addresses in /usr/es/sbin/cluster/etc/rhosts and the addresses configured into the
cluster itself (in other words, in the previously mentioned ODM files). To defeat any
attempt at IP spoofing (a very timing-dependent technique which involves faking a
session's source IP address), each non-callback session is checked by connecting
back to the source IP address and verifying who the sender is. If the source node is in
the HACMPadapter and HACMPnode ODM files on the target node, the target clcomd
daemon only uses the information about the source node from these ODM files to
conduct the authentication.
The action taken to a request depends on the state of the
/usr/es/sbin/cluster/etc/rhosts file as shown in the visual. If a cluster node is being
moved to a new cluster or if the entire cluster configuration is being redone from
scratch, it might be necessary to empty /usr/es/sbin/cluster/etc/rhosts or manually
populate it with the IP addresses of the source node. Subsequently, the file can be
emptied, because all clcomd communications will be authenticated based on the
HACMP ODM files. The file must exist, or clcomd will fail to allow any inbound
communications. In fact, testing has shown that once the nodes are established in the
HACMP ODM, you can put anything in the /usr/es/sbin/cluster/etc/rhosts file and it will
be ignored. Again, the key thing is that the file exists and that the HACMP ODM
contains the node/adapter information for the source of the clcomd session.
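Seeding the rhosts file can be sketched as follows. The file path is redirected to a scratch file here so the sketch is safe to run anywhere; on a real cluster node the file is /usr/es/sbin/cluster/etc/rhosts, and the addresses shown are examples only:

```shell
#!/bin/sh
# Sketch: seed the clcomd rhosts file before the first cluster
# synchronization. RHOSTS points at a scratch file for illustration;
# on a node it would be /usr/es/sbin/cluster/etc/rhosts. The
# addresses are example cluster node IPs.

RHOSTS=${RHOSTS:-/tmp/rhosts.example}
NODE_IPS="192.168.1.1 192.168.1.2"

# The file must exist (even if empty) or clcomd refuses all inbound
# sessions; one IP address per line.
: > "$RHOSTS"
for ip in $NODE_IPS; do
    echo "$ip" >> "$RHOSTS"
done
echo "seeded $RHOSTS"
```

Once the cluster definition is synchronized and the ODM files are populated, the file contents no longer matter, but the file itself must remain in place.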
RSCT
Is included with AIX
Provides:
Scalability to large clusters
Cluster failure notification
Coordination of changes
Group Services
Coordinates and monitors state changes of an application in the cluster
Notes:
What RSCT provides
RSCT's role in an HACMP cluster is to provide:
- Failure detection and diagnosis for topology components (nodes, networks, and
network adapters)
- Notification to the cluster manager of events that it has expressed an interest in,
primarily events related to the failure and recovery of topology components
- Coordination of the recovery actions involved in dealing with the failure and recovery
of topology components (in other words, fallovers, fallbacks, and dealing with
individual NIC failures by moving or swapping IP addresses)
Figure: RSCT and HACMP components. The RSCT RMC subsystem (ctrmc) receives
input from resource monitors (for example, database and switch resource monitors).
RSCT Topology Services exchanges heartbeats and messages between nodes and
feeds RSCT Group Services, which provides group membership, event subscription,
and voting protocols between nodes. The HACMP Cluster Manager (the HA recovery
driver) responds through recovery programs, recovery commands, and HACMP event
scripts.
Notes:
The RSCT environment
This diagram includes all of the major RSCT components plus the HACMP cluster
manager and event scripts. It also illustrates how they communicate with each other.
Topology services
Responsible for building heartbeat rings for the purpose of detecting, diagnosing, and
reporting state changes to the RSCT Group Services component, which in turn reports
them to the Cluster Manager. Topology Services is also responsible for the transmission
of any RSCT-related messages between cluster nodes.
Group services
Associated with RSCT Topology Services is the RSCT Group Services daemon which
is responsible for coordinating and monitoring changes to the state of an application
running on multiple nodes. In the HACMP context, the application running on multiple
nodes is the HACMP cluster manager. Group Services reports failures to the Cluster
Manager as it becomes aware of them from Topology Services. The Cluster Manager
then drives cluster-wide coordinated responses to the failure through the use of Group
Services voting protocols.
Monitors
The monitors in the upper left of the diagram monitor various aspects of the local node's
state, including the status of certain processes (for example, the application if
application monitoring has been configured), database resources, and the SP Switch (if
one is configured on the node). These monitors report state changes related to
monitored entities to the RSCT RMC Manager.
RMC manager
The RSCT RMC Manager receives notification of events from the monitors. It analyzes
these events and notifies RSCT clients of those events which they have expressed an
interest in.
The HACMP cluster manager, an RSCT client, registers itself with both the RSCT RMC
Manager and the RSCT Group Services components.
Cluster manager
After an event has been reported to the HACMP Cluster Manager, it responds through
the use of HACMP's recovery commands and event scripts. The scripts are
coordinated via the RSCT Group Services component.
Heartbeat rings
Figure: a heartbeat ring formed by the interfaces 25.8.60.2, 25.8.60.3, 25.8.60.4,
25.8.60.5, and 25.8.60.6.
Notes:
RSCT topology services functions
The RSCT Topology Services component is responsible for the detection and diagnosis
of topology component failures. As discussed in the networking unit, the mechanism
used to detect failures is to send heartbeat packets between interfaces. Rather than
send heartbeat packets between all combinations of interfaces, the RSCT Topology
Services component sorts the IP addresses of the interfaces on a given logical IP
subnet and then arranges to send heartbeats in a round robin fashion from high to low
IP addresses in the sorted list. For non-IP networks (like rs-232 or Heartbeat on Disk),
addresses are assigned to the adapters that form the endpoints of the network and
are used by Topology Services like IP addresses for routing/monitoring the heartbeat
packets.
Example
For example, the IP addresses in the foil can be sorted as 25.8.60.6, 25.8.60.5,
25.8.60.4, 25.8.60.3 and 25.8.60.2. This ordering results in the following heartbeat path:
25.8.60.6 --> 25.8.60.5 --> 25.8.60.4 --> 25.8.60.3 --> 25.8.60.2 --> 25.8.60.6
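The ordering rule can be sketched in plain shell. This is only an illustration of the sorting described above, not HACMP code, and it uses the example addresses:

```shell
# Illustrative only: mimic Topology Services' ordering of interfaces on one
# subnet into a heartbeat ring (sorted from high to low, then wrap around).
addrs="25.8.60.6 25.8.60.5 25.8.60.2 25.8.60.4 25.8.60.3"

# Sort numerically, octet by octet, highest first.
sorted=$(printf '%s\n' $addrs | sort -t . -k1,1nr -k2,2nr -k3,3nr -k4,4nr)

# Each interface heartbeats to the next address down the list; the lowest
# address wraps around to the highest, closing the ring.
set -- $sorted
first=$1
ring=""
for a in $sorted; do
  ring="$ring$a --> "
done
ring="$ring$first"
echo "$ring"
```

Running the sketch prints the same ring as the heartbeat path shown above.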
Notes:
HACMP support of SNMP
In HACMP 5.3 and later, SNMP manager support is provided by the cluster manager
component. This SNMP manager allows the cluster to be monitored via SNMP queries
and SNMP traps. In addition, HACMP includes an extension to the Tivoli NetView
product called HAView. This extension can be used to make Tivoli NetView
HACMP-aware. The clinfo daemon as well as any SNMP manager and the snmpinfo
command can interface to this SNMP manager. This is discussed in more detail in the
course HACMP Administration II: HACMP Administration and Problem Determination.
Is used by:
The clstat command
Customer-written utility/monitoring tools
Notes:
What the clinfo daemon provides
The clinfo daemon provides an interface (covered in Unit 3) for dealing with ARP cache
related issues as well as an Application Program Interface (API) which can be used to
write C and C++ programs that meet customer-specific needs related to monitoring
the cluster.
Starting clinfo
Starting clinfo on an HACMP server node:
The clinfo daemon can be started in a number of ways (see the HACMP
Administration Guide) but probably the best way is to start it along with the rest of
the HACMP daemons by setting the Startup Cluster Information Daemon? field to
true when using the smit Start Cluster Services screen (which will be discussed in
the next unit).
Note that an option exists in HACMP 5.4.1 and later to start clinfo for consistency
groups. This support is for HACMP/XD with Metro Mirror replication.
Starting clinfo on a Client:
Use the /usr/es/sbin/cluster/etc/rc.cluster script or the startsrc command to start
clinfo on a client.
You can also use the standard AIX startsrc command: startsrc -s clinfoES
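As a sketch of the client-side startup (guarded so the SRC commands only run where they exist; clinfoES is the subsystem name given above):

```shell
# Start clinfo via the AIX System Resource Controller, then confirm it is
# active. On a non-AIX system this only prints what it would run.
start_clinfo() {
  if command -v startsrc >/dev/null 2>&1; then
    startsrc -s clinfoES && lssrc -s clinfoES
  else
    echo "would run: startsrc -s clinfoES; lssrc -s clinfoES"
  fi
}
start_clinfo
```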
NFS V2/V3
HACMP preserves file locks and dupcache across fallovers
Limitations
Lock support is limited to two node clusters
Resource group is only active on one node at a time
NFS V4
It requires a Stable Storage location accessible from all nodes in the resource group
Resource Group can have more than two nodes
NFSv4 application server and monitor are automatically added
It requires a new fileset be installed
Notes:
HACMP NFS V2/V3 support
The HACMP software provides the following availability enhancements to NFS V2/V3
operations:
- Reliable NFS server capability that allows a backup processor to recover current
NFS activity should the primary NFS server fail, preserving the locks on NFS
filesystems and the duplicate request cache
- Ability to specify a network for NFS mounting
- Ability to define NFS exports and mounts at the directory level
- Ability to specify export options for NFS-exported directories and filesystems
Notes:
Shared disk support
As you know by now, HACMP supports shared disks. See the shared storage unit for
more information on HACMP's shared external disk support. Recall that enhanced
concurrent mode can be used in a non-concurrent mode to provide heartbeat over disk
and fast disk takeover for resource group policies where the resource group is active on
only one node at a time.
Note that the bos.clvm.enh fileset is required for enhanced concurrent support even if
using it in non-concurrent mode.
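A quick check that the fileset is actually installed might look like this (lslpp is AIX-only, so the sketch is guarded):

```shell
# Check whether the bos.clvm.enh fileset is installed before creating an
# enhanced concurrent mode volume group.
check_clvm() {
  if command -v lslpp >/dev/null 2>&1; then
    lslpp -L bos.clvm.enh 2>/dev/null || echo "bos.clvm.enh is NOT installed"
  else
    echo "lslpp not found; run this on the AIX cluster nodes"
  fi
}
check_clvm
```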
Checkpoint
1. Which component detects an adapter failure?
a. Cluster Manager
b. RSCT
c. clcomd
d. clinfo
Copyright IBM Corporation 2008
Notes:
Unit summary
Having completed this unit, you should be able to:
Explain where installation fits in the implementation process
Describe how to install HACMP 5.4.1
List the prerequisites for HACMP 5.4.1
Describe the installation process for HACMP 5.4.1
List and explain the purpose of the major HACMP 5.4.1
components
Notes:
References
SC23-5209-01 HACMP for AIX, Version 5.4.1 Installation Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1 Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1 Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1 Troubleshooting Guide
HACMP manuals: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
Unit objectives
After completing this unit, you should be able to:
Notes:
Objectives
This unit will show how to configure a 2-node hot-standby or mutual takeover cluster
with a heartbeat over disk non-IP network using the standard configuration menus.
Follow the markers at the bottom of the screens to see the steps to extend the basic
hot-standby to a mutual takeover. It will then demonstrate how to start up and shut
down Cluster Services. It will then show the steps necessary to modify the configuration
of the cluster to add a persistent IP label, add a heartbeat on disk non-IP network and
synchronize the changes. The final step is making a snapshot backup of the cluster
configuration.
You will be walked through the methods of configuring the cluster, using the Initialization
and Standard Configuration path. You will make the above-mentioned extensions using
the Extended Configuration path. You will also see the simplest, most limited, method:
the Two-Node Configuration Assistant.
Figure: the example two-node cluster. Nodes usa and uk, persistent labels usaadm
and ukadm, and two non-IP networks: heartbeat on disk and rs-232.
Notes:
Configuring either a standby or a mutual takeover configuration
During this course, you and your team will configure a two-node cluster. You will be
guided through the process of creating a mutual takeover cluster using the standard
path. To adapt this to a hot-standby cluster, omit the steps that involve creating the
second resource group and its content. The standard path is ideal for creating a cluster
because it gives you the ability to use the pick lists and it automates some steps. It
requires that you have a solid understanding of your environment and the way HACMP
works to successfully configure the cluster.
The two-node assistant is mentioned in the lecture. It can be used to create a simple
hot-standby cluster, with one resource group only. That one resource group will contain
all the non-rootvg volume groups present on the node where the configuration is being
done.
The X in the figure represents the application xwebserver and the arrow represents
what happens on a fallover. The persistent addresses and both non-IP networks that
will be added in this unit are also shown.
The cluster will be tested for reaction to node, network, and network adapter failure, and
later in the week, we will also configure additional features, including NFS export and
cross-mount.
Synchronize
Notes:
Ready for configuration
Now that the HACMP filesets are installed, we can start to configure HACMP.
usaboot1
usaboot2
usaadm
ukboot1
ukboot2
ukadm
xweb
Hostnames: usa, uk
usa's network configuration (defined via smit chinet):
en0 - 192.168.15.29
en1 - 192.168.16.29
Notes:
A sample network configuration
Every discussion must occur within a particular context. The above network
configuration is the context within which the first phase of this unit will occur. Refer back
to this page as required over the coming visuals.
Note that the addresses are set to support IPAT via Aliasing. The service address would
have been on the same subnet as one of the boot adapters if IPAT via Replacement
were to be used.
Also note that an understanding of the physical layout of the adapters in each system is
critical to ensure that the cable attachments are going to the correct enX in AIX. This is
true whether you're dealing with a standalone system or an LPAR with adapters in
drawers. It is obviously not an issue with virtual adapters.
Configuration methods
HACMP provides two menu paths with three methods to
configure topology and resources:
Initialization and Standard Configuration
Two-node cluster configuration assistant
Limited configuration
> Only supports two-node hot standby cluster
Standard configuration
Topology done in one step
> Based on IP addresses configured
Extended Configuration
More steps, but provides access to all the options
Notes:
Configuration methods
- Standard Configuration
With this method, you must do the following tasks:
i. Topology (simplified via Configure an HACMP Cluster and Nodes)
ii. Configure Resources and Resource Groups
iii. Verify and Synchronize
- Two-Node Cluster Configuration Assistant
With this method, all the steps of Standard Configuration are done at once, including
adding a non-IP disk heartbeat network if you created an enhanced concurrent
volume group. Note that this is a simple two-node configuration with one resource
group containing all configured volume groups. This can be a starting point for
creating a more robust cluster but should not be viewed as a shortcut to creating a
cluster without a thorough understanding of how HACMP works.
Copyright IBM Corp. 1998, 2008
- Extended Configuration
With this method you follow similar steps as the Standard Configuration but
Topology has more steps and there are many more options. Some options can only
be done using this method, such as adding a non-IP network.
Notes:
Base configuration
Prior to using any of the methods to configure the cluster, there are basic AIX
configuration steps that must be performed. As described in the unit on networking
considerations, you chose IP addresses and subnets to match your IP Address
Takeover method. Now you must ensure that those boot addresses are configured on
each of the cluster nodes' network adapters. Take care to ensure that you have
configured these addresses correctly, including the subnet mask. When using either the
Two-node assistant or the Standard path, the addresses on the adapters for all the
systems being configured into the cluster are used to create the adapter and network
objects in the HACMP ODM. A simple mistake here will result in incorrect network
configurations in HACMP.
Next, you must ensure that all the addresses, boot, service, and persistent, are in the
/etc/hosts files for all the systems in the cluster. Check for resolution, forward and
reverse, by address and by name on all the systems in the cluster. Then verify that you
can reach all the boot addresses from each system via ping (including the local
addresses).
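These resolution checks are easy to script. A minimal sketch, assuming the hypothetical label names used in this unit; getent (which reads /etc/hosts) stands in for whatever forward-resolution check you prefer, and on AIX you might use host(1) instead:

```shell
# Check forward resolution for every cluster IP label; any label reported
# MISSING must be added to /etc/hosts on every node before configuring HACMP.
check_labels() {
  for l in "$@"; do
    if getent hosts "$l" >/dev/null 2>&1; then
      echo "ok:      $l"
    else
      echo "MISSING: $l"
    fi
  done
}
check_labels usaboot1 usaboot2 usaadm ukboot1 ukboot2 ukadm xweb
```

Remember that the reverse lookups and the ping tests described above still have to be done separately.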
Now switch to your storage configuration. To instruct HACMP to manage your
application's volume groups, you must add those volume groups to resource groups. To
minimize risk of error in data entry, add the volume groups to the resource groups using
a pick list. To do that, the volume groups must be configured prior to the resource group
configuration (and an HACMP discovery must be done). If you use the two-node
assistant, all volume groups (other than rootvg) will be picked up and used in the
resource group that is configured. Take caution here. If you use the Standard path, you
choose the volume groups to place in the resource groups. Choosing them from a pick
list is the right approach. Configure at least one Enhanced Concurrent Mode Volume
Group for use in a heartbeat on disk non-IP network.
You would ensure that the start and stop scripts were placed on all the systems in the
cluster and that you specified an interface name/address for all the other systems when
configuring the cluster. In our example you'd ensure that the scripts were on usa and
that you chose an interface name/address for the other node (uk).
You will choose application server and resource group names when you configure them
using Initialization and Standard Configuration. As you will see a little later, if you use
the Two-node Configuration Assistant, the application server name will be used to
generate the HACMP names for the cluster and resource group.
Notes:
The main HACMP smit menu
This is the top level HACMP smit menu. You'll find it often simplest to get here using the
smit fastpath shown above.
As implied by the # prompt, there is little point in being here if you don't have root
privileges!
Notes:
The initialization and standard configuration menu
This method is preferred for all initial cluster configurations except for the most simple
two-node, hot-standby, one resource group, one volume group configuration. For those
simpler configurations, you can use the Two-node Configuration Assistant. Regardless
of your cluster complexity, it is good practice to use the Initialization and Standard
Configuration path (referred to as the Standard path) for all your cluster configurations
because it requires you to be aware of the details of your configuration. The importance
of this can't be overstated.
Configuration changes made using the HACMP Standard path smit screens do not take
effect until they are verified and synchronized (see the third from the bottom selection in
this menu). Instead, they are managed on the node from which the configuration work is
performed. During synchronization, the files are propagated to the other nodes and will
cause HACMP to be dynamically reconfigured if there are active cluster nodes. More
about dynamic reconfiguration in a later unit.
Note however that the Two-node Configuration Assistant does do the synchronization
step. More on the Two-node Configuration Assistant later.
The method
The menu shows the tasks as they are to be performed. Each follows the other with
each having submenus to be traversed.
You will start by configuring the cluster itself. This is done via the Configure an HACMP
Cluster and Nodes option. This will build the cluster, nodes, adapters and network
objects for IP based networks. Non-IP networks will be added later.
That is followed by the configuration of the resources that will be made highly available.
This includes the service addresses, application servers (specifying your application
start and stop script names) and the option to use C-SPOC to create your shared LVM
structures. This is done via the Configure Resources to Make Highly Available
option.
To make the resources available to HACMP for management, you must put them in
resource groups. You will create resource group(s) objects and then fill them with the
resources that were defined above. There will be nodes listed in a specific order for
acquiring the resources and the service addresses and volume groups that support the
application. This is done via the Configure HACMP Resource Groups option.
Caution
If changes are made on one node but not synchronized and then more changes are
made on a second node and then synchronized, the changes made on the first node
are lost.
If you want to avoid losing work, make sure that you don't flip back and forth between
nodes while doing configuration work (that is, work on only one node at least until
you've synchronized your changes).
Recommendation
Pick one of your cluster nodes to be the one node that you use to make changes.
Configuration assistants
Besides the Two-node Configuration Assistant, HACMP provides, via an additional
feature, configuration assistants for WebSphere, Oracle, and DB2, called Smart
Assistants.
* Cluster Name
Notes:
Input for the standard configuration method
Assuming your network planning and setup was done correctly, you need only decide
on a name for the cluster and choose one IP address/hostname for each node that will
be in the cluster, including the node where you see this screen. This is not necessarily
the HACMP node name that will be assigned to the node; it is only a
resolvable/reachable address that can be used to gather information for the creation of
the HACMP topology configuration. The hostname of each node that is found is used as
the HACMP node name.
Notice that you can select the interfaces from a pick list (from the local /etc/hosts file)
and at this point in time, the Currently Configured Node(s) field is empty.
Notes:
Output from standard configuration
This step has created the cluster, an IP network and non-service IP labels (boot
addresses). The network objects are created based on the addresses/subnet masks
that are configured on the adapters in the nodes specified in the previous screen. This
exists only on the node where the command was run. Later we will see the
synchronization process.
Notice that there is no non-IP network and there are no resources and no resource
groups yet when using the standard configuration method.
Notes:
Not done yet
Because Configure an HACMP Cluster and Nodes does only the topology, there is
more to do using the Standard path:
- Application server and service address resources must be created using the
Configure Resources to Make Highly Available option.
- Resource groups with their policies and attributes must be created using the
Configure HACMP Resource Groups option.
- The Extended Configuration method must be used to add non-IP heartbeat networks.
- The cluster definitions must be propagated to the other nodes using Verify and
Synchronize.
Game plan
These steps will follow, starting with the Configure Resources to Make Highly
Available option.
Service IP Labels/Addresses
Application Servers
Volume Groups, Logical Volumes and Filesystems
Concurrent Volume Groups and Logical Volumes
Notes:
The first step in definition of highly available resources
Again, the process will be to follow the menus. The first step is to define the Service IP
labels.
Notes:
The Configure Service IP Labels/Addresses menu
This is the menu for managing service IP labels and addresses within the standard
configuration path. To define a new service label, choose Add a Service IP
Label/Address.
* IP Label/Address
* Network Name
+--------------------------------------------------------------------------+
|                            IP Label/Address                              |
|                                                                          |
|   (none)        ((none))                                                 |
|   usaadm        (192.168.5.29)                                           |
|   ukadm         (192.168.5.31)                                           |
|   yweb          (192.168.5.70)                                           |
|   xweb          (192.168.5.92)                                           |
+--------------------------------------------------------------------------+
Notes:
Selecting the service label
This is the HACMP smit screen for adding a service IP label in the standard
configuration path.
The popup for the IP Label/Address field gives us a list of the IP labels that were found
in /etc/hosts but not associated with NICs. This could be quite a long list depending on
how many entries there are in the /etc/hosts file. In practice, though, the list is fairly
short as /etc/hosts on cluster nodes tends to only include IP labels which are important
to the cluster.
The service IP label that we intend to associate with the xweb resource group's
application is xweb.
* IP Label/Address
* Network Name
Notes:
Choosing the network name
Although not shown, another menu will display prompting you to choose the network to
which this Service IP label belongs. The automatically generated network names are a
bother to type, so we've used the popup list, which contains the only IP network defined
on this cluster.
Notice that the popup list entry names the network and indicates the IP subnets
associated with each network. This is potentially useful information at this point, as we
must specify a service IP label which is not in either of these subnets to satisfy the
rules for IPAT via IP aliasing.
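That subnet rule can be checked mechanically. A sketch using the example addresses; the per-octet arithmetic assumes a contiguous netmask:

```shell
# Print "yes" if two addresses fall in the same subnet under the given mask,
# "no" otherwise. For IPAT via aliasing, the candidate service address must
# print "no" against every boot subnet.
same_subnet() {
  echo "$1 $2 $3" | awk '{
    split($1, a, "."); split($2, b, "."); split($3, m, ".");
    same = 1;
    for (i = 1; i <= 4; i++) {
      block = 256 - m[i];         # host-block size contributed by this octet
      if (int(a[i] / block) != int(b[i] / block)) same = 0;
    }
    print (same ? "yes" : "no");
  }'
}
# xweb (192.168.5.92) against the example boot subnets; both should be "no":
same_subnet 192.168.5.92 192.168.15.29 255.255.255.0
same_subnet 192.168.5.92 192.168.16.29 255.255.255.0
```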
Menu filled in
This screen shows the parameters for the xweb resource group's service IP label.
When we're sure that this is what we intend to do, press Enter to define the service IP
label. The label is then available from a pick list when you add resources to a resource
group later.
Service IP Labels/Addresses
Application Servers
Volume Groups, Logical Volumes and Filesystems
Concurrent Volume Groups and Logical Volumes
Notes:
The next step is definition of highly available resources
Continuing to follow the menus, the next step is to define the application servers.
Notes:
Configuring the application server resource
We've now got to define an Application Server for the xweb resource group. This
Configure Application Servers menu displays under the Configure Resources to
Make Highly Available menu in the standard configuration path.
* Server Name
* Start Script
* Stop Script
Notes:
Filling out the add application server menu
An application server has a name and consists of a start script and a stop script. Use
full paths for the script names. The server name is then available from a pick list when
adding resources to a resource group later.
The start script should then start the application. It is a good idea for the script to wait
until it is sure that the application has completely started. The cluster manager doesn't
verify that the application has started or that the start script exits with a 0 return code.
Of course, if you configure an application monitor, the cluster manager will monitor the
startup and/or the continuous running of the application. Application monitors are not
covered in this class. They are covered in detail in the HACMP System Administration II
class, AU610.
The stop script's responsibility is to stop the application. It must not exit until the
application is totally stopped, as HACMP will start to unmount filesystems and release
other resources as soon as the stop script terminates. The attempt to release these
resources might fail if there are remnants of the application still running.
The start and stop scripts must exist and be executable on all cluster nodes defined in
the resource group (that is, they must reside on a local non-shared filesystem) or you
will not be able to verify and synchronize the cluster. If you are using the auto-correction
facility of verification, the start/stop scripts from the node where they exist will be copied
to all other nodes.
HACMP 5.2 and later provides a file collection facility to help keep the start and stop
scripts in synch. Be sure this is what you want. In most cases this is acceptable.
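The start/stop responsibilities described above can be sketched as a generic skeleton. This is not an HACMP-supplied script; the sleep command and the pid file path merely stand in for a real application such as the hypothetical xwebserver:

```shell
# Skeleton start/stop scripts: start waits until the application is confirmed
# up before exiting; stop does not return until the application is gone,
# because HACMP releases filesystems as soon as the stop script terminates.
PIDFILE=/tmp/xwebserver.pid     # hypothetical path for the example app

start_app() {
  sleep 60 &                    # stands in for launching the real application
  echo $! > "$PIDFILE"
  for i in 1 2 3 4 5; do        # poll, with a bounded wait, for startup
    kill -0 "$(cat "$PIDFILE")" 2>/dev/null && { echo started; return 0; }
    sleep 1
  done
  echo "start FAILED"
  return 1
}

stop_app() {
  pid=$(cat "$PIDFILE" 2>/dev/null) || return 0
  kill "$pid" 2>/dev/null
  wait "$pid" 2>/dev/null       # reap the process if it is our child
  while kill -0 "$pid" 2>/dev/null; do sleep 1; done
  rm -f "$PIDFILE"
  echo stopped
}

start_app && stop_app
```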
Service IP Labels/Addresses
Application Servers
Volume Groups, Logical Volumes and Filesystems
Concurrent Volume Groups and Logical Volumes
Notes:
Volume group creation
Planning your volume groups is critical, as we've discussed in a previous unit. Creating
the volume groups can be done outside of the cluster configuration process or
integrated with it. The process used when following along the Standard path is through
C-SPOC. We'll be discussing C-SPOC a little later. It is recommended that you use
C-SPOC to create your volume group definitions, whether you do it at this point or
independent of the cluster configuration process. We will learn much more about the
process you'd use if you chose to define your volume groups here, later in the course.
Notes:
Now run discovery
If you chose to create volume groups, you then need to regenerate the pick lists. This
requires using the Extended Configuration menu shown above. This applies to network
objects as well as LVM objects.
Pick list information is kept in flat files. The volume group information is in
/usr/es/sbin/cluster/etc/config/clvg_config. The IP information is kept in
/usr/es/sbin/cluster/etc/config/clip_config.
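Because these caches are plain flat files, you can inspect what discovery found. A guarded sketch using the paths above:

```shell
# Show the first few lines of each pick-list cache, if present on this node.
show_picklists() {
  for f in /usr/es/sbin/cluster/etc/config/clvg_config \
           /usr/es/sbin/cluster/etc/config/clip_config; do
    if [ -r "$f" ]; then
      echo "== $f =="
      head "$f"
    else
      echo "not present here: $f"
    fi
  done
}
show_picklists
```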
Notes:
Menu to add a resource group
Now, we are ready to create the xwebgroup Resource Group definition.
Notes:
Filling out the Add a Resource Group menu
We'll call this resource group xwebgroup. It will be defined to operate on two nodes:
usa and uk. The order is important, with usa being the home or highest priority node.
The policies will be chosen as listed in the visual. Depending on the type of resource
group and how it is configured, the relative priority of nodes within the resource group
might be quite important.
+----------------------------------------------------------------------+
|  xwebgroup                                                           |
|  ywebgroup                                                           |
+----------------------------------------------------------------------+
Notes:
Selecting the resource group
Here's the Configure HACMP Resource Groups menu in the standard configuration
path. This menu is found under the standard configuration path's top level menu.
Select Change/Show Resources for a Resource Group (standard) to get
started. When the Select a Resource Group popup appears, select the resource
group you want to work with and press Enter.
                                                       [Entry Fields]
  (resource group)                                     xwebgroup
  (nodes)                                              usa uk
  Service IP Labels/Addresses                          [xweb]
  Application Servers                                  [xwebserver]
  Volume Groups                                        [xwebvg]
  Use forced varyon of volume groups, if necessary     false
  Filesystems (empty is ALL for VGs specified)         []
Notes:
Filling out the Change/Show All Resources and Attributes for a
Resource Group menu
This is the screen for showing/changing resources in a resource group within the
standard configuration path. There really arent a lot of choices to be made: xweb is the
service IP label we created earlier and xwebserver is the application server that we just
defined. xwebvg is a shared volume group containing a the filesystems needed by the
xwebserver application. We could specify the list of filesystems in the Filesystems field
but the default is to mount/unmount all filesystems in the volume group. Not only is this
what we want, but very practical because its easier to maintain over time. This way you
dont have to continue to update the resource group as you add filesystems to the
volume group.
Remember to press Enter to actually add the resources to the resource group.
Although the Extended Path hasn't been covered in detail, the configuring of resources
in a Resource Group can be done through that path. When using the Extended Path,
there are many more options. Make note of this, as you may want to check this in the lab
or may need to know this when configuring your cluster at home.
Notes:
Using the standard configuration to synchronize and test
After you've defined or changed the cluster's topology or resources or both, you need
to:
- Verify and synchronize your changes
- Test your configuration
You can choose to ignore verification errors, but only if you are using Extended
Configuration. Deciding to do so is a decision that must be approached with the greatest
of care, because it is very unusual for a verification error to occur that can be safely
overridden.
Also, remember the earlier discussion about synchronization: any HACMP
configuration changes made on any other cluster node will be lost if you complete a
synchronization on this cluster node.
Log files are created to show progress and problems. Check /var/adm/hacmp/clverify
for the logs. These log files have been vastly improved over the years with more details
on the commands being run during verify to help in determining the problems
encountered during verify.
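A guarded sketch for locating the most recent verification logs under the directory named above:

```shell
# List the newest entries under the clverify log directory, if it exists.
show_clverify_logs() {
  dir=/var/adm/hacmp/clverify
  if [ -d "$dir" ]; then
    ls -lt "$dir" | head
  else
    echo "$dir not found; run this on a cluster node"
  fi
}
show_clverify_logs
```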
Notes:
It's a start
We have accomplished a large portion of the cluster configuration. The nodes, IP
addresses (service and boot), networks, application servers, volume groups, and
resource groups have been configured. This configuration has been synchronized
across the cluster nodes. We indicate that some level of testing could be performed at
this point. You can wait until after we do the rest of the configuration to test everything,
or break it up as we have it here.
What's left?
Recall the strong recommendation to include at least one non-IP network in your
cluster? Well, we haven't done that yet. And what about access to the cluster nodes
using a reliable non-service, non-boot IP address? We can accomplish that with a
persistent address. Finally, it is always a good idea to create backups after producing
this much good work. It is advisable to create both a snapshot of the cluster
configuration and a mksysb of the systems.
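For the mksysb half, a guarded sketch; the tape device name is an assumption, and the cluster snapshot itself is taken from the smit snapshot menus rather than from the command line here:

```shell
# Create a bootable rootvg system backup with mksysb (AIX-only, so guarded).
make_system_backup() {
  if command -v mksysb >/dev/null 2>&1; then
    mksysb -i /dev/rmt0        # -i regenerates the image.data file first
  else
    echo "mksysb not available; run this on each AIX cluster node"
  fi
}
make_system_backup
```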
Notes:
Reasons to use extended path
Here's the top-level extended configuration path menu. We need to pop over to this
path in order to perform some steps that cannot be done using the Standard
Configuration, such as defining a non-IP network, adding a persistent label, and saving
the configuration data. We will explore these steps in this unit.
Extended Configuration is also required for configuring IPAT via Replacement and
Hardware Address Takeover as well as defining an SSA heartbeat network. These are
not discussed in this course. Appendix C covers IPAT via Replacement and Hardware
Address Takeover, and Appendix D covers SSA heartbeat networks.
Finally, other reasons for using the Extended Path will be covered in the course HACMP
Administration II: Administration and Problem Determination (AU61).
Notes:
Getting to the non-IP network configuration menus
Non-IP networks are elements of the cluster's topology, so we're in the topology section
of the extended configuration path's menu hierarchy.
A non-IP network is defined by specifying the network's end-points. These end-points
are called communication devices, so we have to head down into the Communication
Interfaces/Devices part of the extended topology screens.
Notes:
The communication interfaces and devices menu
This is the communication and devices part of the extended configuration path. We will
select the Add Communication Interfaces/Devices option.
Notes:
Deciding which Add to choose
The first question we encounter is whether we want to add discovered or pre-defined
communication interfaces and devices. The automatic discovery that ran when
we added the cluster nodes earlier would have found the rs232/hdisk devices, so we
pick the Discovered option.
Select a category

    Communication Interfaces
    Communication Devices
Notes:
Is it an interface or a device?
Now we need to indicate whether we are adding a communication interface or a
communication device. Non-IP networks use communication devices as end-points
(/dev/tty1, for example), so select Communication Devices to continue.
# Node   Device        Pvid
> usa    hdisk5        000b4a7cd10c73d78
  uk     hdisk5        000b4a7cd10c73d78
> usa    /dev/tty1
  uk     /dev/tty1
  usa    /dev/tmssa1
  uk     /dev/tmssa2
Copyright IBM Corporation 2008
Notes:
We're now presented with a list of the discovered communication devices.
You can choose to add either an rs232 network (using the /dev/tty entries) or a diskhb
network (using the hdisk entries). If you're interested, we cover SSA in Appendix D.
rs232 networks
The steps to follow to create and test the rs232 network:
a. /dev/tty1 on usa is connected to /dev/tty1 on uk using a fully wired rs232 null-modem
cable (don't risk a potentially catastrophic partitioned cluster by failing to configure a
non-IP network or by using cheap cables). Select these two devices, and press
Enter to define the network.
b. Before you use this SMIT screen to define the non-IP network, verify that the
link between the two nodes is actually working.
c. For our example, the non-IP rs232 network connecting usa to uk can be tested as
follows:
i. Issue the command stty < /dev/tty1 on one node. The command should hang.
ii. Issue the command stty < /dev/tty1 on the other node. The command should
immediately report the tty's status, and the command that was hung on the first
node should also immediately report its tty's status.
iii. These commands should not be run while HACMP is using the tty.
iv. If you get the behavior described above (especially including the hang in the first
step that recovers in the second step), then the ports are probably connected
together properly (check the HACMP log files when the cluster is up to be sure).
If you get any other behavior, then you are probably using the wrong cable, or the
rs232 cable isn't connected the way that you think it is.
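The hang-then-recover check above can be wrapped in a small harness. This is a sketch only: the real probe from the text is stty < /dev/tty1, but here the probed command is a parameter so the timeout logic can be exercised with any command; nothing below is an HACMP tool.

```shell
# probe: run a command in the background and report whether it hangs
# (no exit within $2 seconds -> "hang") or returns ("responded").
# Against "stty < /dev/tty1", "hang" on the first node is the expected --
# and desired -- result until the second node opens its end of the link.
probe() {
  cmd=$1 limit=$2 i=0
  ( eval "$cmd" ) >/dev/null 2>&1 &
  pid=$!
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$i" -ge "$limit" ]; then
      kill "$pid" 2>/dev/null
      wait "$pid" 2>/dev/null
      echo "hang"
      return 1
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "responded"
}

probe 'sleep 30' 2   # -> hang (stands in for the first node's stty)
probe 'true' 2       # -> responded (the command came straight back)
```

Run against the real tty devices, "hang" followed by both sides reporting their status is the healthy pattern described in steps i and ii.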
diskhb networks
The steps to follow to configure and test a Heartbeat on Disk network:
a. Make sure you choose a pair of entries (such as /dev/hdisk5 shown in the figure),
one for each of two nodes. Note that it is actually the pvids that must match since
this is the same disk.
b. You can test the connection using the command /usr/sbin/rsct/bin/dhb_read as
follows:
- On Node A, enter dhb_read -p hdisk5 -r
- On Node B, enter dhb_read -p hdisk5 -t
- You should then see on both nodes: Link operating normally.
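If you script this check, it is easier to test dhb_read's output for the success string quoted above than to eyeball it. The dhb_read invocation is from the text; the helper name and the sample failure text below are illustrative.

```shell
# diskhb_ok: succeed only when the input contains the success string
# that dhb_read prints ("Link operating normally").
# On a real node (from the text): dhb_read -p hdisk5 -r | diskhb_ok
diskhb_ok() {
  grep -q 'Link operating normally'
}

echo 'Link operating normally' | diskhb_ok && echo PASS
echo 'dhb_read: timeout'       | diskhb_ok || echo FAIL
```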
Notes:
Benefits and risks of using persistent IP labels
Defining a persistent node IP label on each cluster node allows the cluster
administrators to contact specific cluster nodes (or write scripts which access specific
cluster nodes) without needing to worry about whether the service IP address is
currently available or which node it is associated with.
The (slight) risk associated with persistent node IP labels is that users might start using
them to access applications within the cluster. You should discourage this practice as
the application might move to another node. Instead, users should be encouraged to
use the IP address associated with the application (that is, the service IP label that you
configure into the application's resource group). Also, be careful if you decide to put the
persistent address on the same subnet as the service address for an application that
might be hosted there; this could cause some application traffic to use the persistent
address/interface, causing unpredictable behavior.
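Administration scripts can then be written against the persistent labels and keep working no matter which node currently holds a service address. The "<node>adm" naming convention below matches the usaadm label used in this unit's example but is otherwise an assumption, as is the lssrc status command shown in the generated line.

```shell
# persistent_label: map a node name to its persistent alias under the
# assumed "<node>adm" convention (usa -> usaadm, uk -> ukadm).
persistent_label() {
  printf '%sadm\n' "$1"
}

# Build admin commands that target a specific node regardless of where
# the service IP addresses currently live:
for node in usa uk; do
  echo "ssh root@$(persistent_label "$node") lssrc -ls clstrmgrES"
done
```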
Select a Node

    usa
    uk
Notes:
First, you select a node
Selecting the Add a Persistent Node IP Label/Address choice displays this prompt for
the node we'd like to define the address on.
One persistent address is supported per network on each node. Each node can have
persistent addresses defined, but they are not required.
[Entry Fields]
  Node Name                    usa
  Network Name                 [net_ether_01] +
  Node IP Label/Address        [usaadm] +
Notes:
Filling out the Add a Persistent Node IP Label/Address menu
When you're on this screen, select the appropriate IP network in the Network Name
field and the IP label/address that you want to use from the pick lists.
You can repeat these persistent menus to choose a persistent label for the other nodes.
Press Enter to finish the operation.
Synchronize

smitty hacmp -> Extended Configuration -> Extended Verification and Synchronization

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[Entry Fields]
  Verify, Synchronize or Both                     [Both]     +
  Automatically correct errors found during
    verification?                                 [No]       +
  Force synchronization if verification fails?    [No]       +
  Verify changes only?                            [No]       +
  Logging                                         [Standard] +
Notes:
The Extended Verification and Synchronization menu
This time the extended configuration path's HACMP Verification and Synchronization
screen was chosen. When the extended path version is chosen, it presents a
customization menu (shown above), which the standard path does not:
Verify, Synchronize, or Both - This option is useful to verify a change without
synchronizing it (you might want to make sure that what you are doing makes sense
without committing to actually using the changes yet). Synchronizing without verifying is
almost certainly a foolish idea except in the most exotic of circumstances.
Automatically correct errors found during verification? - This option is discussed in
the unit on problem determination. This feature can fix certain errors that clverify
detects. By default it is turned off. This option only displays if cluster services are not
started.
Force synchronization if verification fails? - This is almost always a very bad idea.
Make sure that you really and truly must set this option to Yes before doing so.
Verify changes only? - Setting this option to Yes causes the verification to focus
on aspects of the configuration that changed since the last synchronization. As a result,
the verification runs slightly faster. This might be useful during the early to middle
stages of cluster configuration. It seems rather risky once the cluster is in production.
Logging - You can increase the amount of logging related to this verification and
synchronization by setting this option to Verbose. This can be quite useful if you are
having trouble figuring out what is going wrong with a failed verification.
[Entry Fields]
* Cluster Snapshot Name                  []
  Custom defined snapshot methods        []  +
  Save cluster log files in snapshot     No  +
* Cluster Snapshot Description           []
Notes:
Saving the cluster configuration
You can save the cluster configuration to a snapshot file or to an XML file. The cluster
can be restored from either the snapshot file or the XML file. The XML file can also be
used with the Online Planning Worksheets and potentially with other applications. This
visual looks at the snapshot method, and the next visual looks at the XML method.
Creating a snapshot
smit hacmp -> Extended Configuration -> Snapshot Configuration
A snapshot captures the HACMP ODM files, which allows you to recover the cluster
definitions. There is also an info file. The info file is discussed further in the AU61
course HACMP Administration II: Administration and Problem Determination.
If necessary, there is, from the Snapshot Configuration menu, another option to restore
(apply) a snapshot.
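When scripting a backup before a change, a dated snapshot name keeps successive snapshots from colliding. The clsnapshot utility with -c/-n/-d is assumed here as the command-line equivalent of the SMIT screen; the SMIT path above is the supported route, so verify these options on your own system. The snap_name helper is purely illustrative.

```shell
# snap_name: append today's date to a base name, e.g. pre_change_20080115.
snap_name() {
  printf '%s_%s\n' "$1" "$(date +%Y%m%d)"
}

# Assumed CLI equivalent of the SMIT "create a snapshot" screen:
echo "clsnapshot -c -n $(snap_name pre_change) -d 'before resource group changes'"
```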
Copyright IBM Corp. 1998, 2008
[Entry Fields]
* File Name       []
  Cluster Notes   []
Notes:
Creating the XML file
Using Extended Configuration, you can save the cluster configuration directly to an XML
file via the menu Export Definition File for Online Planning Worksheets, or from a
snapshot via the Snapshot Configuration menu Convert Existing Snapshot For
Online Planning Worksheets.
Once created, you can use the Online Planning Worksheets to get an updated view of
the configuration, to change the configuration, or both. The XML file can potentially be
used from other applications, or manually, to make and display configuration information.
This will be explored in the lab exercise for this course. For the moment, in case you
want to know, the command to apply an XML file is:
/usr/es/sbin/cluster/utilities/cl_opsconfig
[Entry Fields]
* Communication Path to Takeover Node    [ukboot1]            +
* Application Server Name                [xwebserver]
* Application Server Start Script        [/mydir/xweb_start]
* Application Server Stop Script         [/mydir/xweb_stop]
* Service IP Label                       [xweb]               +
Notes:
The two-node cluster configuration assistant smit menu
If you have a simple two-node, hot-standby cluster to configure, the Two-Node Cluster
Configuration Assistant might be the answer. Here is the menu.
If your network is set up correctly and you have configured a shared enhanced
concurrent mode volume group, then HACMP will use this menu to build a complete
two-node cluster, including topology, resources, a resource group, and a non-IP network
using heartbeat over disk.
Also, synchronization is done, and you are all ready to start cluster services on both
nodes.
The example in the visual is run from the usa node. This makes usa the home node
(highest priority) in the resource group that is created. You will have defined the boot
addresses on both usa and uk and created any shared volume groups on both nodes.
Notes:
Seeing what was done
One utility that displays what was done is the cltopinfo command. This command
displays the cluster's topology. Notice that each node's IP labels on the Ethernet
adapters have been defined on the net_ether_01 HACMP network. The non-IP diskhb
network was also configured and appears with communication devices (/dev/hdisk5) on
each of the two nodes. Notice which policies are automatically configured when
using this approach.
Another utility is cldisp. This command shows what is configured from the application
point of view.
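When checking the result from a script, it is handy to pull just the network names out of the topology report. The "Network <name>" line format below is assumed from the course examples; verify it against the cltopinfo output on your own system before relying on it.

```shell
# list_networks: print the network names from cltopinfo-style output
# (lines of the form "Network <name> ..." -- format assumed, check yours).
# Real usage would be: cltopinfo | list_networks
list_networks() {
  awk '$1 == "Network" { print $2 }'
}

# Exercise the filter with sample report text:
printf 'Network net_ether_01\nNODE usa:\nNetwork net_diskhb_01\n' | list_networks
```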
Points to observe
- The Two-Node Configuration Assistant did everything -- created topology objects
including a non-IP heartbeat over disk network when it saw an enhanced concurrent
volume group, created resource groups, and verified and synchronized the cluster.
- The Two-Node Configuration Assistant assigns names, so you will have to decide if
you like them.
- The assistant also assigns to HACMP all the network adapters it finds. You might have
to remove the ones for interfaces that you don't want HACMP to have.
- Only one application and two nodes are supported.
- You need to pre-configure the shared volume group. If it is Enhanced Concurrent
Mode, then a non-IP heartbeat over disk network is configured. Otherwise, you are on
your own to configure a non-IP network.
- The Fallback policy is set to Never Fallback.
- No persistent labels or rs232 non-IP networks are defined.
Notes:
Cluster configuration is implemented
Wow! All is done except for starting Cluster Services.
Notes:
How to start HACMP Cluster Services
Starting Cluster Services involves a trip to the top-level HACMP menu because we
need to go down into the System Management (C-SPOC) part of the tree. C-SPOC will
be covered in more detail in the next unit.
It might be worth pointing out that if you use the Web-based SMIT for HACMP fileset,
then there is a navigation menu that allows you to skip from one menu path to another
without having to go back to the top.
After a few times, you will probably learn to use the command smit clstart or smitty
clstart to bypass this menu and the next two menus.
Notes:
The C-SPOC menu
Choose Manage HACMP Services next.
Notes:
The Manage HACMP Services menu
We're almost there...
[Entry Fields]
* Start now, on system restart or both        now            +
  Start Cluster Services on these nodes       [usa,uk]       +
* Manage Resource Groups                      Automatically  +
  BROADCAST message at startup?               true           +
  Startup Cluster Information Daemon?         true           +
  Ignore verification errors?                 false          +
  Automatically correct errors found during
    cluster start?                            Interactively  +
Notes:
Startup choices
There are a few choices to make. For the moment, we will just recommend the defaults,
except selecting both nodes and turning on the Cluster Information Daemon. The other
options are discussed in the next unit in more detail.
Removing a cluster

Use Extended Topology Configuration:

Configure an HACMP Cluster

Move cursor to desired item and press Enter.

    Add/Change/Show an HACMP Cluster
    Remove an HACMP Cluster
    Reset Cluster Tunables

# > /usr/es/sbin/cluster/etc/rhosts
Notes:
Starting over
If you have to start over, you can:
- Stop cluster services on all nodes.
- Use Extended Configuration, as shown above, to remove the cluster (on all nodes).
- Remove the entries (but not the file) from /usr/es/sbin/cluster/etc/rhosts (on all
nodes).
If you really want to start over, then you can:
- installp -u cluster
- rm -r /usr/es/* (be very careful here)
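"Remove the entries but not the file" is simply a truncation. Sketched here against a temporary stand-in for /usr/es/sbin/cluster/etc/rhosts so it can be tried safely off-cluster:

```shell
# Empty a file in place without deleting it -- on a real node the target
# would be /usr/es/sbin/cluster/etc/rhosts, repeated on every node.
RHOSTS=$(mktemp)                  # stand-in for the real rhosts file
printf 'usa\nuk\nindia\n' > "$RHOSTS"
: > "$RHOSTS"                     # truncate: entries gone, file remains
wc -c < "$RHOSTS"                 # byte count is now zero
```

Deleting the file instead of truncating it would break clcomd's host checking, which is why the text says to keep the file itself.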
We're there!

We've configured a two-node cluster with multiple resource
groups, including all the steps:
- Each resource group has a different home (primary) node
- Each resource group falls back to its home node on recovery
Notes:
Mutual takeover completed
We've finished configuring a two-node HACMP cluster with two resource groups
operating in a mutual takeover configuration. The term mutual takeover derives from the
fact that each node is the home node for one resource group and provides fallover (that
is, takeover) services to the other node.
This is, without a doubt, the most common style of HACMP cluster, as it provides a
reasonably economical way to protect two separate applications. It also keeps the folks
with budgetary responsibility happier because each of the systems is clearly doing
something useful all the time (many would argue that a system that is just acting as a
standby for a critical application is doing something useful, but it is a lot easier to make
the case if both systems are actually running an important application at all times).
The cluster even has the mandatory non-IP network!
Checkpoint
1. True or False?
It is possible to configure a recommended simple two-node cluster environment using
just the standard configuration path.
2. In which of the top-level HACMP menu choices is the menu for starting and
stopping cluster nodes?
a. Initialization and Standard Configuration
b. Extended Configuration
c. System Management (C-SPOC)
d. Problem Determination Tools
3. In which of the top-level HACMP menu choices is the menu for defining a non-IP
heartbeat network?
a. Initialization and Standard Configuration
b. Extended Configuration
c. System Management (C-SPOC)
d. Problem Determination Tools
4. True or False?
It is possible to configure HACMP faster by having someone help you on the other
node.
5. True or False?
You must specify exactly which filesystems you want mounted when you put resources
into a resource group.
Notes:
Some notes from the developer :-)
This is a photograph of Lake Louise in the Canadian Rocky Mountains (located about a
90-minute drive west of Calgary). If you are ever there, make sure that you rent one of
the canoes in the photograph and go for a paddle out on the lake. There are also a
number of quite spectacular and not particularly strenuous hikes that start from near the
point where this photograph was taken. The hike that goes up to the tea house is
definitely worth an afternoon (you can pay money to go up on horseback if you don't
feel like walking for free).
Also, can you read this? ;-)
Aoccdrnig to a rscheearchr at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the
ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer is at the rghit
pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is
bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe.
Checkpoint solutions
1. True or False?
It is possible to configure a recommended simple two-node cluster environment
using just the standard configuration path.
False. You can't create the non-IP network from the standard path.
2. In which of the top-level HACMP menu choices is the menu for starting and
stopping cluster nodes?
c. System Management (C-SPOC)
3. In which of the top-level HACMP menu choices is the menu for defining a non-IP
heartbeat network?
b. Extended Configuration
4. True or False?
It is possible to configure HACMP faster by having someone help you on the other
node.
False. Configuration changes are made on one node and then synchronized to the
other nodes.
5. True or False?
You must specify exactly which filesystems you want mounted when you put
resources into a resource group.
False. Leaving the Filesystems field empty mounts all the filesystems in the volume
groups specified for the resource group.
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1: Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
HACMP manuals: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
Unit objectives
After completing this unit, you should be able to:
Use the SMIT Standard and Extended menus to make
topology and resource group changes
Describe the benefits and capabilities of C-SPOC
Perform routine administrative changes using C-SPOC
Start and stop Cluster Services
Perform resource group move operations
Discuss the benefits and capabilities of DARE
Use the snapshot facility to return to a previous cluster
configuration or to roll back changes
Configure and use Web SMIT
Notes:
Introduction
We're now going to embark on a series of hypothetical scenarios to illustrate a number
of routine cluster administration tasks. Some of these scenarios are more realistic than
others.
[Entry Fields]
* Resource Group Name                          [zwebgroup]
* Participating Nodes (Default Node Priority)  [usa uk] +

Does the order in which the node names are specified matter?
Notes:
Add a resource group
We use the Extended path. The resource group is configured to start up on whichever
node is available first and to never fall back when a node rejoins the cluster. The
combination of these two parameters should go a long way towards minimizing this
resource group's downtime.
If you're familiar with the older terminology of cascading and rotating resource groups,
this resource group's policies make it essentially identical to a cascading without
fallback resource group.
Notes:
Introduction
We need to define a service IP label for the zwebgroup resource group.
There are no other limits on the number of resource groups with service labels that can
be configured on an IPAT via IP aliasing network (although, eventually, you run out of
CPU power, memory, or something else for all the applications associated with these
resource groups).
Network name
The next step is to associate this Service Label with one of the HACMP networks. This
is not shown in the visual.
Notes:
Adding a service IP label
The visual shows the entry fields for this panel.
* Server Name
* Start Script
* Stop Script
Notes:
Add an application server
You must give it a name and specify a start and stop script.
[Entry Fields]
  Resource Group Name                               zwebgroup
  Participating Nodes (Default Node Priority)       usa uk

  Startup Behavior
  Fallover Behavior
  Fallback Behavior
  Fallback Timer Policy (empty is immediate)        []            +

  Service IP Labels/Addresses                       [zweb]        +
  Application Servers                               [zwebserver]  +

  Volume Groups                                     [zwebvg]      +
  Use forced varyon of volume groups, if necessary  false         +
  Automatically Import Volume Groups                false         +
  Filesystems (empty is ALL for VGs specified)      []            +
  Filesystems Consistency Check                     fsck          +
[MORE...17]
Notes:
Adding resources to a resource group (extended path)
This is the first of two screens to show the Extended Path menu for adding attributes.
Unlike the Standard path, it contains a listing of all the possible attributes.
[Entry Fields]
  Filesystems Consistency Check                     fsck          +
  Filesystems Recovery Method                       sequential    +
  Filesystems mounted before IP configured          false         +
  Tape Resources                                    []            +
  Raw Disk PVIDs                                    []            +
  Miscellaneous Data                                []
  WPAR Name                                         []            +
[BOTTOM]
Notes:
Adding resources to a resource group (extended path)
More choices.
New choices for HACMP 5.4.1 include the NFS V4 entries and the WPAR name.
[Entry Fields]
  Verify, Synchronize or Both                     [Both]     +
  Automatically correct errors found during
    verification?                                 [No]       +
  Force synchronization if verification fails?    [No]       +
  Verify changes only?                            [No]       +
  Logging                                         [Standard] +
Notes:
Extended path synchronization
This is the Extended path screen to show the Synchronization menu options that are
not shown in the Standard path.
Notes:
Expanding the cluster
In this scenario, we'll look at adding a node to a cluster.
6. Add the new node to the existing cluster (from one of the existing nodes)
7. Add non-IP networks for the new node
8. Synchronize your changes
9. Start Cluster Services on the new node
10. Add the new node to the appropriate resource groups
11. Synchronize your changes again
12. Run through your (updated) test plan
Notes:
Adding a new cluster node
Adding a node to an existing cluster isn't all that difficult from the HACMP perspective
(as we will see shortly). The hard work involves integrating the node into the cluster
from an AIX perspective and from an application perspective.
We'll be discussing the HACMP part of this work.
[Entry Fields]
* Cluster Name                                   [ibmcluster]
  New Nodes (via selected communication paths)   [indiaboot1] +
  Currently Configured Node(s)                   usa uk
Notes:
Add node: Standard path
This operation, and any other SMIT HACMP operations, must be performed from an
existing cluster node. The india node won't become an existing cluster node until we
synchronize our changes in a few pages, so use an existing node until the cluster is
synchronized.
Cluster Name
SMIT fills this field in based on the previous value. Leave as is or change. The name
that you assign to your cluster is pretty much arbitrary. It appears in log files and the
output of commands.
New Nodes
The new nodes are specified by giving the IP label or IP address of one currently active
network interface on each node. Use F4 to generate a list, or type one resolvable IP
label or IP address for each node. If there is more than one node, they should be
space separated.
This path will be taken to initiate communication with the node.
The command launched by this SMIT screen contacts the clcomd at each address and
asks them to come together in a new cluster. Obviously, HACMP must already be
installed on the new nodes.
Notes:
Add node: Standard path (in progress)
When the Enter key is pressed on the previous SMIT screen, HACMP's automatic
discovery process begins. When the nodes have been identified, the discovery process
retrieves the network and disk configuration information from each of the cluster nodes
and builds a description of the new cluster. The network configuration information is
used to create the initial IP network configuration.
The remainder of the output from this SMIT operation isn't particularly interesting
(unless something goes wrong), so we'll just ignore it for now. You will get an
opportunity to add a node in the lab exercises.
* Node Name
Communication Path to Node
Notes:
Add node: Extended path
The Extended Path is essentially the same as the Standard Path in this case.
Be aware that at this point you've only configured the node definition. You must also
configure the adapter definitions (boot adapter definitions). To do this, use the
Extended path: Extended Topology, Communication Interfaces/Devices.
# Node    Device   Device Path   Pvid
  usa     tty0     /dev/tty0
  uk      tty0     /dev/tty0
  india   tty0     /dev/tty0
> usa     tty1     /dev/tty1
  uk      tty1     /dev/tty1
> india   tty1     /dev/tty1
  usa     tty2     /dev/tty2
  uk      tty2     /dev/tty2
  india   tty2     /dev/tty2
Notes:
Introduction
This visual, and the next one, show how to add two more non-IP networks to our cluster.
Make sure that the topology of the non-IP networks that you describe to HACMP
corresponds to the actual topology of the physical rs232 cables.
In the following notes, we discuss why we need to add two more non-IP rs232 links.
Note that if you are using heartbeat on disk, the same two steps are required. There
must be a unique disk shared between india and usa, and between india and uk, to
define the two heartbeat on disk networks (one between india and usa, the other
between india and uk). You can't use the same hdisk on one node for heartbeat on
disk networks with two different nodes.
V4.0
Student Notebook
Uempty
Mesh configuration
The most redundant configuration would be a mesh, each node connected to every
other node. However, if you have more than three nodes, this means extra complexity
and can mean a lot of extra hardware, depending on which type of non-IP network you
are using.
Note: For a three node cluster, a ring and a mesh are the same.
Three-node example
In the example in the visual, we already have a non-IP network between usa and uk; so
we need to configure one between india and usa (on this page) and another one
between uk and india (on the next page).
If, for example, we left out the uk and india non-IP network, then the loss of the usa
node would leave the uk and india nodes without a non-IP path between them.
Five-node example
In even larger clusters, it is still necessary to configure only a ring of non-IP networks.
For example, if the nodes are A, B, C, D, and E, then five non-IP networks would be the
minimum requirement: A to B, B to C, C to D, E to F, and F to A being one possibility. Of
course, other possibilities exist, such as A to B, B to D, D to C, C to E, and E to F.
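The ring rule generalizes: with the nodes listed in some order, link each node to the next one and close the loop, so every node ends up in exactly two links. A small sketch of that rule (the function name is ours, not an HACMP utility):

```shell
# ring_links: print the minimal ring of point-to-point non-IP networks
# for the nodes given as arguments (each node appears in exactly two links).
ring_links() {
  local nodes=("$@")
  local n=${#nodes[@]} i
  for ((i = 0; i < n; i++)); do
    # link node i to node i+1, wrapping the last node back to the first
    echo "${nodes[i]}-${nodes[(i + 1) % n]}"
  done
}

ring_links A B C D E   # -> A-B  B-C  C-D  D-E  E-A (one link per line)
```

For the three-node cluster in this unit, ring_links usa uk india yields exactly the three networks being configured, which is also why a ring and a mesh coincide at three nodes.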
# Node    Device   Device Path   Pvid
  usa     tty0     /dev/tty0
  uk      tty0     /dev/tty0
  india   tty0     /dev/tty0
  usa     tty1     /dev/tty1
  uk      tty1     /dev/tty1
  india   tty1     /dev/tty1
  usa     tty2     /dev/tty2
> uk      tty2     /dev/tty2
> india   tty2     /dev/tty2
Notes:
Define non-IP networks
Make sure that the topology of the non-IP networks that you describe to HACMP
corresponds to the actual topology of the physical rs232 cables.
[Entry Fields]
  Force synchronization if verification fails?    [No]       +
  Verify changes only?                            [No]       +
  Logging                                         [Standard] +
Notes:
Synchronize
At this point, all this configuration exists only on the node where the data was entered.
To populate the other nodes' HACMP ODMs, you must synchronize. When we've
synchronized our changes, the india node is an official member of the cluster.
[Entry Fields]
* Start now, on system restart or both        now            +
  Start Cluster Services on these nodes       [india]        +
* Manage Resource Groups                      Automatically  +
  BROADCAST message at startup?               true           +
  Startup Cluster Information Daemon?         false          +
  Ignore verification errors?                 false          +
  Automatically correct errors found during
    cluster start?                            Interactively  +
Notes:
Start Cluster Services on the new node
Now that india is an official member of the cluster, we can start Cluster Services on the
node.
This and all future SMIT HACMP operations can be performed from any of the three
cluster nodes.
Notes:
Add the node to a resource group
Remember that adding the new india node to the HACMP configuration is the easy
part. You would not perform any of the SMIT HACMP operations shown so far in this
scenario until you were CERTAIN that the india node was actually capable of running
the application.
Notes:
Removing a node
In this scenario, we take a look at how to remove a node from an HACMP cluster.
Ensure that each resource group is left with at least two nodes
Notes:
Removing a node
Although removing a node from a cluster is another fairly involved process, some of the
work has little, if anything, to do with HACMP.
Removing an application
The zwebserver application has been causing problems and a
decision has been made to move it out of the cluster
[Figure: cluster nodes usa and uk — the zwebserver application marked X for removal]
Notes:
Removing an application
In this scenario, we remove a resource group.
It looks like this imaginary organization could do with a bit more long-range planning.
Notes:
Introduction
The procedure for removing a resource group is actually fairly straightforward.
Cluster snapshot
HACMP supports something called a cluster snapshot. This would be an excellent time
to take a cluster snapshot, just in case we decide to go back to the old configuration.
We will discuss snapshots later in this unit.
the case of shared volume groups, tie up physical resources that could presumably be
better used elsewhere.
A cluster should not have any useless resources or components because anything
that simplifies the cluster tends to improve availability by reducing the likelihood of
human error.
[SMIT pick list: resource groups — xwebgroup, ywebgroup, zwebgroup]
Notes:
Removing a resource group
Make sure that you delete the correct resource group.
7-31
Student Notebook
[SMIT confirmation dialog: ARE YOU SURE? ... before continuing.]
Press Enter (if you are sure). Be sure to synchronize and run
through validation testing.
Copyright IBM Corporation 2008
Notes:
Are you sure?
Pause to make sure you know what you are doing. If you aren't sure, it's easy to go
back and step through the process again.
2. You have decided to add a third node to your existing
two-node HACMP cluster. What very important step follows
adding the node definition to the cluster configuration
(whether through Standard or Extended Path)?
a. Take a well-deserved break, bragging to co-workers about
your success.
b. Install HACMP software.
c. Configure a non-IP network.
d. Start Cluster Services on the new node.
e. Add a resource group for the new node.
Notes:
Introduction
You must develop good change management procedures for managing an HACMP
cluster. As you will see, C-SPOC utilities can be used to help, but do not do the job by
themselves. Having well-documented and tested procedures to follow, as well as
restricting who can make changes (for example, you should not have more than two or
three persons with root privileges), minimizes loss of availability when making changes.
The snapshot utility should be used before any change is made.
Recommendations
- Implement and adhere to a change control/management process
- Wherever possible, use HACMP's C-SPOC facility to make changes to the cluster
  (details to follow)
- Document routine operational procedures in a step-by-step list fashion (for
  example, shutdown, startup, increase size of a filesystem)
- Restrict access to the root password to trained High Availability cluster
  administrators
- Always take a snapshot (explained later) of your existing configuration before
  making a change
Notes:
Some beginning recommendations
These recommendations are considered to be the minimum acceptable level of cluster
administration. There are additional measures and issues that should probably be
carefully considered (for example, problem escalation procedures should be
documented, and both hardware and software support contracts should either be kept
current or a procedure developed for authorizing the purchase of time and materials
support during off hours should an emergency arise).
As the cluster administrator, you should make yourself part of every change
meeting that occurs on your HACMP systems. Think about the implications of the
change on the cluster configuration and function, keeping in mind the networking
concepts we've discussed as well as any changes to the application's data
organization or start/stop procedures.
- The onus should be on the requester of the change to demonstrate that it is
  necessary, not on the cluster administrators to demonstrate that it is unwise.
- Management must support the process. Defend cluster administrators against
  unreasonable requests or pressure. Do not allow politics to affect a change's
  priority or schedule.
- Every change, even the minor ones, must follow the process. No system, cluster,
  or database administrator can be allowed to sneak changes past the process. The
  notion that a change might be permitted without following the process must be
  considered absurd.
Other recommendations
Ensure that you request sufficient time during the maintenance window for testing the
cluster. If this isn't possible, advise all parties of the risks of running without testing.
Update any documentation as soon as possible after the change is made to reflect the
new configuration or function of the cluster, if anything changes.
[Figure: C-SPOC command execution — an initiating node propagates the command to three target nodes]
Notes:
C-SPOC command execution
C-SPOC commands first execute on the initiating node. Then the HACMP command
cl_rsh is used to propagate the command (or a similar command) to the target nodes.
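The initiate-then-propagate pattern can be sketched in plain shell. This is a minimal dry run, not the real C-SPOC internals: `cl_rsh` is stood in by `echo`, and the node list is an illustrative assumption.

```shell
RSH_CMD="echo cl_rsh"        # dry run; HACMP actually uses cl_rsh here
TARGET_NODES="uk india"      # illustrative target node list

run_everywhere() {
    cmd="$1"
    # Execute on the initiating node first...
    eval "$cmd"
    # ...then propagate the same command to each target node.
    for node in $TARGET_NODES; do
        $RSH_CMD "$node" "$cmd"
    done
}

run_everywhere "echo chfs -a size=+1G /xwebfs"
```

Because the remote shell is replaced by `echo`, the sketch simply prints the local command followed by the two remote invocations it would have issued.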
More details
All the nodes in the resource group must be available or the C-SPOC command will be
performed partially across the cluster, only on the active nodes. This can lead to
problems later when nodes are brought up and are out of sync with the other nodes in
the cluster.
As you saw in the LVM unit, LVM changes, if made through C-SPOC, can be
synchronized automatically (for enhanced concurrent mode volume groups, and then
only the LV information, not the filesystem information).
Notes:
Top-level C-SPOC menu
The top-level C-SPOC menu is one of the four top-level HACMP menus.
C-SPOC scripts are used for Users, LVM, CLVM, and Physical Volume Management.
RGmove is used for Resource Group management.
The other functions are included here as a logical place to put these system
management facilities. We will look at Managing Cluster Services and the Logical
Volume Management tasks.
The fast path is smitty cl_admin.
[SMIT screen: Start Cluster Services]
  Start now, on system restart or both               now
  Start Cluster Services on these nodes              [usa,uk]
  Manage Resource Groups                             Automatically
  BROADCAST message at startup?                      true
  Startup Cluster Information Daemon?                true
  Ignore verification errors?                        false
  Automatically correct errors found during          Interactively
    cluster start?
Notes:
Briefly, how did we get here?
The first choice in the C-SPOC menu is Manage HACMP Services. This option brings
up another menu containing three choices: Start Cluster Services, Stop Cluster
Services, and Show Cluster Services. This menu displays when we select Start
Cluster Services. Better yet, just use the fast path, smitty clstart.
You have a choice of any or all nodes in the cluster to start services. Use F4 to get a
pick list. If the field is left blank, services will be started on all nodes.
When Cluster Services is started, it wants to acquire resources in Resource Groups, if
so configured, and make applications available. Beginning with HACMP V5.4, the
function of managing resource groups can be deferred. The option to choose in that
case is Manually. To allow Cluster Services to acquire resources and make
applications available if so configured (pre-HACMP v5.4 behavior), choose the default,
Automatically.
You can broadcast a message that cluster services are being started.
You have the option to start the Client Information Daemon, clinfo, along with the start of
Cluster Services. This is usually a good idea as it allows you to use the clstat cluster
monitor utility.
Finally, there are options regarding verification. Before Cluster Services is started, a
verification is run to ensure that you are not starting a node with an inconsistent
configuration. You can choose to ignore verification errors and start anyway. This is not
something that you would do unless you are very aware of the reason for the
verification error, you understand the ramifications of starting with the error and you
must activate Cluster Services. An alternative that is safer would be to choose to
Interactively correct errors found during verification. Not all errors can be corrected,
but you have a better chance of getting cluster services activated in a clean
configuration with this option.
The options that you choose here are retained in the HACMP ODM and repopulated on
reentry.
usa # clstat -a
                clstat - HACMP Cluster Status Monitor
                -------------------------------------
Cluster: ibmcluster (1156578448)                Wed Aug 30 11:16:19 2006
State: UP                       Nodes: 2
SubState: STABLE

Node: usa
  Interface: usaboot1 (2)
  Interface: usaboot2 (2)
  Interface: usa_hdisk5_01 (0)
  Interface: xweb (2)
  Resource Group: xwebgroup
Also consider using the cldump command. This relies solely on SNMP to get the
current cluster status.
Notes:
The three rules
Patience is key with HACMP tasks. There are many things going on under the covers
when you ask the Cluster Manager to do something. Getting the OK in SMIT does
not mean that the task has been completely performed. It's just the beginning in many
cases.
Did I mention patience?
The Cluster Manager daemon queues events. It doesn't forget (usually, anyway). So
keep in mind that if you launch a task with the Cluster Manager and don't verify its
status closely, and then attempt to give the process a boost by launching another task
(such as following an rgmove with an offline), you have just queued the second task.
When the Cluster Manager completes the first task, provided that it's in a state where it
can continue processing, it will perform the second task. This might not be what you
wanted.
It's easy to encourage patience when writing a course. The author is extremely
impatient and rarely follows his own advice. That doesn't make it right! I have learned
the value of patience the hard way, by not being patient and paying the price.
Notes:
Base Cluster Logs
The cluster.log file is a good starting point to see what events have been run. You can also
see errors and timestamps to help in navigating the hacmp.out log file. It can be said that
looking at the hacmp.out file is as much art as it is science. The more comfortable you
become with what you expect to see, the easier it will be to navigate. As you see, the
format of the entries helps you to understand what is being done, on what resource, and
how long it's been running.
More detailed log
You might also want to consult the clstrmgr.debug log. This is the Cluster Manager
daemon log. It can be difficult to understand, as it contains very detailed internal
processing, but error messages found here might be useful, as well as an indication of
whether the Cluster Manager is busy doing something even when no event processing is
occurring.
[SMIT screen: Stop Cluster Services]
  Stop now, on system restart or both                now
  Stop Cluster Services on these nodes               [usa]
  BROADCAST cluster shutdown?                        true
  Select an Action on Resource Groups                Bring Resource Groups>
  (pop-up: Shutdown mode)
Notes:
Briefly, how did we get here?
From the Manage HACMP Services C-SPOC menu, this menu displays when we
choose Stop Cluster Services. You can use the fast path, smitty clstop.
You have a choice of any or all nodes in the cluster to stop services. Use F4 to get a
pick list. If the field is left blank, services will be stopped on all nodes.
You can broadcast a message that cluster services are being stopped.
Finally, there are the options regarding Resource Group management. Prior to HACMP
V5.4, the options were graceful, takeover, and forced. Graceful meant to Bring Resource
Groups Offline prior to stopping cluster services. Takeover meant to Move Resource
Groups to other available nodes, if applicable, according to the current locations and
Fallover policies of the Resource Groups. As you can see, these options map directly to
the current options, and their functions are self-explanatory.
But what about forced down, you say? Prior to HACMP V5.4, forcing down Cluster
Services was supported sometimes, in some scenarios, and resulted in an environment
that was potentially unstable (that is, potentially unavailable). Forcing cluster services
down when using Enhanced Concurrent Mode Volume Groups was not supported
because Group Services and gsclvmd were brought down as part of the forced down
operation. Group Services and gsclvmd are the components that maintain the volume
group's VGDA/VGSA integrity across all nodes. With HACMP V5.4 and later, forcing
down cluster services is supported by moving the resource groups to an Unmanaged
state. In addition, the cluster manager and the RSCT infrastructure remain active,
permitting this action with Enhanced Concurrent Mode Volume Groups; thus the option
in the menu shown, Unmanage Resource Groups. While in this state, the cluster
manager remains in the ST_STABLE state. It doesn't die gracefully and respawn as
stated earlier, and doesn't return to the ST_INIT state. This allows the Cluster Manager
to participate in cluster activities and keep track of changes that occur in the cluster.
As with starting cluster services, the options that you choose here are retained in the
HACMP ODM and repopulated on reentry.
uk # clstat -a
                clstat - HACMP Cluster Status Monitor
                -------------------------------------
Cluster: ibmcluster (1156578448)                Wed Aug 30 10:44:20 2006
State: UP                       Nodes: 2
SubState: STABLE

Node: usa                       State: DOWN
  Interface: usaboot1 (2)       Address: 192.168.15.29
                                State: DOWN
  Interface: usaboot2 (2)       Address: 192.168.16.29
                                State: DOWN

The three rules:
1. patience
2. patience
3. patience
Notes:
Stopping cluster services without going to unmanaged
This means you've chosen to stop cluster services either with the Bring Resource
Groups Offline or the Move Resource Groups option. In other words, it's not a forced
down.
As with starting cluster services, remember that patience is essential. Many tasks are
performed behind the scenes when you ask the Cluster Manager to do something.
Getting the OK in SMIT does not mean that the task has been completely performed.
It's just the beginning in many cases.
Did I mention patience?
Although you might find the output to be unreliable at times, the clstat utility is a good
mechanism to use. Note that it was run on another system, not the one where cluster
services was stopped. If you're not a fan of clstat, consider using cldump, which relies
on SNMP directly.
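Status output like clstat's is easy to scrape in a script. A small sketch, using a captured snippet as stand-in data (on a live node you would capture the real command's output instead):

```shell
# Hypothetical captured snippet of clstat output.
clstat_out='clstat - HACMP Cluster Status Monitor
Cluster: ibmcluster (1156578448)
State: UP
Nodes: 2
SubState: STABLE'

# Pull out the cluster State and SubState fields.
state=$(printf '%s\n' "$clstat_out"    | awk -F': ' '/^State:/    {print $2; exit}')
substate=$(printf '%s\n' "$clstat_out" | awk -F': ' '/^SubState:/ {print $2; exit}')
echo "state=$state substate=$substate"
```

A wrapper like this is handy for polling in a loop until the cluster reaches the state you expect, rather than watching the screen.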
Another option is to use lssrc. This is to be used with caution. You must understand
what state is expected and then be patient, retrying the command to ensure that the
state changes are no longer occurring. A state of ST_INIT is the indication that Cluster
Services has stopped on this node. This is the resulting state from a respawn of the
Cluster Manager daemon. As you will see in the next visual, stopping Cluster Services
with Unmanaged Resource Groups leaves the Cluster Manager daemon in
ST_STABLE. Know what state to expect.
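The state check can be scripted along these lines; the sample line is a stand-in for what `lssrc -ls clstrmgrES` reports on a live node:

```shell
# Simulated "Current state" line from lssrc -ls clstrmgrES output.
state_line='Current state: ST_INIT'

current_state=${state_line#Current state: }   # strip the label

# Interpret the state the way the text describes.
case "$current_state" in
    ST_INIT)   echo "cluster services stopped on this node" ;;
    ST_STABLE) echo "cluster manager active (stable, or stopped unmanaged)" ;;
    *)         echo "transitional state $current_state: be patient and retry" ;;
esac
```

In practice you would run this repeatedly until the reported state stops changing, since transitional states are expected while the Cluster Manager works through queued events.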
Finally, although not shown (due to lack of space on the visual), another option is to use
WebSMIT. This is the solution for those of you who want to see a graphical
representation of cluster status. You will see more of that later in this unit.
[clstat output excerpt: node usa with interfaces UP (addresses 192.168.15.29, 192.168.5.92); the resource group shows State: Unmanaged]
Notes:
Stopping cluster services with unmanaged resource groups
This means you've chosen to force down cluster services.
One more time, remember that patience is essential. Did I mention that getting the OK
in SMIT does not mean that the task has been completely performed? It's just the
beginning in many cases.
Did I mention patience?
Again, the clstat utility can be a good mechanism to use. Note that it was run on another
system, not the one where cluster services was stopped. Notice that the resource group
shows online. This is valid. It also shows the state as Unmanaged. You only stopped
Cluster Services, not the resources.
The quickest way to see that there are unmanaged resources is to use clRGinfo. Note
that it shows the state of the resource group as Unmanaged on both nodes. In fact, it
will show Unmanaged on any node where that resource group can be acquired, if the
resource group's startup policy is not online on all nodes. It will show Unmanaged only
on the node where Cluster Services was stopped if the resource group startup policy is
online on all nodes.
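A script can spot unmanaged resources the same way. A sketch, with an illustrative snippet of output standing in for running `clRGinfo` on a live cluster:

```shell
# Simulated clRGinfo output after a forced-down stop.
rginfo='-----------------------------------------------------------
Group Name     State            Node
-----------------------------------------------------------
xwebgroup      UNMANAGED        usa
               UNMANAGED        uk'

# Count how many node entries report the UNMANAGED state.
count=$(printf '%s\n' "$rginfo" | grep -c 'UNMANAGED')
echo "UNMANAGED entries: $count"
```

A non-zero count is a quick cue that Cluster Services was stopped without releasing the resources.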
As in the previous slides on verifying the state, another option is to use WebSMIT. This
is the solution for those of you who want to see a graphical representation of cluster
status. You will see more of that later in this unit.
Notes:
Introduction
This is the menu for using C-SPOC to perform LVM change management and
synchronization. As was mentioned in the LVM unit, you can make changes in AIX
directly and then synchronize, or you can make the changes using C-SPOC utilities,
where the synchronization is automatic.
How it works
When you create a shared volume group, you must rerun the discovery mechanism
(refer to top-level menu in the enhanced configuration path) to get HACMP to know
about the volume group. You must then add the volume group to a resource group
before you can use C-SPOC to add shared logical volumes or filesystems.
Synchronization
Note that you only need to add the volume group to a resource group using SMIT from
one of the cluster nodes, and then you can start working with C-SPOC from the same
node. You do not need to synchronize the cluster between adding the volume group to a
resource group and working with it using C-SPOC unless you want to use C-SPOC
from some other node. Remember that the volume group is not really a part of the
resource group until you synchronize the addition of the volume group to the resource
group.
[SMIT screen: Create a Shared Volume Group]
  Node Names                                      usa,uk
  PVID                                            00055207bbf6edab 0000>
  VOLUME GROUP name                               [xwebvg]
  Physical partition SIZE in megabytes            64
  Volume group MAJOR NUMBER                       [207]
  Enhanced Concurrent Mode                        true
  Enable Cross-Site LVM Mirroring Verification    false

  Warning: Changing the volume group major number may result in the command
  being unable to execute successfully on a node that does not have the
  major number currently available. Please check for a commonly available
  major number on all nodes before changing this setting.
Notes:
Creating a shared volume group
You can use C-SPOC to create a volume group but be aware that you must then add
the volume group name to a resource group and synchronize. This is one case of using
C-SPOC where synchronization is not automatic.
Before creating a shared volume group for the cluster using C-SPOC check that:
- All disk devices are properly attached to the cluster nodes
- All disk devices are properly configured on all cluster nodes and the device is listed
as available on all nodes
- Disks have a PVID
(C-SPOC lists the disks by their PVIDs. This ensures that we are using the same
disk on all nodes, even if the hdisk names are not consistent across the nodes).
This menu was reached through the Concurrent Logical Volume Management option
on the main C-SPOC menu.
Notes:
Discover and add VG to resource group
After creating a volume group, you must discover it so that the new volume group will be
available in pick lists for future actions, such as adding it to a resource group, and so
forth.
You must use the Extended Configuration menu for both of these actions.
[SMIT screen: Add a Shared Logical Volume — resource group xwebgroup, volume
group xwebvg, reference node usa, [200] logical partitions, logical volume
name [xweblv], type [jfs2], position middle, range minimum]
The volume group must be in a resource group that is online; otherwise, it does not display in the pop-up list.
Figure 7-42. Creating a shared file system (1 of 2)
Notes:
Creating a shared file system using C-SPOC
It is generally preferable to control the names of all of your logical volumes.
Consequently, it is generally best to explicitly create a logical volume for the file system.
If the volume group does not already have a JFS2 log (unless you plan to use inline
logs, in which case a jfs2log won't be needed), then you must also explicitly create a
logical volume for the JFS log and format it with logform. The same can be said if you
are creating a JFS filesystem.
The volume group to which you want to add the filesystem must be online. Your choice:
either varyonvg the volume group manually, or start cluster services.
However, C-SPOC enables you to add a journaled file system to either:
- A shared volume group (no previously defined cluster logical volume)
SMIT checks the list of nodes that can own the resource group that contains the
volume group, creates the logical volume (on an existing log logical volume if
present; otherwise, it creates a new log logical volume) and adds the file system to
the node where the volume group is varied on (whether it was varied on by the
C-SPOC utility or it was already online). All other nodes in the resource group run an
importvg -L for non-enhanced concurrent mode volume groups, or an imfs for
enhanced concurrent mode volume groups.
- A previously defined cluster logical volume (in a shared volume group)
SMIT checks the list of nodes that can own the resource group that contains the
volume group where the logical volume is located. It adds the file system to the node
where the volume group is varied on (whether it was varied on by the C-SPOC utility
or it was already online). All other nodes in the resource group run an importvg -L
for non-enhanced concurrent mode volume groups, or an imfs for enhanced
concurrent mode volume groups.
[SMIT screen: Add an Enhanced Journaled File System — fields:
  Node Names
  LOGICAL VOLUME name
* MOUNT POINT
  PERMISSIONS
  Mount OPTIONS
  Block Size (bytes)
  Inline Log?
  Inline Log size (MBytes)]
Notes:
Creating a shared file system, step 2
When you've created the logical volume, create a file system on it.
Notes:
The importance of LVM change management
LVM change management is critical for successful takeover in the event of a node
failure.
Information regarding LVM constructs is held in a number of different locations:
- Physical disks: VGDA, LVCB
- AIX files: primarily the ODM, but also /usr/sbin/cluster/etc/vg, files in the /dev
directory and /etc/filesystems
- Physical RAM: kernel memory space
This information must be kept in sync on all nodes that might access the shared volume
group or groups in order for takeover to work.
Notes:
After making a change to an LVM component, such as creating a new logical volume and
file system as shown in the figure, you must propagate the change to the other nodes in the
cluster that are sharing the volume group using the steps described. Make sure that the
auto activate is turned off (chvg -an sharedvg) after the importvg command is executed
because the cluster manager will control the use of the varyonvg command on the node
where the volume group should be varied on.
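The manual steps described above can be summarized as a dry-run script. Every command is printed rather than executed, and the volume group, file system, and disk names are placeholders, not taken from the course environment:

```shell
RUN=echo        # dry run: print each command instead of executing it
VG=sharedvg     # placeholder names for this sketch
FS=/xwebfs
DISK=hdisk5

# On the node where the LVM change was made: release the volume group.
$RUN umount $FS
$RUN varyoffvg $VG

# On each other node sharing the volume group: rebuild the ODM definition.
$RUN exportvg $VG
$RUN importvg -y $VG $DISK
$RUN chvg -an $VG      # turn auto-activate off; HACMP controls varyonvg
$RUN varyoffvg $VG
```

Changing `RUN=echo` to nothing would execute the sequence for real on the appropriate nodes; leaving it as a dry run makes the procedure reviewable before the maintenance window.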
Other than the sheer complexity of this procedure, the real problem with it is that it requires
that the resource group be down while the procedure is being carried out.
Fortunately, there are better ways...
[Figure: two clocks — the ODM timestamp versus the VGDA timestamp compared by Lazy Update]
Notes:
HACMP has a facility called Lazy Update that it uses to attempt to synchronize LVM
changes during a fallover.
HACMP uses a copy of the timestamp kept in the ODM and a timestamp from the volume
group's VGDA. AIX updates the ODM timestamp whenever the LVM component is
modified on that system. When a cluster node attempts to vary on the volume group,
HACMP for AIX compares the timestamp from the ODM with the timestamp in the VGDA
on the disk (use /usr/es/sbin/cluster/utilities/clvgdata hdiskn to find the VGDA timestamp for
a volume group). If the values are different, the HACMP for AIX software exports and
re-imports the volume group before activating it. If the timestamps are the same, HACMP
for AIX activates the volume group without exporting and re-importing. The time needed for
takeover expands by a few minutes if a Lazy Update occurs.
This method requires no downtime, although, as indicated, it does increase the fallover
time minimally for the first fallover after the LVM change was made.
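The decision Lazy Update makes can be sketched as a simple comparison. The two timestamp values here are illustrative stand-ins for what HACMP reads from the ODM copy and from the disk's VGDA:

```shell
odm_ts="4f3a2b1c"    # illustrative ODM copy of the timestamp
vgda_ts="4f3a9d77"   # illustrative timestamp read from the VGDA on disk

if [ "$odm_ts" = "$vgda_ts" ]; then
    echo "timestamps match: varyonvg the volume group directly"
else
    echo "timestamps differ: exportvg/importvg first, then varyonvg (slower takeover)"
fi
```

The extra export/import on a mismatch is what adds the few minutes to takeover time mentioned above.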
Realize though that this mechanism will not fix every situation where nodes are
out-of-sync. Further, having the takeover process fix problems with the LVM meta-data at
takeover time is not the preferred method of handling the synchronization.
To preserve permissions/ownership over an import, the volume group must be a Big or
Scalable VG, and the logical volumes must be modified using chlv with the -U (user ID),
-G (group ID), and -P (permissions) flags. The importvg must be done with -R.
[Figure: manually update VG constructs on one node, then use C-SPOC syncvg]
Notes:
Using C-SPOC to synchronize manual LVM changes
In this method, you manually make your change to the LVM on one node and then
invoke C-SPOC to propagate the change. Most likely the reason you are using this
C-SPOC task is because someone who is unfamiliar with cluster node management
made a change to a shared LVM component without using C-SPOC, creating an
out-of-sync condition between a node in the cluster and the rest of the nodes. This task
allows you to use C-SPOC to clean up after the fact.
Note: If using an enhanced concurrent mode volume group, and a filesystem has been
added to an existing logical volume without using C-SPOC, the imfs is not done,
making this function ineffective. For this reason (among many others), you are
strongly encouraged to use C-SPOC to perform the LVM add/remove/update rather
than using this mechanism to synchronize after the fact.
Benefits
- Fast Disk Takeover
- Can convert existing VGs to ECMVGs via C-SPOC
Limitations
- Incomplete: /etc/filesystems not updated
- Incompatible: must be careful using ECMVGs if any product that is running on the
  system places SCSI reserves on the disks as part of its function
Notes:
RSCT as LVM change management
With enhanced concurrent mode (ECM) volume groups, RSCT will automatically
update the ODM on all the nodes that share the volume group when an LVM change
occurs on one node.
However, because it is limited to only ECM volume groups, and because
/etc/filesystems is not updated, it's better to explicitly use C-SPOC to make LVM
changes.
Notes:
You can use C-SPOC to both make the change and to distribute the change.
This approach has two major advantages: no downtime is required and you can be
confident that the nodes are in sync. It might take a little longer to run than the normal chfs
command, but it is well worth the wait.
Other C-SPOC screens exist for pretty much any operation that you are likely to want to do
with a shared volume group.
[SMIT pick list: shared file systems]
  # Resource Group      File System
    xwebgroup           /xwebfs
Notes:
Changing a shared file system using C-SPOC
We have to provide the name of the file system that we want to change. The file system
must be in a volume group that is currently online somewhere in the cluster and is
already configured into a resource group.
[SMIT screen: Change/Show Characteristics of a Shared File System]
  Resource Group                             xwebgroup
  File system name                           /xwebfs
  NEW mount point                            [/xwebfs]
  SIZE of file system (in 512-byte blocks)   [4000000]
  Mount GROUP                                []
  PERMISSIONS                                read/write
  Mount OPTIONS                              []
  Block Size (bytes)                         4096
  Inline Log?                                no
  Inline Log size (MBytes)                   0
Notes:
Changing file system size
Specify a new file system size, in 512-byte blocks, and press Enter. The file system is
resized, and the relevant LVM information is updated on all cluster nodes configured to
use the file system's volume group.
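Because the size field is expressed in 512-byte blocks, a quick conversion is useful; `mb_to_blocks` is just an illustrative helper, not an HACMP utility:

```shell
# 1 MB = 1048576 bytes = 2048 blocks of 512 bytes
mb_to_blocks() {
    echo $(( $1 * 2048 ))
}

echo "2048 MB = $(mb_to_blocks 2048) blocks"   # 4194304 blocks = 2 GB
```

For reference, the screen's value of [4000000] blocks works out to roughly 1.9 GB.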
Notes:
HACMP resource group and application management
This visual shows the selections for managing resource groups.
Notes:
Priority override location (old) problem behavior
Problem behavior is in the following levels:
- Before HACMP V5.3 PTF IY84883 May 2006
- Before HACMP V5.2 PTF IY82989 April 2006
- Before HACMP V5.1 PTF IY84646 May 2006
HACMP 5.x introduced the notion of a priority override location. A priority override
location overrides all other fallover and fallback policies and possible locations for the
resource group.
A resource group does not normally have a priority override location (POL). The
destination node that you specify for a resource group move, online or offline request
(see next couple of visuals) becomes the priority override location for the resource
group. The resource group remains on that node in an online state (if you moved or
on-lined it there) or offline state (if you off-lined it there) until the priority override location
is cancelled.
Copyright IBM Corp. 1998, 2008
Notes:
Priority override location: Problems solved
New behavior is in the following levels and later:
- HACMP V5.3 PTF IY84883 May 2006
- HACMP V5.2 PTF IY82989 April 2006
- HACMP V5.1 PTF IY84646 May 2006
Prior to HACMP 5.4, but with the above-mentioned PTFs or later, the problem where the
resource group moved on RestoreNodePriority regardless of Fallback Policy settings
was fixed. Now the RestoreNodePriority only resets the POL setting, unless the
Fallback Policy is fallback to highest priority node. In that case, the behavior is the
same as the old way.
For HACMP 5.4 and later, the function is strictly internal and the Resource Group Move
operation is treated as temporary. If more permanent changes are desired, make the
changes in the Resource Group. The original highest priority node is flagged in SMIT
when subsequent resource group moves are initiated.
[SMIT pick list: destination node — *usa, uk, india
 (* flags the originally configured highest priority node)]
Notes:
Moving a resource group
Prior to the SMIT panel shown, the resource group must be chosen from a list of online
resource groups. You can request that a resource group be moved to any node in the
resource group's list of nodes (where cluster services are active).
The clRGmove utility program is used, which can also be invoked from the command
line. See the man page for details.
The destination node that you specify becomes the resource group's priority override
location.
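As a sketch of the command-line equivalent, the following builds (but deliberately does not run) typical clRGmove invocations. The installation path and the flags (-g group, -n node, -m move, -u online, -d offline) are assumptions to verify against the clRGmove man page on your own HACMP level.

```shell
# Hedged sketch only: build the clRGmove command lines rather than
# executing them, so they can be reviewed before touching a live cluster.
CLDIR=/usr/es/sbin/cluster/utilities   # assumed install location

move_cmd="$CLDIR/clRGmove -g xwebgroup -n uk -m"      # move the RG to node uk
online_cmd="$CLDIR/clRGmove -g xwebgroup -n uk -u"    # bring the RG online on uk
offline_cmd="$CLDIR/clRGmove -g xwebgroup -n uk -d"   # bring the RG offline on uk

printf '%s\n' "$move_cmd" "$online_cmd" "$offline_cmd"
```

Reviewing the assembled command line before running it is a reasonable habit, since each of these operations sets the priority override location described above.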
For HACMP 5.3 and earlier, if Persist Across Cluster Reboot is set to true, then the
priority override location will be persistent. Otherwise, it will be non-persistent.
Notes:
This screen follows; press Enter to move the xwebgroup resource group to the uk node.
(Status display: resource group xwebgroup is ONLINE on node uk)
Notes:
Bring a resource group offline: Select a resource group
To start, you must select the resource group you wish to take offline. Then you'll select
an online node where you want the resource group brought offline. This is pretty
obvious for a resource group that will only be active on one node at a time (OHNO or
OFAN). For resource groups that can be online on more than one node at once (Online
on All Available), you can choose All or just one of the active nodes.
Notes:
Now choose the node where the resource group will be taken offline.
(SMIT dialog: Bring a Resource Group Offline; Resource Group: xwebgroup, Node: uk)
The option to persist across cluster reboot is available prior to HACMP V5.4.
Notes:
Bring a resource group offline
When a resource group is brought offline on a node, all resources will be deactivated on
that node.
(SMIT picklist of potential nodes: usa, uk)
Notes:
Bring a resource group online
First you'll choose an offline resource group. Then the picklist above will display the
potential nodes on which to bring it online.
Bringing a resource group online will activate the resources in it on the target node.
Again, watch for the cluster to go stable and verify that the resources are available on the
intended target node.
/var/hacmp/log/cspoc.log*
/var/hacmp/clverify/clverify.log
/var/hacmp/log/emuhacmp.out*
/var/hacmp/log/hacmp.out*
/var/hacmp/log/hacmp.out.<1-7>*
AIX error log
/var/ha/log/topsvcs
/var/ha/log/grpsvcs
/var/hacmp/log/clstrmgr.debug*
/var/hacmp/clcomd/clcomd.log
/var/hacmp/clcomd/clcomddiag.log
/var/hacmp/log/clavan.log
/var/hacmp/log/clconfigassist.log
/var/hacmp/log/clutils.log
/var/hacmp/log/cl_testtool.log
Notes:
Log files
The visual summarizes the HACMP log files.
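Most day-to-day troubleshooting starts in hacmp.out. As a hedged sketch, you can pair up event starts and completions with grep; the EVENT START / EVENT COMPLETED markers reflect typical hacmp.out formatting, and the sample lines below are fabricated for illustration.

```shell
# Work on a scratch copy so this sketch never touches a real log.
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 10 12:00:01 EVENT START: node_up usa
Mar 10 12:00:05 EVENT COMPLETED: node_up usa 0
Mar 10 12:00:06 EVENT START: node_up_complete usa
Mar 10 12:00:07 EVENT COMPLETED: node_up_complete usa 0
EOF

# A mismatch between these counts suggests an event script that failed
# or is still running.
starts=$(grep -c 'EVENT START:' "$log")
ends=$(grep -c 'EVENT COMPLETED:' "$log")
echo "started=$starts completed=$ends"
rm -f "$log"
```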
More in Unit 10.
Figure 7-62. Log files generated by HACMP - HACMP 5.4.1 and later
Notes:
When installed from scratch, HACMP 5.4.1 will use /var/hacmp/log as the default for all log
files.
You can view the current settings through SMIT using the HACMP Log Viewing and
Management path.
Of course, if you install on top of an existing configuration, or apply a snapshot, your
settings will be preserved; however, if you want to redirect all log files there is a new SMIT
path that enables you to redirect them all at once.
HACMP uses Korn shell scripts to perform recovery operations. An effort was made in
HACMP 5.4.1 to clean up these scripts and consolidate the use of features like VERBOSE
LOGGING, set -x, and the PS4 setting. This produces more consistent results in
hacmp.out and makes it easier to read and follow.
Similarly for key components such as clcomd and clver, the logging was made more
consistent.
The clsnap command was also updated to collect everything needed at the same time
rather than multiple commands and multiple options.
1. True or False? Using C-SPOC reduces the likelihood of an outage by reducing the
likelihood that you will make a mistake.
4. True or False? It does not matter which node in the cluster is used to initiate a C-SPOC
operation.
Notes:
Dynamic Automatic Reconfiguration Event
In this topic, we examine HACMP's capability to perform changes to the cluster
configuration while the cluster is running. This capability is known as Dynamic
Automatic Reconfiguration Event, or DARE for short.
Dynamic reconfiguration
HACMP provides a facility that allows changes to cluster
topology and resources to be made while the cluster is active.
This facility is known as DARE.
DARE requires three copies of the HACMP ODM: the DCD, the SCD, and the ACD (held
in rootvg).
Notes:
How it works
Dynamic Reconfiguration is made possible by the fact that HACMP holds three copies
of the ODM, known as the Default, Staging, and Active configuration directory. By
holding three copies of the ODM, HACMP can make changes on one node and
propagate them to other nodes in the cluster while an active configuration is currently
being used.
Topology Changes
Adding or removing cluster nodes
Adding or removing networks
Adding or removing communication interfaces or
devices
Swapping a communication interface's IP address
Resource Changes
All resources can be changed
Notes:
What can DARE do?
The visual shows some of the changes that can be made dynamically using DARE.
DARE cannot run if two nodes are not at the same HACMP level
Notes:
Limitations
Some changes require a restart of Cluster Services.
Also, DARE requires that all nodes are at the same HACMP level.
(Diagram: the five DARE steps: 1. change topology or resources in SMIT; 2. synchronize
topology or resources in SMIT; 3. a snapshot is taken of the ACD; 4. the cluster manager
reads the SCD and refreshes the current ACD; 5. the SCD is deleted)
Notes:
How it works
DARE uses three copies of the HACMP ODM to propagate live updates to the cluster
topology or resource configuration across the cluster. This is done in five steps detailed
above. Although it is possible to make a nearly arbitrarily large set of changes to the
configuration and then synchronize them all in one operation, it is usually better to make
a modest change, synchronize it, verify that it works, and then move on to more
changes.
Note that many changes are incompatible with the cluster's current AIX configuration.
Such changes are, therefore, not possible to synchronize using DARE. Instead, the
cluster has to be taken down while the appropriate AIX configuration changes are
applied. (It is sometimes possible to remove some resources from a resource group,
synchronize, change the AIX configuration of the resources, add them back into the
resource group, and synchronize again; although there is likely to be little point in
running the resource group without the resources.)
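The copy sequence can be sketched with ordinary directories standing in for the three ODM copies. The real locations (commonly /etc/es/objrepos for the DCD, with active and stage subdirectories under /usr/es/sbin/cluster/etc/objrepos for the ACD and SCD) are assumptions to verify on your system; this sketch only imitates the flow.

```shell
# Scratch directories stand in for the three ODM copies.
tmp=$(mktemp -d)
DCD=$tmp/dcd; SCD=$tmp/scd; ACD=$tmp/acd
mkdir -p "$DCD" "$ACD"
echo "new-config" > "$DCD/HACMPcluster"   # edited, not yet active
echo "old-config" > "$ACD/HACMPcluster"   # what the cluster is running

cp -r "$DCD" "$SCD"                          # sync: DCD copied to an SCD on each node
cp "$SCD/HACMPcluster" "$ACD/HACMPcluster"   # cluster manager refreshes the ACD
rm -r "$SCD"                                 # SCD deleted; a leftover SCD blocks DARE

result=$(cat "$ACD/HACMPcluster")
echo "$result"
rm -rf "$tmp"
```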
HACMP 5.x synchronizes both topology changes and resource changes whenever it is
run. This is a change from previous releases of HACMP.
Notes:
Verifying and synchronizing (standard)
This visual highlights the Verify and Synchronize HACMP Configuration menu entry
in the top-level Standard Configuration path's SMIT menu.
Invoking this menu entry initiates an immediate verification and synchronization of the
HACMP configuration from the local node's DCD (there is no opportunity provided to
modify the process in any way).
(SMIT dialog: HACMP Verification and Synchronization entry fields, showing values
[Both], [No], [No], [No], [Standard], and [Actual])
Notes:
Verifying and synchronizing (extended)
When the Extended Verification and Synchronization option in the extended
configuration path's top-level menu is selected, the SMIT screen above displays. It
allows the cluster administrator to modify the default verification and synchronization
procedure somewhat.
Emulate or actual
The default of Actual causes the changes being verified and synchronized to take
effect (become the actual cluster configuration) if the verification succeeds. Setting this
field to Emulate causes HACMP to verify and then go through the motions of a
synchronize without actually causing the changes to take effect. This is useful to get a
sense of what side effects the synchronization is likely to result in. For example, if the
proposed change would trigger a fallover or a fallback (because node priorities have
7-95
Student Notebook
Logging
This field can be set to Standard to request the default level of logging or to Verbose to
request a more detailed level of logging. If you are having problems getting a change to
verify and do not understand why, setting the logging level to Verbose might provide
additional information.
Notes:
Rolling back an unwanted change that has not yet been synchronized
If you have made changes that you have decided to not synchronize, they can be
discarded using the Restore HACMP Configuration Database from Active
Configuration menu entry shown above. It is located under the Problem
Determination Tools menu (accessible from the top-level HACMP SMIT menu).
Prior to rolling back the DCD on all nodes, the current contents of the DCD on the node
used to initiate the rollback are saved as a snapshot (in case they should prove useful in
the future). The snapshot will have a rather long name similar to:
Restored_From_ACD.Sep.18.19.33.58
This name can be interpreted to indicate that the snapshot was taken at 19:33:58 on
September 18th (the year is not preserved in the name).
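The timestamp can be picked apart mechanically; this sketch simply splits the dot-separated fields of the generated name:

```shell
name="Restored_From_ACD.Sep.18.19.33.58"

# Split on dots: prefix, month, day, hour, minute, second.
IFS=. read -r prefix mon day hh mm ss <<EOF
$name
EOF
echo "snapshot taken $mon $day at $hh:$mm:$ss"
```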
Because the change being discarded is sometimes a change that has been emulated,
this operation is sometimes called rolling back an emulated change. This is a misnomer,
as the operation rolls back any change that has not yet been verified and synchronized
by restoring all nodes' DCDs to the contents of the currently active cluster configuration.
Notes:
Rolling back an unwanted change that has been synchronized
If you find that a DARE change does not give the desired result, then you might want to
roll it back. DARE cuts a snapshot of the active configuration immediately prior to
committing the new HACMP configuration. This snapshot is named active.x.odm (where
x is 0...9, 0 being the most recent). It can be used to restore the cluster to an earlier state.
(Diagram: the same five DARE steps, but with a node failure (Bang!) during
synchronization, so that the SCD is not deleted on all nodes)
Notes:
What if DARE fails?
If a node failure should occur while a synchronization is taking place, then the Staging
Configuration Directory (SCD) might not be cleared on all nodes. The presence of the SCD
prevents further configuration changes from being performed. If the SCD is not cleared
at the end of a synchronize, then this indicates that the DARE operation did not
complete or was not successful; hence, the SCD acts as a lock against further
changes being made.
Note that the SCD copies are made before the change is copied by each node's cluster
manager into each node's ACD. If there is an SCD when Cluster Services starts up on a
node, it copies it to the ACD, deletes the SCD, and uses the new ACD as its
configuration. Because a node failure at any point after any of the SCDs exists could
result in only some of the nodes having the updated SCD, the SCDs must be removed
before a restart of Cluster Services on any node (or you risk different cluster nodes
running with different configurations, a situation that results in one or more cluster
nodes crashing).
Notes:
Clearing dynamic reconfiguration locks
The SMIT menu option Release Locks Set By Dynamic Reconfiguration clears out the
SCD and allows further synchronizations to be made to the cluster configuration. If an
SCD exists on any cluster node, then no further synchronizations are permitted until it is
deleted using the above SMIT menu option.
3. True or False?
It is possible to roll back from a successful DARE operation using an
automatically generated snapshot.
4. True or False?
Running a DARE operation requires three separate copies of the
HACMP ODM.
5. True or False?
Cluster snapshots can be applied while the cluster is running.
6. What is the purpose of the dynamic reconfiguration lock?
a. To prevent unauthorized access to DARE functions
b. To prevent further changes being made until a DARE operation has
completed
c. To keep a copy of the previous configuration for easy rollback
Notes:
7.4 WebSMIT
Implementing WebSMIT
After completing this topic, you should be able to:
Configure and use WebSMIT
Web-enabled SMIT
HACMP 5.2 and up includes a web-enabled user interface that
provides easy access to:
HACMP configuration and management functions
Interactive cluster status display and manipulation
HACMP online documentation
Notes:
Introduction
WebSMIT combines the advantages of SMIT with the ease of access from any system
that runs a browser.
For those looking for a graphical interface for managing and monitoring HACMP,
WebSMIT provides those capabilities via a Web browser. It provides real-time graphical
status of the cluster components, similar to the clstat.cgi. It also provides context menu
access to those components to control by launching a WebSMIT menu containing the
action or actions to take. There are multiple views, Node-by-node, Resource Group,
Associations, component Details, and so on.
Configuration
This utility uses SNMP, so it is imperative that you have your SNMP interface to the
cluster manager functioning. To test that, attempt a cldump command on the system
where you will be running the WebSMIT utility. A configuration utility is provided
(websmit_config) that requires only that a supported HTTP server be installed to configure
the system for use as a WebSMIT server. A robust control tool, websmitctl, is provided
as well to control the HTTP server. Check it out in lab.
Features
- Off-line/Unavailable status is displayed as grayed out.
- Most WebSMIT items can be assigned a custom color set.
- Auto-configuration improvements.
- Language support is more sophisticated.
- An instant help system.
- Resource-type awareness in the display.
Notes:
Introduction
To connect to WebSMIT, point your browser to the cluster node that you have
configured for WebSMIT.
WebSMIT uses port 42267 by default.
After authentication, this will be the first screen that you see. Note the Navigation Frame
(left side) and the Activity Frame (right side). Also, note that we're looking at
configuration options only. Each pane is tabbed to provide access to different status,
functions, or controls.
Navigation Frame tabs:
- SMIT - access to HACMP SMIT
- N&N - a Node-by-node relationship and status view of the cluster (if SNMP can get
cluster information)
- RGs - a Resource Group relationship and status view of the cluster status
Use the Expand All or Collapse All links to get the full view or clean up the view.
Activity Frame tabs:
- Configuration - permanent access to HACMP SMIT from Activity Frame
- Details - comes to top when a component is selected in Navigation Frame, and
displays configuration information about the component
- Associations - shows component relationship to other HACMP components for
component that is selected in the Navigation Frame
- Doc - If the HACMP pubs were installed (html or pdf version), this tab will display
links to access them
Don't attempt to navigate using the browser's Back or Forward buttons. Note the
FastPath box at the bottom of the Configuration tab. This allows you to go directly to
any (that is, any) SMIT panel if you know the fastpath. What's the fastpath to the SMIT
top menu?
Notes:
Using the context menus
Right-click the object in the Navigation Frame. Choose the item you want to control from
the context menu and watch the Activity Frame change to the task you're trying to
perform. Remember this is still SMIT, so you'll get HACMP SMIT menus as a result of
the context menu selections.
Status
Notice that the icons (on the screen anyway) indicate online (not grayed out) or offline
(grayed out). This is real-time status. More to come on the next visual, regarding the
associations.
WebSMIT associations
Notes:
Associations
If you don't click fast enough (or just pause long enough) between selecting the
Resource Group and clicking the Associations tab, you'll see the Details tab come to
the top of the Activity Frame with the configuration details of the Resource Group.
Notes:
Online documentation
This screen enables you to view the HACMP manuals in either HTML or PDF format.
You must install the HACMP documentation file sets.
WebSMIT configuration
Base Directory is /usr/es/sbin/cluster/wsm
Consult the documentation
Readme located at /usr/es/sbin/cluster/wsm/README
Manuals installed from cluster.doc.en_US.es.html and cluster.doc.en_US.es.pdf
Notes:
Documentation
The primary source for information on configuring WebSMIT is the WebSMIT README
file as shown in the visual. The HACMP Planning and Installation Guide provides some
additional information on installation and the HACMP Administration Guide provides
information on using WebSMIT.
Web server
To use WebSMIT, you must configure one (or more) of your cluster nodes as a Web
server. You must use either IBM HTTP Server (IBMIHS) V6.0 (or later) or Apache 1.3
(or later). Refer to the specific documentation for the Web server you choose.
This configuration is done using the websmit_config utility, located in
/usr/es/sbin/cluster/wsm. See the README file for details.
WebSMIT security
Because WebSMIT gives you root access to all the nodes in your cluster, you must
carefully consider the security implications.
WebSMIT uses a configuration file, wsm_smit.conf, that contains settings for
WebSMIT's security related features. This file is installed as
/usr/es/sbin/cluster/wsm/wsm_smit.conf, and it may not be moved to another location.
The default settings used provide the highest level of security in the default AIX/Apache
environment. However, you should carefully consider the security characteristics of your
system before putting WebSMIT to use. You might be able to use different combinations
of security settings for AIX, Apache, and WebSMIT to improve the security of the
application in your environment.
WebSMIT uses the following configurable mechanisms to implement a secure
environment:
- Non-standard port
- Secure HTTP (HTTPS)
- User authentication
- Session time-out
- wsm_cmd_exec setuid program
When users connect to WebSMIT, they will be required to provide AIX authentication
information before gaining access.
(Refer to the documentation included with Apache for more details about Apache's
built-in authentication.)
The default value for REQUIRE_AUTHENTICATION is 1. If REQUIRE_AUTHENTICATION is
set, then the HACMP administrator must specify one or more users who are allowed to
access the system. This can be done using the wsm_smit.conf ACCEPTED_USERS
setting. Only users whose names are specified will be allowed access to WebSMIT, and
all ACCEPTED_USERS will be provided with root access to the system. By default, only the
root user is allowed access via the ACCEPTED_USERS setting.
Warning
Because AIX authentication mechanisms are in use, login failures can cause an account to
be locked. It is recommended that a separate user be created for the sole purpose of
accessing WebSMIT. If the root user has a login failure limit, failed WebSMIT login attempts
could quickly lock the root account.
Session time-out
Continued access to WebSMIT is controlled through the use of a non-persistent session
cookie. Cookies must be enabled in the client browser in order to use AIX
authentication for access control. If the session is used continuously, then the cookie
will not expire. However, the cookie is designed to time out after an extended period of
inactivity. WebSMIT allows the user to adjust the time-out period using the
wsm_smit.conf SESSION_TIMEOUT setting. This configuration setting must have a value
expressed in minutes. The default value for SESSION_TIMEOUT is 20 (minutes).
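Pulling the settings discussed so far together, a wsm_smit.conf might contain entries along these lines. This is a hypothetical excerpt, not copied from a shipped file; the exact value syntax, and the wsmadmin user, are illustrative assumptions.

```
REQUIRE_AUTHENTICATION=1        # demand AIX authentication before access
ACCEPTED_USERS="root wsmadmin"  # wsmadmin: a dedicated WebSMIT-only user
SESSION_TIMEOUT=20              # minutes of inactivity before the cookie expires
```

Using a dedicated user such as wsmadmin follows the warning above about failed WebSMIT logins locking the root account.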
Controlling access to wsm_cmd_exec (setuid)
A setuid program is supplied with WebSMIT that allows non-root users to execute
commands with root permissions (wsm_cmd_exec). The setuid bit for this program must
be turned on in order for the WebSMIT system to function.
For security reasons, wsm_cmd_exec must not have read permission for non-root users.
Do not allow a non-root user to copy the executable to another location or to
decompile the program.
Thus the utility wsm_cmd_exec (located in /usr/es/sbin/cluster/wsm/cgi-bin/) must be
set with 4511 permissions.
See the README for details.
Care must be taken to limit access to this executable. WebSMIT allows the user to
dictate the list of users who are allowed to use the wsm_cmd_exec program using the
wsm_smit.conf REQUIRED_WEBSERVER_UID setting. The real user ID of the process
must match the UID of one of the users listed in wsm_smit.conf for the program to
carry out any of its functionality. The default value for REQUIRED_WEBSERVER_UID is
nobody.
By default, a Web server CGI process runs as user nobody, and by default, non-root
users cannot execute programs as user nobody. If your HTTP server configuration
executes CGI programs as a different user, it is important to ensure that the
REQUIRED_WEBSERVER_UID value matches the configuration of your Web server. It is
strongly recommended that the HTTP server be configured to run CGI programs as a
user who is not authorized to open a login shell (as with user nobody).
Log files
All operations of the WebSMIT interface are logged to the wsm_smit.log file and are
equivalent to the logging done with smitty -v. Script commands are also captured in
the wsm_smit.script log file.
WebSMIT log files are created by the CGI scripts using a relative path of <../logs>. If
you copy the CGI scripts to the default location for the IBM HTTP Server, the final path
to the logs is /usr/HTTPServer/logs.
The WebSMIT logs are not subject to manipulation by the HACMP Log Viewing and
Management SMIT panel. Also, just like smit.log and smit.script, the files grow
indefinitely.
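Since these logs grow indefinitely, a periodic copy-and-truncate (run from cron, for example) keeps them in check. The sketch below demonstrates the idea on a scratch file; the rotation scheme is an illustration, not an HACMP-supplied tool.

```shell
tmp=$(mktemp -d)
log=$tmp/wsm_smit.log
printf 'line1\nline2\n' > "$log"

rotate() {
    # Keep one previous copy, then empty the live log in place.
    cp "$1" "$1.0" && : > "$1"
}
rotate "$log"

kept=$(wc -l < "$log.0")    # lines preserved in the rotated copy
live=$(wc -c < "$log")      # live log is now empty
echo "kept=$kept live=$live"
rm -rf "$tmp"
```

Truncating in place (rather than removing the file) matters if a long-running process still holds the log open.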
The snap -e utility captures the WebSMIT log files if you leave them in the default
location (/usr/es/sbin/cluster/wsm/logs); but if you install WebSMIT somewhere else,
snap -e will not find them.
- wsm_smit.deny
Entering a SMIT panel ID in this file will cause WebSMIT to deny access to that
panel. If the same SMIT panel ID is stored in both the .allow and .deny files, .deny
processing takes precedence.
- wsm_smit.redirect
Instead of simply rejecting access to a specific page, you can redirect the user to a
different page. The default .redirect file has entries to redirect the user from specific
HACMP SMIT panels that are not supported by WebSMIT.
Checkpoint
1. True or False? A star configuration is a good choice for your non-IP networks.
2. True or False? Using DARE, you can change from IPAT via aliasing to IPAT via
replacement without stopping the cluster.
3. True or False? RSCT will automatically update /etc/filesystems when using enhanced
concurrent mode volume groups.
4. True or False? With HACMP V5.4, a resource group's priority override location can be
cancelled by selecting a destination node of Restore_Node_Priority_Order.
5. You want to create an Enhanced Concurrent Mode Volume Group that will be used in
a Resource Group that will have an Online on Home Node Startup policy. Which
C-SPOC menu should you use?
   a. HACMP Logical Volume Management
   b. HACMP Concurrent Logical Volume Management
6. You want to add a logical volume to the volume group you created in the question
above. Which C-SPOC menu should you use?
   a. HACMP Logical Volume Management
   b. HACMP Concurrent Logical Volume Management
Notes:
Unit summary
Key points from this unit:
Implementing procedures for change management is a critical part of
administering an HACMP cluster
C-SPOC provides facilities for performing common cluster-wide
administration tasks from any node within the cluster:
The SMIT Standard and Extended menus are used to make topology
and resource group changes
The Dynamic Automatic Reconfiguration Event facility (DARE)
provides the mechanism to make changes to cluster topology and
resources without stopping the cluster
The Cluster Snapshot facility allows the user to save and restore a
cluster configuration
WebSMIT provides access to HACMP SMIT menus from any system
with a Web browser
Unit 8. Events
What this unit is about
This unit describes the event process in HACMP.
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1:
Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
http://www-03.ibm.com/systems/p/library/hacmp_docs.html
HACMP manuals
Unit objectives
After completing this unit, you should be able to:
Describe what an HACMP event is
Describe the sequence of events when:
The first node starts in a cluster
A new node joins an existing cluster
A node leaves a cluster voluntarily
node_up
node_down
fail_interface
join_interface
rg_move
reconfig_topology_start
Notes:
What the term HACMP event means
The term HACMP event has two contexts:
- An incident that is of interest to the cluster, such as the failure of a node or the
recovery of a NIC
- A script that is used by HACMP to actually deal with one of these incidents
Unfortunately, it is not all that uncommon for the word script to be left off in a discussion
of event scripts. Fortunately, which meaning is appropriate is almost certainly obvious
from the context of the discussion.
(Diagram: Topology Services/ES and Group Services/ES feed the Cluster Manager, which
uses the HACMP Rules ODM to select recovery programs; the recovery programs call
event scripts, which may have recovery commands associated with them)
Notes:
How an event script is triggered
Most HACMP events result from the detection and diagnostic capabilities of RSCT's
Topology Services component. They arrive at the Cluster Manager, which then uses
recovery programs to determine which event scripts to call to actually deal with the
event. The coordination and sequencing of the recovery programs is actually handled
by the Cluster Manager working with RSCT Group Services. The rules for how these
recovery programs should be coordinated and sequenced are described in the HACMP
Rules ODM file.
The RMC subsystem is used for implementing User-defined Events, Application
Monitoring, Dynamic Node Priority, and DLPAR. Dynamic Node Priority is one of the
fallover policies and DLPAR refers to the Dynamic LPAR capability of HACMP.
Recovery programs
cluster_notify.rp
external_resource_state_change.rp
external_resource_state_change_complete.rp
fail_interface.rp
fail_standby.rp
join_interface.rp
join_standby.rp
migrate.rp
network_down.rp
network_up.rp
node_down.rp
node_down_dependency.rp
node_down_dependency_complete.rp
node_up.rp
node_up_dependency.rp
node_up_dependency_complete.rp
reconfig_configuration.rp
reconfig_configuration_dependency_acquire.rp
reconfig_configuration_dependency_complete.rp
reconfig_configuration_dependency_release.rp
reconfig_resource.rp
reconfig_topology.rp
resource_state_change.rp
resource_state_change_complete.rp
rg_move.rp
rg_offline.rp
rg_online.rp
server_down.rp
server_restart.rp
site_down.rp
site_isolation.rp
site_merge.rp
site_up.rp
swap_adapter.rp
Notes:
Recovery programs
This visual lists the recovery programs that are used by the resource manager
component of the Cluster Manager Services to determine what event scripts to invoke.
These form the first step in processing an event.
Notes:
Format of a recovery program
The first type of line specifies where the event script should run and what the name of
the script is.
The second type of line is the word barrier. This is a wait point, handled by Group
Services, so that other nodes can complete their processing before the next step of this
recovery program.
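A recovery program therefore reads as alternating command and barrier lines. The fragment below is a hypothetical illustration of the two line types, not a copy of a shipped .rp file; check an actual recovery program on your system for the exact syntax.

```
other "node_up" 0 NULL        # run the node_up script on the other nodes
barrier                       # wait for every node to reach this point
event "node_up" 0 NULL        # run the node_up script on the joining node
barrier
all "node_up_complete" X NULL # run the completion script everywhere
```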
Event scripts
(called by cluster manager)
Primary events, called directly by the cluster manager or process_resources:
site_up, site_up_complete, site_down, site_down_complete
site_merge, site_merge_complete
node_up, node_up_complete, node_down, node_down_complete
network_up, network_up_complete, network_down, network_down_complete
swap_adapter, swap_adapter_complete
swap_address, swap_address_complete
fail_standby, join_standby
fail_interface, join_interface
rg_move, rg_move_complete
rg_online, rg_offline
event_error
config_too_long
reconfig_topology_start, reconfig_topology_complete
reconfig_resource_release, reconfig_resource_acquire, reconfig_resource_complete
reconfig_configuration_dependency_acquire
reconfig_configuration_dependency_complete
reconfig_configuration_dependency_release
node_up_dependency, node_up_dependency_complete
node_down_dependency, node_down_dependency_complete
migrate, migrate_complete
external_resource_state_change
server_down, server_restart

Secondary events, invoked by primary events as needed:
node_up_local, node_up_remote
node_down_local, node_down_remote
node_up_local_complete, node_up_remote_complete
node_down_local_complete, node_down_remote_complete
acquire_aconn_service
acquire_service_addr
acquire_takeover_addr
start_server, stop_server
get_disk_vg_fs
get_aconn_rs
release_service_addr, release_takeover_addr
release_vg_fs, release_aconn_rs
swap_aconn_protocols
releasing, acquiring
rg_up, rg_down, rg_error
rg_temp_error_state
rg_acquiring_secondary
rg_up_secondary
rg_error_secondary
resume_appmon
suspend_appmon
Figure 8-7. Event scripts
Notes:
Event scripts
This is the list of event scripts that are managed by HACMP.
The events on the left are called directly by the cluster manager or by
process_resources in response to cluster happenings. The events on the right are
invoked by primary or other secondary events on an as-needed basis.
Each of these events can have an optional notify command, one or more pre-event
scripts, one or more post-event scripts and an optional recovery command associated
with it.
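As an illustration of how small such a customization can be, here is a hedged sketch of a post-event script. The log path is an invented example, and the body is written as a function only so that it can be exercised outside a cluster; in a real cluster it would be an executable file with the same path on every node.

```shell
# Hypothetical post-event script body; in a real cluster this would be
# an executable file (same path on every node), not a function.
LOG=/tmp/post_event.log          # assumed log location for this sketch

post_event() {
    # HACMP passes event information as arguments; just record them.
    echo "$(date '+%Y-%m-%d %H:%M:%S') post-event args: $*" >> "$LOG"
    # Return 0: a non-zero exit from a pre/post-event script is
    # treated as an event failure.
    return 0
}

post_event node_down node1       # simulate HACMP calling the script
```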
process_resources
(Flow: the Cluster Manager runs process_resources, which calls clrgpa to ask the
RGPA for the next task; based on cluster status, it runs cl_RMupdate to report the
status change to the Resource Manager, repeating for each task until none remain,
then exits.)
Notes:
Script process_resources
The script process_resources handles the calls from event scripts to the Resource
Group Policy Administrator (RGPA):
- It loops through each returned task (JOB_TYPE):
  - Calls cl_RMupdate as required to update the Cluster Manager with the status
    change
  - Processes the next JOB_TYPE that the RGPA passes (via clrgpa) until all tasks
    in the list are completed
- There is one JOB_TYPE for each resource type. Some can be run once per event,
  which is useful for parallel processing of resources.
This is meant to show you that the process_resources script is responsible for
interacting with the event scripts. You will see the JOB_TYPE in the /tmp/hacmp.out log
file.
Start Cluster services
1) node_up
clstrmgrES (Event Manager) calls the event scripts (RC = return code)
process_resources (NONE)
for each RG:
process_resources (ACQUIRE)
process_resources (SERVICE_LABELS)
acquire_service_addr
acquire_aconn_service en0 net_ether_01
process_resources (DISKS)
process_resources (VGS)
process_resources (LOGREDO)
process_resources (FILESYSTEMS)
process_resources (SYNC_VGS)
process_resources (TELINIT)
process_resources (NONE)
< Event Summary >
2) node_up_complete
for each RG:
process resources (APPLICATIONS)
start_server app01
process_resources (ONLINE)
process_resources (NONE)
< Event Summary >
Notes:
Startup processing
Implicit in this example is the assumption that there is actually a resource group to start
on the node. If there are no resource groups to start on the node, then node_up_local
and node_up_local_complete do very little processing at all.
Node 1: clstrmgrES already running; Node 2: Start Cluster services
Each node's clstrmgrES (Event Manager) exchanges messages and calls the event
scripts (RC = return code)
process_resources (NONE) or process_resources (release)
1) node_up
3) node_up_complete
for each RG:
process_resources (SYNC_VGS)
process_resources (NONE)
< Event Summary >
2) node_up
Same sequence as
node 1 up (previous visual)
4) node_up_complete
for each RG:
process resources (APPLICATIONS)
start_server app02
process_resources (ONLINE)
process_resources (NONE)
< Event Summary >
Notes:
Another node joins the cluster
When another node starts up, it must first join the cluster. After that, the determination is
made whether to move an already active resource group to the new node (this is the
assumption in this visual). If that is the case, node_up processing on the old node (steps
1 and 3) must release the resource group before node_up processing on the new node
(steps 2 and 4) can acquire and activate it.
Takeover node: clstrmgrES running; departing node: Stop Cluster services
1) node_down takeover (departing node)
Each node's clstrmgrES (Event Manager) exchanges messages and calls the event
scripts (RC = return code)
3) node_down takeover (takeover node) -- same sequence as node up
4) node_down_complete
for each RG:
process_resources (APPLICATIONS)
start_server app02
process_resources (ONLINE)
< Event Summary >
2) node_down_complete
process_resources (OFFLINE)
process_resources (SYNC_VGS)
< Event Summary >
Notes:
Normal node down processing with takeover
Implicit in this example is the assumption that there is actually a resource group on the
departing node which must be moved to one of the remaining nodes.
Node failure
The situation is only slightly different if the node on the right had failed suddenly.
Because a failed node is not in a position to run any events, the calls to
process_resources listed under the right-hand node do not get run.
Let's review
1. Which of the following are examples of primary HACMP events (select all that
   apply)?
   a. node_up
   b. node_up_local
   c. node_up_complete
   d. start_server
   e. rg_up
2. When a node joins an existing cluster, what is the correct sequence for these
   events?
   a.
   b.
   c.
   d.
Notes:
In this topic, we examine how to customize events in HACMP.
Event Manager
(Flow: the Event Manager uses clcallev to run the HACMP event script, consulting the
HACMP ODM classes. If RC=0, the event succeeded; otherwise, while the recovery
counter is > 0, the recovery command is run and the event retried. If it still fails, an
event error results. A notify command, if defined, is run for the event.)
Notes:
Event processing without customization
When a decision is made to run a particular HACMP event script on a particular node,
the above event processing logic takes control. If no event-related customization has
been done on the cluster, then the HACMP event script itself is run, and whether it
worked is noted. (If it worked, then everyone is happy; if not, you had better look at the
Problem Determination unit, which is coming up later in the week.)
Events are logged in the /var/hacmp/adm/cluster.log file and the
/<log_dir>/hacmp.out file.
Notes:
Path to smit menu
smitty hacmp -> Extended Configuration -> Extended Event Configuration
                                                        [Entry Fields]
  [stop_printq]
  [stop the print queues]
  [/usr/local/cluster/events/stop_printq]
Notes:
Path to smit menu
smitty hacmp -> Extended Configuration -> Extended Event Configuration ->
Configure Pre/Post-Event Commands -> Add a Custom Cluster Event
Script considerations
HACMP does not develop the script content for you, nor does it synchronize the
script content between cluster nodes (indeed, the content can be different on each
node). The only requirements that HACMP imposes are that the script exist on
each node in a local (non-shared) location, be executable, and have the same path and
name on every node.
Of course, an additional requirement is that the script perform as required under all
circumstances!
In HACMP 5.2 and later there is a file collections feature if you wish to have your
changes kept in sync.
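Since synchronization will not copy the scripts for you, they have to be pushed to every node by hand (or kept in a file collection). A hedged sketch follows; the node names (bondar, hudson) and the script path are examples reused from elsewhere in this unit, and the echo prefixes make it a dry run that merely prints the commands.

```shell
# Dry-run sketch of distributing a custom event script to all nodes.
# Remove the echo prefixes to actually copy; rcp/rsh could equally be
# scp/ssh depending on your environment.
SCRIPT=/usr/local/cluster/events/stop_printq

push_script() {
    for node in bondar hudson; do              # assumed node names
        echo rcp "$SCRIPT" "${node}:${SCRIPT}"
        echo rsh "$node" chmod 755 "$SCRIPT"   # same path and mode everywhere
    done
}

push_script
```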
node_down

                                                        [Entry Fields]
  Description
* Event Command                                 [/usr/es/sbin/cluster/>]
  Notify Command                                []
  Pre-event Command                             []                        +
  Post-event Command                            [stop_printq]             +
  Recovery Command                              []
* Recovery Counter                              [0]                       #
Notes:
The path to the menu
smitty hacmp -> Extended Configuration -> Extended Event Configuration ->
Change/Show Pre-Defined HACMP Events -> node_down
Recovery commands
If an event script fails to exit 0, recovery commands can be executed.
(Flow: HACMP event -> RC=0? If not, and the recovery counter is > 0, run the
recovery command and retry the event.)
Notes:
Recovery command event customization
Recovery commands are another customization that can be made to recover from the
failure of an HACMP event script.
start_server

                                                        [Entry Fields]
  Description
* Event Command                                 [/usr/es/sbin/cluster/>]
  Notify Command                                []
  Pre-event Command                             []                        +
  Post-event Command                            []                        +
  Recovery Command                              [/usr/local/bin/recover]
  Recovery Counter                              [3]                       #
Notes:
Recovery command menu
Here we see an example of a recovery command being added to the start_server event
script. This can handle an incorrect application startup.
Recovery commands do not execute unless the recovery counter is > 0.
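As a hedged sketch of what such a recovery command might do (the PID-file path and the assumed cause of the failed start are invented for illustration):

```shell
# Hypothetical recovery command, e.g. installed as /usr/local/bin/recover.
# Assume the application failed to start because a stale PID file was
# left behind; remove it so the retried start_server can succeed.
PIDFILE=/tmp/app01.pid

rm -f "$PIDFILE"      # clear the stale lock
# Exiting 0 lets HACMP retry the failed event script, up to the
# number of attempts given by the Recovery Counter.
```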
Points to note
The execute bit must be set on all pre-, post-, notify, and
recovery scripts.
Synchronization does not copy pre- and post-event script
content from one node to another.
You need to copy all your pre- and post-event scripts to all
nodes.
Your pre- and post-event scripts must handle non-zero exit
codes.
All scripts must declare the shell they will run in, such as:
#!/bin/ksh
Test your changes very carefully because a mistake is likely to
cause a fallover to abort.
Notes:
Test your changes
Without a doubt, the most important point to note is the last one: test your changes very
carefully. An error in a pre-, post- or recovery script/command generally becomes
apparent during a fallover; in other words, at a point in time when you can least afford it
to happen!
NIC failures
Applications
Communication Links
Volume groups
Notes:
Selective fallover logic
In general, the following scenarios and utilities can lead HACMP to selectively move an
affected resource group, using the selective fallover logic:
- In cases of service IP label failures, Topology Services, which monitors the health of
  the service IP labels, starts a network_down event. This causes the selective
  fallover of the affected resource group.
- In cases of application failures, the application monitor informs the Cluster Manager
  about the failure of the application, which causes the selective fallover of the
  affected resource group.
- In cases of WAN connection failures, the Cluster Manager monitors the status of
  the SNA links and captures some types of SNA link failures. If an SNA link
  failure is detected, the selective fallover utility moves the affected resource group.
- In cases of volume group failures, the occurrence of the AIX error label
  LVM_SA_QUORCLOSE indicates that a volume group went offline on a node in the
  cluster. This causes the selective fallover of the affected resource group.
Remember that in each case where HACMP uses selective fallover, an rg_move event
is launched in response to a resource failure. You can recognize that HACMP has used
selective fallover when you see an rg_move event run in the cluster.
Disk adapters
CPU
Notes:
Dealing with other failures detected by AIX
Remember that HACMP natively monitors only nodes, networks, and network adapters.
If you wish to monitor other devices, you can use error notification methods.
Error notification is a facility of AIX that allows the administrator to map an entry in the
AIX error log to a command to execute.
HACMP provides a smit menu to simplify the process.
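Behind that menu, what gets created is an object in the AIX errnotify ODM class. A sketch of doing the same by hand follows; the en_name and the method path are invented examples (LVM_SA_QUORCLOSE is the error label discussed later in this unit), and the stanza is only written to a file here, with the odmadd step left as a comment since it can run only on AIX.

```shell
# Build an errnotify stanza mapping an error-log label to a command.
cat > /tmp/quorum_notify.add <<'EOF'
errnotify:
        en_name = "quorum_notify"
        en_persistenceflg = 1
        en_label = "LVM_SA_QUORCLOSE"
        en_method = "/usr/local/bin/notify_admin $1"
EOF
# On AIX, you would then load it into the ODM with:
#   odmadd /tmp/quorum_notify.add
```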
Notes:
Menu path
smitty hacmp -> Problem Determination Tools -> HACMP Error Notification
Notes:
Removing automatic error notify methods
In HACMP 5.3 and later, the automatic error notify methods are added automatically;
you can come here to remove them, but doing so is not recommended. If you do remove
them, they are added again at the next synchronization, and you would have to come
back here to remove them again.
rt1s1vlp5:
rt1s1vlp6:
rt1s1vlp6:
This output highlights the fact that the virtual SCSI adapters are not recognized.
stdout: yes
stderr: no
Notes:
Listing the automatic event notification methods
Here's the full output from this screen for a sample cluster:
bondar: HACMP Resource
bondar: hdisk0
bondar: scsi0
bondar: hdisk11
bondar: hdisk5
bondar: hdisk9
bondar: hdisk7
bondar: ssa0
hudson: HACMP Resource
hudson: hdisk0
hudson: scsi0
hudson: hdisk10
hudson: hdisk4
hudson: hdisk8
hudson: hdisk6
hudson: ssa0
stdout: yes
stderr: no

HACMP Resource
hdisk0    /usr/es/sbin/cluster/diag/cl_failover
hdisk1    /usr/es/sbin/cluster/diag/cl_logerror

HACMP Resource
hdisk0    /usr/es/sbin/cluster/diag/cl_failover
hdisk1    /usr/es/sbin/cluster/diag/cl_logerror
Notes:
We already saw that there were errors when running the automatic error notification setup
on HACMP nodes that have only virtual I/O resources. Here we see that it covers the
disks, but the adapters are not protected. Should you cover them? Probably not, because
they're virtual.
Notes:
Menu path
smitty hacmp -> Problem Determination Tools -> HACMP Error Notification ->
Add a Notify Method
en_resource = ""
en_rtype = ""
en_rclass = ""
en_symptom = ""
en_err64 = ""
en_dup = ""
en_method = "/usr/lib/ras/notifymeth -l $1 -t CHECKSTOP"
The last line (en_method) is the command to execute.
Emulating errors (1 of 2)

HACMP Error Notification

  Error Label to Emulate

  Move cursor to desired item and press Enter.

  [TOP]
    LVM_SA_QUORCLOSE    rootvg
    LVM_SA_QUORCLOSE    xwebvg
    FIRMWARE_EVENT      diagela_FIRM
    PLAT_DUMP_ERR       diagela_PDE
    SERVICE_EVENT       diagela_SE
    INTRPPC_ERR         diagela_SPUR
    FCP_ARRAY_ERR6      fcparray_err
    FCS_ERR10           fcs_err10
    DISK_ARRAY_ERR2     ha_hdisk0_0
    DISK_ARRAY_ERR3     ha_hdisk0_1
    DISK_ARRAY_ERR5     ha_hdisk0_2
  [MORE...12]
Notes:
Menu path
smitty hacmp -> Problem Determination Tools -> HACMP Error Notification ->
Emulate Error Log Entry
Where practical, it would be best to cause the actual hardware error that is of concern,
to verify that the error notification method has been associated with the correct AIX
error label.
Note that the emulated error does not have the same resource name as an actual
record, but otherwise passes the same arguments to the method as the actual one.
Emulating errors (2 of 2)

Emulate Error Log Entry

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
  LVM_SA_QUORCLOSE
  xwebvg
  /usr/es/sbin/cluster/>
Notes:
Kicking off the emulation
Use this screen to start the emulation process.
Description
QUORUM LOST, VOLUME GROUP CLOSING
Probable Causes
PHYSICAL VOLUME UNAVAILABLE
Detail Data
MAJOR/MINOR DEVICE NUMBER
00C9 0000
QUORUM COUNT
0
ACTIVE COUNT
0
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
Notes:
Example emulated error record
Here is an example of the output produced by running such an emulated event. The top
of the screen is the truncated output of the error template associated with the
LVM_SA_QUORCLOSE error, which gives a brief indication of the nature of the error.
The output of an emulation has the value Resource Name: EMULATE rather than the
real resource name. If your notify method depends on this field, you have a problem
testing it via emulation; you might have to change the command to execute temporarily
while testing.
Checkpoint
1. Which of the following runs if an HACMP event script fails?
   (select all that apply)
   a. Pre-event scripts
   b. Post-event scripts
   c. Error notification methods
   d. Recovery commands
   e. Notify methods
3. True or False?
   Pre-event scripts are automatically synchronized.
4. True or False?
   Writing error notification methods is a normal part of configuring a cluster.
Unit summary
Having completed this unit, you should be able to:
Describe what an HACMP event is
Describe the sequence of events when:
The first node starts in a cluster
A new node joins an existing cluster
A node leaves a cluster voluntarily
Notes:
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1: Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
HACMP manuals: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
Unit objectives
After completing this unit, you should be able to:
Explain the concepts of NFS
Configure HACMP to support NFS
Discuss why Volume Group major numbers must be unique
when using NFS with HACMP
Outline the NFS configuration parameters for HACMP
Notes:
Objectives
In this unit, we examine how NFS can be integrated into HACMP to provide a highly
available Network File System.
NFS Server
(Diagram: an NFS server with a local JFS mount of the filesystem, which clients access
via read-write and read-only NFS mounts.)
Notes:
NFS
NFS is a suite of protocols that allow file sharing across an IP network. An NFS server
is a provider of file service (that is, a file, a directory or a file system). An NFS client is a
recipient of a remote file service. A system can be both an NFS client and server at the
same time.
NFS Server
n x nfsd and mountd
n x biod
/etc/exports
/etc/filesystems
Notes:
NFS processes
The NFS server uses a process called mountd to allow remote clients to mount a local
disk or CD resource across the network. One or more nfsd processes handle I/O on the
server side of the relationship.
The NFS client uses the mount command to establish a mount to a remote storage
resource which is offered for export by the NFS server. One or more block I/O
daemons, biod, run on the client to handle I/O on the client side.
The server maintains details of data resources offered to clients in the /etc/exports file.
Clients can automatically mount network file systems using the /etc/filesystems file.
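A sketch of those two halves outside of HACMP follows. The export options and names are illustrative; the entry is appended to a stand-in file rather than the real /etc/exports (which would normally be maintained via smit mknfsexp), and the commands that need a live NFS server are left as comments.

```shell
# Server side: describe the export in the exports file.
EXPORTS=/tmp/exports.demo        # stand-in for /etc/exports in this sketch
echo '/fsa -rw' >> "$EXPORTS"    # offer /fsa read-write
# exportfs -a                    # (re-)export everything listed

# Client side: mount the exported filesystem.
# mount aservice:/fsa /a
```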
# mount aservice:/fsa /a
The A resource group specifies:
aservice as a service IP label resource
/fsa as a filesystem resource (by
default as part of a volume group)
/fsa as an NFS filesystem to export
aservice
export /fsa
A
/fsa
# mount /fsa
uk
usa
Notes:
Combining NFS with HACMP
We can combine NFS with HACMP to achieve a Highly Available Network File System.
One node in the cluster mounts the disk resource locally and offers that disk resource
for export across the IP network. Clients optionally mount the disk resource. A second
node is configured to take over the NFS export in the event of node failure.
There is one unusual aspect to the above configuration, which should be discussed.
The HACMP cluster is exporting the /fsa file system via the aservice service IP label.
The client is mounting the aservice:/fsa file system on the local mount point /a. This
is somewhat unusual in the sense that client systems usually use a local mount point
which is the same as the NFS file system's name on the server.
In the configuration shown above, there is no particularly good reason why the client is
using a different mount point than /fsa and, in fact, the client is free to use whatever
mount point it wishes, including, of course, /fsa. Why this example uses a local mount
point of /a will become clear shortly.
Copyright IBM Corp. 1998, 2008
# mount aservice:/fsa /a
client system "sees" /fsa as /a
export /fsa
aservice
/fsa
# mount /fsa
uk
usa
Notes:
Fallover
If the node offering the NFS export should fail, a standby node takes over the shared
disk resource, locally mounts the file system, and exports the file system or directory for
remote mount.
If the client was not accessing the disk resource during the period of the fallover, then it
is not aware of the change in which node is serving the NFS export.
Note that the aservice service IP label is in the resource group, which is exporting
/fsa. The HACMP NFS server support requires that resource groups that export NFS
filesystems be configured to use IPAT because the client system is not capable of
dealing with two different IP addresses for its NFS server, depending on which node the
NFS server service happens to be running on.
                                                        [Entry Fields]
  Volume Groups                                       [aaavg]         +
  Use forced varyon of volume groups, if necessary    false           +
  Automatically Import Volume Groups                  false           +
  Filesystems (empty is ALL for VGs specified)        []              +
  Filesystems Consistency Check                       fsck            +
  Filesystems Recovery Method                         sequential      +
  Filesystems mounted before IP configured            true            +
  Filesystems/Directories to Export                   [/fsa]          +
[MORE...13]
AU548.0
Notes:
Configuring NFS for high availability
The visual shows the resource group attributes that are important for configuring an
NFS file system.
- Filesystems/Directories to Export
Specifies the filesystems to be NFS exported.
- Filesystems mounted before IP configured
When implementing NFS support in HACMP, you should also set this option. This
prevents access from a client before the filesystems are ready.
- Filesystem (empty is ALL for VGs specified)
This particular example also explicitly lists the /fsa filesystem as a resource to be
included in the resource group (see the Filesystem (empty is ALL for VGs specified)
field). This is not necessary because this field could have been left blank to indicate
that all the filesystems in the aaavg volume group should be treated as resources
within the resource group.
(Diagram: the node holding the resource group exports /fsa via the aservice label;
every node in the resource group, including that node, NFS-mounts it on /a.)
Notes:
Cross-mounting
We can use HACMP to mount an NFS exported filesystem locally on all the nodes
within the cluster. This allows two or more nodes to have access to the same disk
resource in parallel. An example of such a configuration might be a shared repository
for the product manuals (read only) or a shared /home filesystem (read-write). One
node mounts the filesystem locally, then exports the filesystem. All nodes within the
resource group then NFS mount the filesystem.
By having all nodes in the resource group act as an NFS client, including the node that
holds the resource group, it is not necessary for the takeover node to unmount the
filesystem before becoming the NFS server.
(Diagram: after fallover, the surviving node owns aservice and /fsa, and still has the
filesystem NFS-mounted on /a.)
Notes:
Fallover with a cross-mounted file system
If the left-hand node fails, then HACMP on the right-hand node initiates a fallover of the
resource group. This primarily consists of:
- Assigning or aliasing (depending on which flavor of IPAT is being used) the
aservice service IP label to a NIC
- Varying on the shared volume group and mounting the /fsa journaled filesystem
- NFS exporting the /fsa filesystem
Note that the right-hand node already has the aservice:/fsa filesystem NFS-mounted
on /a.
aservice
export /fsa
/fsa
# mount /fsa
# mount aservice:/fsa /a
usa
# mount aservice:/fsa
/a
uk
Notes:
Cross-mounting details
The key change, compared to the configuration that did not use cross-mounting, is that
this configuration's resource group lists /fsa as an NFS filesystem and specifies that it
is to be mounted on /a. This causes every node in the resource group to act as an NFS
client with aservice:/fsa mounted at /a. Only the node that actually has the resource
group is acting as an NFS server for the /fsa filesystem.
aGservice
aservice
export /fsa
/fsa
# mount /fsa
# mount aservice:/fsa /a
usa
# mount aservice:/fsa /a
uk
Notes:
Network for NFS mount
HACMP allows you to specify which network should be used for NFS exports from this
resource group.
In this scenario, we have an NFS cross-mount within a cluster that has two IP networks.
For some reason, probably that the net_ether_01 network is either a faster networking
technology or under a lighter load, the cluster administrator has decided to force the
cross-mount traffic to flow over the net_ether_01 network.
This field is relevant only if you have filled in the Filesystems/Directories to NFS
Mount field. The Service IP Labels/IP Addresses field should contain a service label
which is on the network you select.
If the network you have specified is unavailable when the node is attempting to NFS
mount, it will seek other defined, available IP networks in the cluster on which to
establish the NFS mount.
                                                        [Entry Fields]
  Volume Groups                                       [aaavg]         +
  Use forced varyon of volume groups, if necessary    false           +
  Automatically Import Volume Groups                  false           +
  Filesystems (empty is ALL for VGs specified)        []              +
  Filesystems Consistency Check                       fsck            +
  Filesystems Recovery Method                         sequential      +
  Filesystems mounted before IP configured            true            +
  Filesystems/Directories to Export                   [/fsa]          +
  Filesystems/Directories to NFS Mount                [/a;/fsa]       +
  Network For NFS Mount                               [net_ether_01]  +
[MORE...12]
Notes:
Configuring HACMP for cross-mounting
The directory or directories to be cross-mounted are specified in the
Filesystems/Directories to NFS Mount field. The network to be used for NFS cross-mounts
is optionally specified in the Network for NFS Mount field.
Cross-mount syntax
Note the rather strange /a;/fsa syntax for specifying the directory to be
cross-mounted. This rather unusual syntax is explained in the next foil.
Note that the resource group must include a service IP label, which is on the
net_ether_01 network (aservice in the previous foil).
/a;/fsa
# mount aservice:/fsa /a
Notes:
Syntax for specifying cross-mounts
The inclusion of a semi-colon in the Filesystems/Directories to NFS Mount field
indicates that the newer (and easier to work with) approach to NFS cross-mounting
described in this unit is in effect. The local mount point to be used by all the nodes in the
resource group when they act as NFS clients is specified before the semi-colon. The
NFS filesystem which they are to NFS mount is specified after the semi-colon.
Because the configuration specified in the last HACMP smit screen uses net_ether_01
for cross-mounts and the service IP label on the net_ether_01 network is aservice
(see the diagram a couple of foils back showing the two IP networks), each node in the
resource group will mount aservice:/fsa on their local /a mount point directory.
(Diagram: three systems whose free major numbers differ, for example 201, 203,
and 205.)
The command lvlstmajor will list the available major numbers for each node in the cluster.
For example:
# lvlstmajor
43...200,202,206...
The VG major number may be set at the time of creating the VG using SMIT mkvg, or by
using the -V flag on the importvg command, for example:
# importvg -V100 -y shared_vg_a hdisk2
C-SPOC will "suggest" a VG major number which is unique across the nodes
when it is used to create a shared volume group.
Notes:
VG major numbers
Volume group major numbers must be the same for any given volume group across all
nodes in the cluster. This is a requirement for any volume group that has filesystems
which are NFS exported to clients (either within or without the cluster).
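Since lvlstmajor must be run on each node and the results compared, a small helper can do the comparison. This sketch assumes the per-node free lists have already been expanded into plain space-separated numbers (real lvlstmajor output uses range notation such as 43...200,202).

```shell
# Print the first major number that is free on both nodes, given each
# node's free list as a space-separated string of numbers.
common_free_major() {
    for m in $1; do
        for n in $2; do
            if [ "$m" = "$n" ]; then
                echo "$m"
                return 0
            fi
        done
    done
    return 1                     # no common free major number found
}

common_free_major "100 101 102" "99 101 103"    # prints 101
```

The chosen number would then be used on every node, for example with importvg -V101 -y shared_vg_a hdisk2.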
Notes:
HACMP exports file
As mentioned in the visual, if you need to specify NFS options, you must use the
HACMP exports file, not the standard AIX exports file. You can use AIX smit mknfsexp
to build the HACMP exports file:
Add a Directory to Exports List

                                                        [Entry Fields]
* Pathname of directory to export                  []
  Anonymous UID                                    [-2]
  Public filesystem?                               no
* Export directory now, system restart or both     both
  Pathname of alternate exports file               [/usr/es/sbin/cluster/etc/exports]
.....
Checkpoint
1. True or False?
   HACMP supports all NFS export configuration options.
4. True or False?
   HACMP's NFS exporting feature supports only clusters of two nodes.
5. True or False?
   IPAT is required in resource groups that export NFS filesystems.
Unit summary
Key points from this unit:
HACMP provides a means to make Network File System (NFS) highly
available
Configure Filesystem/Directory to Export and Filesystems
mounted before IP started in resource group
VG major number must be the same on all nodes
Clients NFS mount using service address
In case of node failure, takeover node acquires the service address,
acquires the disk resource, mounts the file system and NFS exports the
file system
Clients see NFS server not responding during the fallover
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1: Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
HACMP manuals: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
Unit objectives
After completing this unit, you should be able to:
List reasons why HACMP can fail
Identify configuration and administration errors
List the problem determination tools available in smit
Explain why the Dead Man's Switch is invoked
Explain when the System Resource Controller kills a node
Isolate and recover from failed event scripts
Correctly escalate a problem to IBM support
Notes:
In this unit we examine some of the reasons why HACMP might fail, and how to perform
basic problem determination to recover from failure.
(Diagram: a two-node cluster, uk and usa, with resource group A; an X marks a
failure.)
Notes:
Root causes
Often the root cause of problems with HACMP is the absence of design and planning at
the outset, or poor design and planning. As you will have now figured out, a couple of
hours spent in planning HACMP reaps rewards later on in terms of how easy it is to
configure, administer, and diagnose problems with the cluster.
HACMP verifies all topology and resource configuration parameters and most IP
configuration parameters before synchronization takes place. This means that provided
the cluster synchronizes and starts successfully, the cluster should remain stable.
The prime reason for cluster failure when the environment is in production is
administrative mistakes and an absence of change control.
Typically, HACMP clusters are very stable. During the writing of this course, a customer
complained to IBM that his HACMP cluster had failed on him because a node had failed
and his workload did not get taken over by the standby node. Upon investigation it was
proven that in fact an earlier (undetected) failure had resulted in the standby node
taking over the workload and a subsequent component failure resulted in a second
point of failure. How many points of failure does HACMP handle?
Test Item                          How to test          Checked
Node Fallover
Network Adapter Swap
IP Network Failure
Storage Adapter Failure
Disk Failure
clstrmgr daemon Killed
Serial Network Failure
Disk Adapter for rootvg Failure
Application Failure
Node re-integration
Partitioned Cluster
Copyright IBM Corporation 2008
Notes:
Importance of testing
Every cluster should be thoroughly tested before going live. It is important that you
develop and document a cluster test plan for your environment. Start by taking your
cluster diagram and highlighting all the things that could go wrong, then write down
what you expect the cluster to do in response to that failure. Periodically test your
cluster to ensure that fallover works correctly, and correct your test plan if your
assumptions about what will happen differ from what HACMP actually does
(for example, shutdown -F does not cause fallover). HACMP 5.2 and later provides a
test tool, which will be discussed later in this unit.
Although it is recommended that testing of the cluster services be performed using
Move Resource Groups, it is especially important to conduct this testing if HACMP is to
be used to reduce Planned Downtime (for upgrades/maintenance) as this will be the
cluster function that will be used. This method of testing, however, should not replace
the testing of a node failure due to a crash (for example, halt -q, or simply stopping the
LPAR at the HMC).
All efforts should be made to verify application functions (user level testing) as the
cluster function tests are being performed. Verifying that the cluster functions correctly
without verifying that the application functions correctly as part of the cluster function
test is not recommended. Getting the end-user commitment is sometimes the hardest
part of this process.
Use of emulation
You can emulate some common cluster status change events. Remember that
whenever you make a change to cluster configuration, test the change before putting
the cluster back into production if at all possible.
You should always emulate a DARE change before actually doing it. If a DARE change
does not succeed during emulation, then it will definitely not succeed when you actually
do it.
mount
lsfs
netstat -i
no -a
lsdev
lsvg [<ecmvg>]
lsvg -o
lslv
lspv
ifconfig
clRGinfo
cltopinfo
clcheck_server
clstat
Notes:
Some key tools
Some of the key tools to aid you in diagnosing a problem in the cluster are detailed
above. Most problems are simple configuration issues, and hence the commands used
to diagnose them are also straightforward. Also, especially useful are the
/<log_dir>/hacmp.out and /var/hacmp/adm/cluster.log files, which document all of
the output that the HACMP event scripts generate.
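As a sketch of how these logs can be mined, the function below counts failed events in an hacmp.out-style file. The EVENT START/COMPLETED/FAILED markers are an approximation of the lines the event scripts write; check your own node's log directory and message wording before relying on it.

```shell
#!/bin/sh
# Count event-script failures in an hacmp.out-style log.
# The EVENT START/COMPLETED/FAILED markers are assumptions based on
# typical HACMP event script output; verify against your own log.
scan_events() {
    # $1 = path to an hacmp.out-style file
    grep -E 'EVENT (START|COMPLETED|FAILED)' "$1" |
        awk '/FAILED/ { bad++ } END { print "failed events:", bad+0 }'
}
```

On a node this might be run as `scan_events <log_dir>/hacmp.out`, substituting the cluster's actual log directory.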
(SMIT screen: the Problem Determination Tools menu.)
Notes:
Tools available from the problem determination tools smit menu
We will be looking at some of these tools on the following pages. Not covered are:
- View Current State. This tool executes the /usr/es/sbin/cluster/utilities/cldump
command, which gives the state of the cluster as long as at least one node has
cluster manager services running.
- HACMP Log Viewing and Management. This tool lets you watch or scan the HACMP
log files, and set options on the /<log_dir>/hacmp.out file to see event summaries
or to view the file in searchable HTML format. watch is basically a tail -f
operation, while scan views the entire file.
- Restore HACMP Configuration Database from Active Configuration.
- Release Locks Set By Dynamic Reconfiguration. This was covered in Unit 7.
- Clear SSA Disk Fence Registers.
- HACMP Trace Facility.
- HACMP Error Notification. This was covered in Unit 8.
(SMIT screen: Automatic Cluster Configuration Monitoring — verification Enabled, node Default, hour [00].)
Notes:
How it works
The clverify utility runs on one user-selectable HACMP cluster node once every 24
hours. By default, the first node in alphabetical order runs the verification at midnight.
When automatic cluster configuration monitoring detects errors in the cluster
configuration, clverify triggers a general_notification event. The output of this event is
logged in hacmp.out on each node that is running cluster services.
clverify maintains the log file /var/hacmp/log/clverify/clverify.log.
Automatic correction
HACMP Verification and Synchronization

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                      [Entry Fields]
* Verify, Synchronize or Both                         [Both]      +
* Automatically correct errors found during
  verification?                                       [No]        +
* Force synchronization if verification fails?        [No]        +
* Verify changes only?                                [No]        +
* Logging                                             [Standard]  +
Notes:
Autocorrection of some verification errors during verify
You can run automatic corrective actions during cluster verification on an inactive
cluster. Automatic correction of clverify errors is not enabled by default. You can choose
to run this useful utility in one of two modes. If you select Interactively, when clverify
detects a correctable condition related to importing a volume group or to exporting and
re-importing mount points and filesystems, you are prompted to authorize a corrective
action before clverify continues error checking. If you select Yes, when clverify detects
that any of the conditions listed below exists, it takes the corrective action
automatically without a prompt.
The following errors are detected and fixed:
- Required /etc/services entries are missing on a node.
- HACMP shared volume group time stamps are not up to date on a node.
- The /etc/hosts file on a node does not contain all HACMP-managed IP addresses.
- SSA concurrent volume groups need unique SSA node numbers.
Notes:
Test tool description
The Cluster Test Tool utility lets you test an HACMP cluster configuration to evaluate
how a cluster operates under a set of specified circumstances, such as when cluster
services on a node fail or when a node loses connectivity to a cluster network. You can
start a test, let it run unattended, and return later to evaluate the results of your testing.
You should run the tool under both low load and high load conditions to observe how
system load affects your HACMP cluster.
The Cluster Test Tool discovers information about the cluster configuration, and
randomly selects cluster components, such as nodes and networks, to be used in the
testing.
Cluster components:
- Mandatory: clstrmgrES
- Optional: clinfoES
Notes:
clstart subsystems
Listed here are the processes that appear in the startup smit menu for HACMP. It is
interesting to note that these cluster processes are not displayed by the lssrc command
when they are inactive. This was a display option (or, probably better, a non-display
option) that HACMP chose when the subsystems were defined during the install
process. This option can be changed (one subsystem at a time) using the chssys -s
subsystem_name -a -D command.
An alternative command, which works in HACMP 5.3 but is not guaranteed for the
future, is easier: lssrc -ls clstrmgrES | grep state. Look for ST_STABLE for a prolonged
period of time as an indication that cluster services have started successfully. Another
command that gives you state information in HACMP 5.3 is
/usr/es/sbin/cluster/utilities/cldump. Finally, you can use the smit path: Problem
Determination Tools -> View Current State.
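A small wrapper over that check might look like the sketch below. ST_STABLE is the state string named above; other state names vary by HACMP level, so treat this as an illustration rather than a definitive health check.

```shell
#!/bin/sh
# Report whether the cluster manager looks stable, based on the state
# string printed by: lssrc -ls clstrmgrES
# ST_STABLE is taken from the course text; other states vary by level.
cluster_stable() {
    # reads "lssrc -ls clstrmgrES" output on stdin
    grep -q 'ST_STABLE' && echo "cluster services appear stable" \
                        || echo "cluster services not stable yet"
}
# On a node:  lssrc -ls clstrmgrES | cluster_stable
```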
Subsystem         Group            PID      Status
 topsvcs          topsvcs          258248   active
 grpsvcs          grpsvcs          434360   active
 emsvcs           emsvcs           335994   active
 emaixos          emsvcs           307322   active

# lssrc -s clcomdES
Subsystem         Group            PID      Status
 clcomdES         clcomdES         13420    active

# lssrc -s ctrmc
Subsystem         Group            PID      Status
 ctrmc            rsct             2954     active
#
Notes:
Supporting subsystems
Listed here are the additional processes we would expect to find running on an HACMP
cluster node.
ping (interfaces)
netstat -rn (routing)
host (name resolution)
netstat -i and ifconfig (addresses, subnet mask)
RS232
stty < /dev/tty# (on 2 connected nodes)
Notes:
Testing your IP network
- For RS232
On one node, execute the command stty < /dev/tty#. This will hang at the
command line.
On the other connected node, execute the command stty < /dev/tty#.
This causes the tty settings to be displayed on both nodes.
- For Target Mode SSA
On one node, execute the command cat < /dev/tmssa#.tm, where the value of #
is the node ID of the target node.
On the other connected node, execute the command echo test > /dev/tmssa#.tm,
where # is the node ID of the source node.
This causes the word test to be displayed on the first node.
These tests can be used to validate that network communications are functioning
between cluster nodes over the defined cluster networks.
Was it DMS?
Notes:
Dead man's switch
The dead man's switch (DMS) is the AIX kernel extension that halts a node when it
enters a hung state that extends beyond a certain time limit. This enables another node
in the cluster to acquire the hung node's resources in an orderly fashion, avoiding
possible contention problems. If the dead man's switch is not reset in time, it can cause a
system panic and dump under certain cluster conditions.
The dead man's switch should not invoke if your cluster is not overloaded with I/O
traffic. There are steps that can be taken to mitigate the chances of the DMS invoking,
but often an invocation is the result of the machine being fundamentally overloaded.
Notes:
Causes of DMS timeouts
Most dead man's switch problems are the result of either an extremely overloaded
cluster node or a sequence of truly bizarre cluster configuration misadventures (for
example, DMS timeouts have been known to occur when the disk subsystem is
sufficiently damaged that AIX has difficulty accessing any disks at all).
Large amounts of TCP traffic over an HACMP-controlled service interface might cause
AIX to experience problems when queuing and later releasing this traffic. When the
traffic is released, it generates a large CPU load on the system and prevents
timing-critical threads from running, thus causing the Cluster Manager to issue a DMS
timeout.
HACMP, via Topology Services, logs an AIX error if the timer gets close to expiring.
The error label is TS_DMS_WARNING_ST, and you can set an error notify method to
alert you when this occurs.
The command /usr/sbin/rsct/bin/hatsdmsinfo can be used to see how often the DMS
timer is being reset.
Although we don't recommend changing the DMS time-out value, we are sometimes
asked how to increase the time-out period on the dead man's switch to make it less
likely that the DMS will pop and crash the node. There is no strict time-out setting; the
value is monitored by RSCT and is calculated as twice the value of the longest failure
detection rate of all configured HA networks in the cluster. If, for example, you have two
networks, an Ethernet and a disk heartbeat network, the Ethernet has the longer failure
detection rate, 10 seconds versus 8 for the diskhb network; so the DMS time-out is set
to 2*10, or 20 seconds. If the failure detection rate is being modified to extend the DMS
time-out, it is best to ensure that all networks have the same failure detection period. To
set the DMS timeout to about 30 seconds, while making the failure detection the same
for both networks, the custom NIM settings for the Ethernet would be:
Failure Cycle: 16
Interval between Heartbeats (seconds):
This would increase the DMS timeout from 20 seconds to 32. It would also increase the
amount of time necessary to detect a network failure by the same amount. Note that
because the DMS time-out period is directly tied to failure detection rates, increasing
the DMS time-out period will necessarily increase the delay before the secondary node
starts to acquire resources in the event of a node failure, node hang, or the loss of all
network connectivity.
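The arithmetic above can be sketched as a helper that takes each network's failure detection rate and returns the resulting DMS time-out (twice the longest rate). The sample values are the ones from the text; real rates come from your topology and RSCT settings.

```shell
#!/bin/sh
# DMS time-out = 2 x the longest failure detection rate (seconds)
# among the configured cluster networks, per the rule described above.
dms_timeout() {
    # args: failure detection rates in seconds, one per network
    max=0
    for fdr in "$@"; do
        if [ "$fdr" -gt "$max" ]; then max=$fdr; fi
    done
    echo $((2 * max))
}
# Example from the text: Ethernet 10 s, diskhb 8 s
# dms_timeout 10 8   ->  20
```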
Notes:
Extended performance tuning parameter configuration
This is the menu for changing the I/O pacing and syncd frequency.
(SMIT screen: setting the syncd frequency; new value [15].)
Notes:
Setting the syncd frequency
The syncd setting determines the frequency with which the I/O disk-write buffers are
flushed.
Frequent flushing of these buffers reduces the chance of dead man switch time-outs.
The AIX default value for syncd as set in /sbin/rc.boot is 60. It is recommended to
change this value to 15.
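The change is normally made through the HACMP SMIT panel; purely as an illustration, the helper below rewrites the syncd start line (assumed to look like `nohup /usr/sbin/syncd 60 ...` in /sbin/rc.boot) on the way to stdout, so the result can be reviewed before anything is modified.

```shell
#!/bin/sh
# Rewrite the syncd interval in an rc.boot-style file and print the
# result; nothing is modified in place. The exact syncd line may differ
# by AIX level, so review the output before applying it for real.
tune_syncd() {
    # $1 = file containing the syncd start line, $2 = new interval
    sed "s|/usr/sbin/syncd[[:space:]]*[0-9][0-9]*|/usr/sbin/syncd $2|" "$1"
}
# Review first:   tune_syncd /sbin/rc.boot 15 | diff /sbin/rc.boot -
```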
Notes:
Setting the I/O pacing values
Remember, I/O pacing and other tuning parameters should only be set to values other
than the defaults after a system performance analysis indicates that doing so will
produce the desired effect with acceptable side effects. This should be the option of last
resort. Consider changing the sensitivity of the network components in HACMP before
making this system-wide change.
Although the most efficient high- and low-water marks vary from system to system, an
initial high-water mark of 33 and a low-water mark of 24 provides a good starting point.
These settings only slightly reduce write times and consistently generate correct
fallover behavior from the HACMP software.
See the AIX 5L Performance Monitoring & Tuning Guide for more information on I/O
pacing.
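If analysis does justify the change, the water marks are attributes of sys0 set with chdev. The helper below only prints the command (a dry run) so it can be inspected before running it on a node; 33/24 are the starting values suggested above.

```shell
#!/bin/sh
# Print (do not run) the chdev command that would set the I/O pacing
# high/low-water marks on sys0. Defaults are the suggested 33/24.
plan_io_pacing() {
    hi=${1:-33}
    lo=${2:-24}
    echo "chdev -l sys0 -a maxpout=$hi -a minpout=$lo"
}
# Inspect, then run the printed command on the node if it looks right:
# plan_io_pacing   ->  chdev -l sys0 -a maxpout=33 -a minpout=24
```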
Notes:
How SRC halt works
When a subsystem is killed or crashes, the SRC looks for an entry in the
/etc/objrepos/SRCnotify ODM file. HACMP provides an entry for the clstrmgr. This entry
causes clexit.rc to run, which performs a halt -q by default.
What happens:
Group Services and clstrmgr exit on some node(s)
Notes:
Node isolation
When you have a partitioned cluster, the node or nodes on each side of the partition
detect this and run a node_down for the node or nodes on the opposite side of the
partition. If, while running this or after communication is restored, the two sides of the
partition do not agree on which nodes are still members of the cluster, a decision is
made as to which partition should remain up; the other partition is shut down by a
Group Services (GS) merge sent from nodes in the surviving partition or by a node
sending a GS merge to itself.
In clusters consisting of more than two nodes, the decision is based on which partition
has the most nodes left in it, and that partition stays up. With an equal number of nodes
in each partition (as is always the case in a two-node cluster), the node or nodes that
remain up are determined by the node number (lowest node number in cluster
remains), which is also generally the first in alphabetical order.
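The survival rule just described can be sketched as a small function: the larger partition wins; on a tie, the partition holding the lowest node number wins. Passing node numbers as space-separated lists is purely an illustration — HACMP derives them from the cluster topology.

```shell
#!/bin/sh
# Decide which side of a partitioned cluster stays up, per the rule
# above: most nodes wins; a tie goes to the lowest node number.
surviving_partition() {
    # $1 = node numbers in partition A, $2 = node numbers in partition B
    a_count=$(echo $1 | wc -w)
    b_count=$(echo $2 | wc -w)
    if [ "$a_count" -gt "$b_count" ]; then
        echo A
    elif [ "$b_count" -gt "$a_count" ]; then
        echo B
    else
        a_min=$(echo $1 | tr ' ' '\n' | sort -n | head -n 1)
        b_min=$(echo $2 | tr ' ' '\n' | sort -n | head -n 1)
        if [ "$a_min" -lt "$b_min" ]; then echo A; else echo B; fi
    fi
}
# Two-node cluster, one node per partition: the lowest number survives.
# surviving_partition "2" "1"   ->  B
```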
Avoid bridges
Notes:
What can go wrong?
A partitioned cluster can result in data divergence (two cluster nodes each gain access
to half of the disk mirrors and proceed to perform updates on their halves). This
scenario can be extremely difficult to recover from completely, because the changes
made by the two nodes might be fundamentally incompatible and impossible to
reconcile.
Notes:
First Failure Data Capture (FFDC) uses the clsnap command under the covers (local
collection only). The clsnap utility runs with the report option first, to verify that there is
enough space.
The user can disable these specific FFDC actions by setting the environment variable
FFDC_COLLECTION to disable before starting cluster services.
Notes:
The config_too_long event
For each cluster event that does not complete within the specified event duration time,
config_too_long messages are logged in the hacmp.out file and sent to the console
according to the following pattern:
- The first five config_too_long messages appear in the hacmp.out file at 30-second
intervals.
- Each next set of five messages appears at an interval that is double the previous
interval, until the interval reaches one hour.
- The messages are then logged every hour until the event completes or is terminated
on that node.
This error can occur if an event script fails or does not complete within a customizable
time period, which by default is 360 seconds.
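The timing pattern works out to a concrete schedule; as a sketch, the function below prints the gaps (in seconds) between successive config_too_long messages — five at 30 seconds, five at 60, and so on, capped at one hour.

```shell
#!/bin/sh
# Print the gaps (seconds) between the first N config_too_long
# messages: 5 x 30 s, then 5 x 60 s, doubling until capped at 3600 s.
ctl_intervals() {
    n=$1; iv=30; i=0; count=0; out=""
    while [ "$count" -lt "$n" ]; do
        out="$out $iv"
        count=$((count + 1))
        i=$((i + 1))
        if [ "$i" -eq 5 ] && [ "$iv" -lt 3600 ]; then
            i=0
            iv=$((iv * 2))
            if [ "$iv" -gt 3600 ]; then iv=3600; fi
        fi
    done
    echo "${out# }"
}
# ctl_intervals 12  ->  30 30 30 30 30 60 60 60 60 60 120 120
```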
Notes:
smit menu
smit hacmp -> Extended Configuration -> Extended Event Configuration ->
Change/Show Time Until Warning
Notes:
Why recovery from script failure is necessary
If an event script fails or takes too long, the "Please check event status" message starts
to display, as described on the previous visual. HACMP stops processing cluster events
until the situation is resolved. If the problem is that an event took too long, then the
problem might soon solve itself. If an HACMP event script has actually failed, then
manual intervention is required.
The procedure
The procedure is outlined in the visual above. Using the /var/hacmp/adm/cluster.log
file with the command grep EVENT /var/hacmp/adm/cluster.log | more makes it
easier to find when the config_too_long event first occurred. Be sure to find the earliest
AIX error message, not just the first AIX error message. You must manually complete
what the event would have done before doing Recover From Script Failure, which is
described on the next visual. You can also use the cluster.log in combination with
hacmp.out.
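The grep shown above can be narrowed a little to pull out the first failed event directly. The sample log lines used in testing this sketch are an approximation of cluster.log's format, so adjust the patterns to what your node actually logs.

```shell
#!/bin/sh
# Show the first EVENT line that reports a failure in a
# cluster.log-style file. Message wording is an assumption; check
# your node's actual cluster.log format.
first_failed_event() {
    # $1 = path to a cluster.log-style file
    grep EVENT "$1" | grep FAILED | head -n 1
}
# On a node:  first_failed_event /var/hacmp/adm/cluster.log
```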
Select a Node
usa
uk
Notes:
What this procedure does
This SMIT menu entry can be used to recover from a script failure. This does not mean
that HACMP fixes problems in event scripts, but this menu is used to allow the cluster
manager to continue to the next event following an event script failure that you have
identified and manually corrected. Select the node experiencing the problem and press
Enter.
A troubleshooting methodology
Save the log files from all available nodes as soon as possible
Attempt to duplicate the problem
Approach the problem methodically
Distinguish between what you know and what you assume
Keep an open mind
Isolate the problem
Go from the simple to the complex
Make one change at a time
Stick to a few simple troubleshooting tools
Do not neglect the obvious
Watch for what the cluster is not doing
Keep a record of the tests you have completed
Notes:
Troubleshooting suggestions
Save the log files from every available cluster node while they are still available
Things might get much worse than they already are. Having access to all relevant
cluster log files and application log files could prove very important. These log files
might be overwritten while you are investigating the problem or they might be lost
entirely if more hardware failures occur. Save copies of them very early in the
troubleshooting exercise to ensure that they are not lost.
Attempt to duplicate the problem
While keeping in mind the importance of not making a bad situation worse by causing
even more problems, it is often useful to try to duplicate the circumstances that are
believed to have been in effect when the problem occurred; this can lead to a greater
understanding of exactly what went wrong.
Notes:
What to do when contacting IBM
The visual above summarizes the steps. It is a very good idea to collate as much of this
information as possible in advance of having a problem, especially snapshots and the
cluster diagram. If you do not already have this information assembled for your existing
clusters, you are strongly recommended to do so as soon as you get back to your office.
Checkpoint
1. What is the most common cause of cluster failure?
(Select all that apply.)
a. Bugs in AIX or HACMP
b. Cluster administrator error
c. Marauding space aliens from another galaxy
d. Cosmic rays
e. Poor/inadequate cluster design
2. True or False?
Event emulation can emulate all cluster events.
3. If the cluster manager process dies, what will happen to the cluster
node?
a. It continues running but without HACMP to monitor and protect it.
b. It continues running AIX but any resource groups will fallover.
c. Nobody knows because this has never happened before.
d. The System Resource Controller sends an e-mail to root and issues a halt -q.
e. The System Resource Controller sends an e-mail to root and issues a shutdown -F.
4. True or False?
A non-IP network is strongly recommended. Failure to include a non-IP
network can cause the cluster to fail or malfunction in rather ugly ways.
Notes:
Unit summary
Having completed this unit, you should be able to:
List reasons why HACMP can fail
Identify configuration and administration errors
Explain why the Dead Man's Switch invokes
Explain when the System Resource Controller will kill a node
Isolate and recover from failed event scripts
Correctly escalate a problem to IBM support
Notes:
2. Which of the following items are examples of topology components in HACMP?
(Select all that apply.)
a. Node
b. Network
c. Service IP label
d. Hard disk drive
3. True or False?
All nodes in an HACMP cluster must have roughly equivalent performance
characteristics.
4. Which of the following is a characteristic of high availability?
a. High availability always requires specially designed hardware components.
b. High availability solutions always require manual intervention to ensure
recovery following fallover.
c. High availability solutions never require customization.
d. High availability solutions use redundant standard equipment (no specialized
hardware).
True or False?
A thorough design and detailed planning is required for all high availability solutions.
Checkpoint solutions
1. True or False?
Resource Groups can be moved from node to node.
2. True or False?
HACMP/XD is a complete solution for building
geographically distributed clusters.
3. Which of the following capabilities does HACMP not
provide? (Select all that apply.):
a. Time synchronization
b. Automatic recovery from node and network adapter failure
c. System Administration tasks unique to each node; back-up
and restoration
d. Fallover of just a single resource group
4. True or False?
All nodes in a resource group must have equivalent
performance characteristics.
3. True or False?
Persistent node IP labels are not supported for IPAT via IP
replacement.
4. True or False?
There are no exceptions to the rule that, on each node, each NIC on
the same LAN must have an IP address in a different subnet.
(The HACMP 5.1 heartbeat over IP aliases feature is the exception to this rule.)
2. True or False?
All networking technologies supported by HACMP support IPAT via IP aliasing.
3. True or False?
All networking technologies supported by HACMP support IPAT via IP replacement.
4. If the left node has NICs with the IP addresses 192.168.20.1 and 192.168.21.1 and the
right hand node has NICs with the IP addresses 192.168.20.2 and 192.168.21.2, then
which of the following options are valid service IP addresses if IPAT via IP aliasing is
being used? (Select all that apply.)
a. (192.168.20.3 and 192.168.20.4) or (192.168.21.3 and 192.168.21.4)
b. 192.168.20.3 and 192.168.20.4 and 192.168.21.3 and 192.168.21.4
c. 192.168.22.3 and 192.168.22.4
d. 192.168.23.3 and 192.168.24.3
5. If the left node has NICs with the IP addresses 192.168.20.1 and 192.168.21.1 and the
right hand node has NICs with the IP addresses 192.168.20.2 and 192.168.21.2, then
which of the following options are valid service IP addresses if IPAT via IP replacement is
being used? (Select all that apply.)
a. (192.168.20.3 and 192.168.20.4) or (192.168.21.3 and 192.168.21.4)
b. 192.168.20.3, 192.168.20.4, 192.168.21.3 and 192.168.21.4
c. 192.168.22.3 and 192.168.22.4
d. 192.168.23.3 and 192.168.24.3
Checkpoint solutions
1. True or False?
Clients are required to exit and restart their application after a
fallover.
2. True or False?
All client systems are potentially directly affected by the ARP cache
issue.
3. True or False?
clinfo must not be run both on the cluster nodes and on the
client systems.
2. True or False?
3. True or False?
SCSI
SSA
FC
All of the above
2. True or False?
3. True or False?
4. True or False?
No special considerations are required when using SAN-based storage units
(DS8000, ESS, EMC, HDS, and so forth).
5. True or False?
hdisk numbers must map to the same PVIDs across an entire HACMP
cluster.
Checkpoint solutions
1. True or False?
Lazy update attempts to keep VGDA constructs in sync between
cluster nodes (reserve/release-based shared storage protection).
3. True or False?
Quorum should always be disabled on shared volume groups.
4. True or False?
Filesystem and logical volume attributes cannot be changed while
the cluster is operational.
5. True or False?
An enhanced concurrent volume group is required for the heartbeat
over disk feature.
Checkpoint solutions
1. True or False?
Applications are defined to HACMP in a configuration file that lists
what binary to use.
2. What policies would be the best to use for a 2-node active-active
cluster using IPAT to minimize both applications running on the same node?
a. home, next, never
b. first, next, higher
c. distribution, next, never
d. all, error, never
e. home, next, higher
Checkpoint solutions
1. Which component detects an adapter failure?
a. Cluster Manager
b. RSCT
c. clcomd
d. clinfo
a. Cluster Manager   b. RSCT   c. clsmuxpd   d. clinfo
a. Cluster Manager   b. RSCT   c. clcomd   d. clinfo
a. Cluster Manager   b. RSCT   c. clcomd   d. clinfo
Checkpoint solutions
1.
True or False?
It is possible to configure a recommended simple two-node cluster environment
using just the standard configuration path.
You can't create the non-IP network from the standard path.
2.
In which of the top-level HACMP menu choices is the menu for starting and
stopping cluster nodes?
a.
b.
c.
d.
3.
In which of the top-level HACMP menu choices is the menu for defining a non-IP heartbeat network?
a.
b.
c.
d.
4.
True or False?
It is possible to configure HACMP faster by having someone help you on the other
node.
5.
True or False?
You must specify exactly which filesystems you want mounted when you put
resources into a resource group.
2. You have decided to add a third node to your existing two-node
HACMP cluster. What very important step follows adding the node
definition to the cluster configuration (whether through Standard or
Extended Path)?
a. Take a well deserved break, bragging to co-workers about
your success.
b. Install HACMP software.
c. Configure a non-IP network.
d. Start Cluster Services on the new node.
e. Add a resource group for the new node.
__________________________________________________
True or False?
Using C-SPOC reduces the likelihood of an outage by reducing the
likelihood that you will make a mistake.
2.
True or False?
3.
4.
True or False?
It does not matter which node in the cluster is used to initiate a C-SPOC
operation.
5.
3. True or False?
It is possible to roll back from a successful DARE operation using an
automatically generated snapshot.
4. True or False?
Running a DARE operation requires three separate copies of the
HACMP ODM.
5. True or False?
Cluster snapshots can be applied while the cluster is running.
6. What is the purpose of the dynamic reconfiguration lock?
a. To prevent unauthorized access to DARE functions
b. To prevent further changes being made until a DARE operation has
completed
c. To keep a copy of the previous configuration for easy rollback
Checkpoint solutions
1.
2.
3.
4.
5.
True or False?
A star configuration is a good choice for your non-IP networks.
True or False?
Using DARE, you can change from IPAT via aliasing to IPAT via
replacement without stopping the cluster.
True or False?
RSCT will automatically update /etc/filesystems when using enhanced
concurrent mode volume groups.
True or False?
With HACMP V5.4, a resource group's priority override location can be
cancelled by selecting a destination node of
Restore_Node_Priority_Order.
You want to create an Enhanced Concurrent Mode Volume Group that
will be used in a Resource Group that will have an Online on Home
Node Startup policy. Which C-SPOC menu should you use?
a. HACMP Logical Volume Management
b. HACMP Concurrent Logical Volume Management
6.
You want to add a logical volume to the volume group you created in the
question above. Which C-SPOC menu should you use?
a. HACMP Logical Volume Management
b. HACMP Concurrent Logical Volume Management
Unit 8 - Events
1. Which of the following are examples of primary HACMP events? (Select all that
apply.)
a. node_up
b. node_up_local
c. node_up_complete
d. start_server
e. rg_up
2. When a node joins an existing cluster, what is the correct sequence for these
events?
a.
b.
c.
d.
Unit 8 - Events
Checkpoint solutions
1. Which of the following runs if an HACMP event script fails?
(select all that apply)
a.Pre-event scripts
b.Post-event scripts
c.Error notification methods
d.Recovery commands
e.Notify methods
3. True or False?
Pre-event scripts are automatically synchronized.
4. True or False?
Writing error notification methods is a normal part of
configuring a cluster.
Checkpoint solutions
1.
True or False? *
2.
3.
/abc is the name of the filesystem that is exported and /xyz is where it should be
mounted
/abc is where the filesystem should be mounted, and /xyz is the name of the
filesystem that is exported
4.
True or False? **
5.
True or False?
Checkpoint solutions
1.
What is the most common cause of cluster failure? (Select all that apply.)
a. Bugs in AIX or HACMP
b. Cluster administrator error
c. Marauding space aliens from another galaxy
d. Cosmic rays
e. Poor/inadequate cluster design
2. True or False?
Event emulation can emulate all cluster events.
3. If the cluster manager process dies, what will happen to the cluster
node?
a. It continues running but without HACMP to monitor and protect it.
b. It continues running AIX but any resource groups will fallover.
c. Nobody knows because this has never happened before.
d. The System Resource Controller sends an e-mail to root and
issues a halt -q.
e. The System Resource Controller sends an e-mail to root and issues a
shutdown -F.
4. True or False?
A non-IP network is strongly recommended. Failure to include a non-IP
network can cause the cluster to fail or malfunction in rather ugly ways.
*The correct answer is almost certainly "cluster administrator error" although
"poor/inadequate cluster design" would be a very close second.
Checkpoint solutions
1. For IPAT via replacement (select all that apply)
a. Each service IP address must be in the same subnet as one of
the non-service addresses
b. Each service IP address must be in the same subnet
c. Each service IP address cannot be in any non-service address subnet
2.
True or False?
If the takeover node is not the home node for the resource group and
the resource group does not have a Startup policy of Online Using
Distribution Policy, the service IP address replaces the IP address of a
NIC with an IP address in the same subnet as the subnet of the
service IP address.
3.
True or False?
In order to use HWAT, you must enable and complete the
ALTERNATE ETHERNET address field in the SMIT devices menu.
4.
True or False?
You must stop the cluster in order to change from IPAT via aliasing to
IPAT via replacement.
==========================================
Enhancements of the HACMP Software
==========================================
------------------------------------
5.4.1 Enhancements
------------------------------------
Integrated support for utilizing AIX Workload Partition (WPAR) to maintain high
availability for your applications by configuring them as a resource group and assigning
the resource group to an AIX WPAR. By using HACMP in combination with AIX WPAR,
you can leverage the advantages of application environment isolation and resource
control provided by AIX WPAR along with the high availability feature of HACMP
V5.4.1.
HACMP/XD support of PPRC Consistency Groups to maintain data consistency for
application-dependent writes on the same logical subsystem (LSS) pair or across
multiple LSS pairs. HACMP/XD responds to PPRC consistency group failures by
automatically freezing the pairs and managing the data mirroring.
A new Geographical Logical Volume Manager (GLVM) Status Monitor that provides the
ability to monitor GLVM status and state. These monitors enable you to keep better
track of the status of your application data when using the HACMP/XD GLVM option for
data replication.
Improved support for NFS V4, which includes additional configuration options, as well
as improved recovery time. HACMP can support both NFS V4 and V2/V3 within the
same high availability environment.
Usability improvements for the WebSMIT Graphical User Interface, which include the
ability to customize the color and appearance of the display. Improvements to First
Failure Data Capture and additional standardized logging increase the reliability and
serviceability of HACMP 5.4.1.
New options for detecting and responding to a partitioned cluster. Certain failures or
combinations of failures can lead to a partitioned cluster, which, in the worst case, can
lead to data divergence (out of sync data between the primary and backup nodes in a
cluster). HACMP V5.4.1 introduces new features for detecting a partitioned cluster and
avoiding data divergence through earlier detection and reporting.
Serviceability Improvements for HACMP. New log files have been added. The default
locations of all managed log files have been moved to a subdirectory of /var/hacmp.
------------------------------------
5.4.0 Enhancements
------------------------------------
HACMP Implementation
- Start and restart cluster services automatically according to how you define the
resources.
- Stop cluster services and also bring the resources and applications offline, move
them to other nodes, or keep them running on the same nodes (but stop managing
them for high availability).
- Terminology that describes stopping cluster services has changed:
Instead of stopping cluster services gracefully, this option is known as stopping
cluster services and bringing resource groups offline. The cluster services are
stopped.
Instead of stopping cluster services gracefully with takeover, this option is known
as stopping cluster services and moving the resource groups to other nodes.
Instead of a forced down, this option is known as stopping cluster services
immediately and placing resource groups in an unmanaged state. This option
leaves resource groups on the local node active.
Resource Group Management (clRGmove) improvements
- Improved SMIT interface.
- Easier to move the resource groups for cluster management.
- When you move a resource group, you can move it without setting the Priority
Override Location (POL) for the node to which it was moved. POL is a setting you
had to specify for manually moved resource groups in releases prior to HACMP 5.4.
- Improved handling of non-concurrent resource groups with No Fallback resource
group and site policies.
- Clear method to maintain the previously configured behavior for a resource group.
- Improved status and troubleshooting with WebSMIT and clRGinfo.
Verification enhancements
- The final verification report lists any nodes, networks and/or network interfaces that
are in the 'failed' state at the time that cluster verification is run. The final verification
report also lists other 'failed' components, if accessible from the Cluster Manager,
such as applications, resource groups, sites, and application monitors that are in the
suspended state.
- Volume group verification checks have been restructured for faster processing.
- Messages have been reformatted for consistency and to remove repetitious entries.
- New Verification checks:
Can each node reach each other node in the cluster through non-IP
connections?
Are netmasks and broadcast addresses valid?
- You can configure multiple XD_data networks for the mirroring function. IPAT via IP
Aliases is not allowed for HAGEO.
- You can configure multiple XD_rs232 networks for cluster heartbeating.
====================================
Installation and Migration Notes
====================================
For HACMP version 5.4, planning and installation information is split into two separate
guides: the Planning Guide and the Installation Guide.
HACMP for Linux: Installation and Administration Guide v5.4 is the first edition of a new
manual.
The Online Planning Worksheets (OLPW) application is now available for download
from the installable image, worksheets.jar, that is located at this URL:
http://www.ibm.com/systems/p/ha/ha_olpw.html
Once you accept the license agreement, locate the worksheets.jar file and click on it.
Or, run the following command from the AIX 5L command line:
java -jar worksheets.jar
You can apply a PTF to HACMP 5.4.1 on an individual node using rolling migration,
while your critical applications and resources continue running on that node, although
they will not be highly available during the upgrade.
Methods of installation and migration supported in previous releases of HACMP are still
supported.
HACMP 5.4.1 is a modification release. There are both base-level filesets and update (ptf)
images. Users should use a consistent method for upgrading their HACMP cluster nodes.
Do not mix base-level filesets on some nodes and update (ptf) images on others in the
same cluster.
-------------------------------------------------------------------------
This note applies only to 64-bit systems. You may ignore this note if all of your cluster
nodes are 32-bit.
The NFS daemon nfsd must be restarted on each cluster node with grace periods enabled
after installing cluster.es.nfs.rte before configuring NFSv4 exports. This step is required;
otherwise, NFSv4 exports will fail to export with the misleading error message
exportfs: <export_path>: No such file or directory
The following commands enable grace periods and restart the NFS daemon.
chnfs -I -g on -x
stopsrc -s nfsd
startsrc -s nfsd
Please note that this will impact the availability of all exported filesystems on the
machine; therefore, the best time to perform this step is when all resource groups with
NFS exports are offline or failed over to another node in the resource group.
The behavior of stopping a cluster node with the option to unmanage resource groups
(previously known as the force option) was significantly modified with HACMP 5.4.0. Prior
to the HACMP 5.4.0 release, this operation did bring the cluster manager daemon to a
stopped state and this was reflected by clstat showing the cluster node's status as
DOWN.
The modifications to this feature in HACMP 5.4.0 necessitated leaving the cluster manager
daemon in an online state (there were multiple motivations for this change in behavior;
one was that it was required to allow Enhanced Concurrent Mode Volume Groups to
remain online). Consequently, clstat run on an HACMP 5.4.0 or later cluster will display
such cluster nodes' status as UP instead of DOWN, as they were displayed before HACMP
5.4.0.
One other thing to keep in mind is that when migrating a cluster from before HACMP 5.4.0
to HACMP 5.4.0 or later, where some nodes are pre-5.4.0 and others are 5.4.0 or later,
clstat run on cluster node A will display cluster node B's status following the conventions of
cluster node A. For example, if clstat is run on a 5.4.1 cluster node, it will display all forced
down nodes as UP whereas running the same clstat command on an HACMP 5.3.0 cluster
node will show those same nodes as DOWN.
Install the HACMP 5.3 APAR IY85489 to avoid having to start a 5.3 node from a 5.4 node.
Unless you have this APAR, when you have upgraded any node to HACMP 5.4, if you need
to start a 5.3 node while any 5.4 nodes are active, you must start the 5.3 node from a 5.4
node.
If you are upgrading and have nodes that are 5.2 or earlier and must start the 5.2 or earlier
node, start it from a downlevel node of 5.2 or lower.
==============================================
Required Release of AIX 5L for HACMP 5.4.1
==============================================
AIX 5L 5.2 ML8 with RSCT version 2.3.9 (APAR IY84921) or higher
AIX 5L 5.3 ML4 with RSCT version 2.4.5 (APAR IY84920) or higher
AIX 5L 6.1 with RSCT version 2.5.0 or higher
==============================================
HACMP Configuration Restrictions
==============================================
HACMP configuration restrictions remain the same as in previous releases and are as
follows:
Maximum nodes per cluster: 32
Maximum number of sites: 2
Minimum number of nodes per site: 1
Copyright IBM Corp. 1998, 2008
======================
Notes on Functionality
======================
===================================
HACMP 5.4.1 Documentation
===================================
--------------------------------
Order Numbers and Document Names
--------------------------------
Order numbers for 5.4.1 documentation are as follows:
SC23-4864-10  Concepts and Facilities Guide
SC23-4861-10  Planning Guide
SC23-5209-01  Installation Guide
SC23-4862-10  Administration Guide
SC23-5177-04  Troubleshooting Guide
SC23-4867-09  Master Glossary
SC23-4865-10
SC23-4863-11
SA23-1338-06
SC23-4877-08
SC23-5178-04
SC23-5179-04
SC23-5210-01
SC23-5211-01
------------------------------------------------------
Documentation for HACMP 5.4.1 is supplied in PDF format. You may want to install the
documentation before doing the full install of the product, to read the chapters on
installation procedures or the description of migration.
Image cluster.doc.en_US.es
--------------------------
cluster.doc.en_US.es.html    HAES HTML Documentation - U.S. English
cluster.doc.en_US.es.pdf     HAES PDF Documentation - U.S. English

Image cluster.doc.en_US.glvm
----------------------------
cluster.doc.en_US.glvm.html  HACMP GLVM HTML Documentation - U.S. English
cluster.doc.en_US.glvm.pdf   HACMP GLVM PDF Documentation - U.S. English
Image cluster.doc.en_US.pprc
----------------------------
cluster.doc.en_US.pprc.html  PPRC Web-based HTML Documentation - U.S. English
cluster.doc.en_US.pprc.pdf   PPRC PDF Documentation - U.S. English
Image cluster.doc.en_US.assist
------------------------------
cluster.doc.en_US.assist.db2.html
cluster.doc.en_US.assist.websphere.html  WebSphere HTML Documentation - U.S. English
cluster.doc.en_US.assist.websphere.pdf   WebSphere PDF Documentation - U.S. English
After you install the documentation, store it on a server that is accessible through the
Internet. You can view the documentation in the Mozilla Firefox browser.
Note: Installing all of the documentation requires about 46 MB of space in the /usr
filesystem. (PDF files = 26 MB, HTML files = 20 MB.)
5. Select all filesets that you wish to install and execute the command.
/usr/share/man/info/en_US/cluster/HAES
The titles of the HACMP for AIX 5L products, Version 5.4.1, documentation set are:
---------------------------------------------
Accessing Documentation
---------------------------------------------
NOTE: The Smart Assist Guides and the Smart Assist Developer's Guide are installed with
the base fileset. They are described in the separate Smart Assists release notes.
Use the following command to determine the exact files loaded into product directories
when installing HACMP for AIX 5L, version 5.4.1:
lslpp -f cluster*
==================
PRODUCT MAN PAGES
==================
Man pages for HACMP commands and utilities are installed in the following directory:
/usr/share/man/cat1
========================
Accessing IBM on the Web
========================
========
Feedback
========
IBM welcomes your comments. You can send any comments via e-mail to:
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1:
Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
http://www-03.ibm.com/systems/p/library/hacmp_docs.html
HACMP manuals
Unit objectives
After completing this unit, you should be able to:
Explain and set up IP Address Takeover (IPAT) via IP
replacement
AU548.0
Notes:
[Figure: two-node cluster before takeover; each NIC holds its ODM-configured address:
9.47.10.1 and 9.47.11.1 on one node, 9.47.10.2 and 9.47.11.2 on the other.]
* See earlier discussion of heartbeating and failure diagnosis for explanation of why
Notes:
Requirements
Keep the following items in mind when you configure a network for IPAT via IP
replacement:
- There must be at least one logical IP subnet that has a communication interface
(NIC) on each node. (In HACMP 4.5 terminology, these were called boot adapters.)
- Each service IP address must be in the same logical IP subnet as one of the
non-service addresses. Contrast with IPAT via IP aliasing, where service addresses
are required to not be in a boot subnet.
- If you have more than one service IP address, they must all be in the same subnet.
The reason for this will become clear when we discuss what happens during a
takeover; see "IPAT via IP replacement after a node fails" on page C-8.
- None of the other non-service addresses may be in the same subnet as the service
IP address (this is true regardless of whether IPAT via IP replacement is being used
because the NICs on each node are required to be on different IP subnets in order
to support heartbeating).
- All network interfaces must have the same subnet mask.
NIC              IP Label   IP Address
en0              n1-if1     192.168.10.1
en1              n1-if2     192.168.11.1
en0              n2-if1     192.168.10.2
en1              n2-if2     192.168.11.2
Service address  appA-svc   192.168.10.22
Service address  appB-svc   192.168.10.25

Subnet           IP labels
192.168.10/24    n1-if1, n2-if1, appA-svc, appB-svc
192.168.11/24    n1-if2, n2-if2
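The subnet rules can be checked by hand during planning. As a rough illustration only (this is not an HACMP tool; the function names are ours, and the addresses are taken from the example network above), a small shell function can tell whether two addresses share a subnet under a given mask:

```shell
# net_of: compute the network address of an IP under a dotted-quad mask.
net_of() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    IFS=. read -r w x y z <<EOF
$2
EOF
    echo "$((a & w)).$((b & x)).$((c & y)).$((d & z))"
}

# same_subnet IP1 IP2 MASK: print "yes" if both fall in the same subnet.
same_subnet() {
    if [ "$(net_of "$1" "$3")" = "$(net_of "$2" "$3")" ]; then
        echo yes
    else
        echo no
    fi
}

# appA-svc shares the boot subnet of n1-if1 (required for IPAT via replacement)
same_subnet 192.168.10.1 192.168.10.22 255.255.255.0
# n1-if2 is on the standby subnet, so it must NOT be in the service subnet
same_subnet 192.168.11.1 192.168.10.22 255.255.255.0
```

The first call reports yes and the second no, matching the rule that the service labels live in exactly one of the non-service subnets.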
[Figure: resource group started on its home node; the service address 9.47.10.22 has
replaced the boot NIC's ODM address, while 9.47.11.1, 9.47.10.2, and 9.47.11.2 remain
on the other NICs (ODM).]
Notes:
Operation
When the resource group comes up on its home node, the resource group's service IP
address replaces the interface IP address of the NIC (AIX ODM) that is in the same
subnet as the service IP label (that is, the boot adapter in HACMP 4.x terminology).
Note that this approach implies that there cannot be two resource groups in the cluster
that both use IPAT via IP replacement and use the same node as their home node
unless their respective service IP addresses are in different subnets (in other words,
associated with different physical networks).
Also, since the service IP address replaces the existing IP address on the NIC, it is not
possible to have two or more service IP addresses in the same resource group, which
are in the same IP subnet (as there will not be an adapter to assign the second service
IP address to).
When the resource group comes up on any node other than its home node, the
resource group's service IP address replaces the interface IP address of one of the
NICs which is not in the same subnet as the service IP address (this is primarily to allow
some other resource group to use the node as its home node).
[Figure: interface swap after a NIC failure; NIC A now holds 9.47.11.1 (ODM) and NIC B
holds the service address 9.47.10.22, with 9.47.10.2 and 9.47.11.2 (ODM) on the peer
node.]
Notes:
Interface failure
If a communications interface (NIC A), which is currently assigned an IPAT via IP
replacement service IP address, fails, then HACMP moves the service IP address to
one of the other communication interfaces (NIC B) on the same node (to one of the
standby adapters using HACMP 4.x terminology).
If there are no available (that is, functional) NICs left on the relevant network, then
HACMP initiates a fallover.
Interface swap
The failed communications interface (NIC A) is then reconfigured with the address of
the communication interface (NIC B) as this allows the heartbeat mechanism to watch
for when the failed communication interface (NIC A) recovers.
[Figure: after a node failure, the service address 9.47.10.22 replaces 9.47.10.2 (ODM)
on the takeover node.]
Notes:
Node failure
If the node currently responsible for a resource group using IPAT via IP replacement
fails, then HACMP initiates a fallover. When the resource group comes up on the
takeover node, the service IP addresses are assigned to NICs on the fallover node:
- Home node or Startup policy of Online Using Distribution Policy (rotate in
HACMP 4.x terminology)
If the takeover node is the home node for the resource group or the resource group
has a Startup policy of Online Using Distribution Policy (rotate in HACMP 4.x
terminology), the Service IP addresses replace the IP addresses of a
communications interface (NIC) with an IP address in the same subnet as the
service IP address.
- Not the home node and not Online Using Distribution Policy
If the takeover node is not the home node for the resource group and the resource
group does not have a Startup policy of Online Using Distribution Policy, the service IP
address replaces the IP address of a NIC that is not in the same subnet as the service
IP address.
Notes:
Advantages
Probably the most significant advantage of IPAT via IP replacement is that it supports
hardware address takeover (HWAT), which will be discussed in a few pages.
Another advantage is that it requires fewer subnets. If you are limited in the number of
subnets available for your cluster, this may be important.
Note: Another alternative, if you are limited on the number of subnets you have
available, is to use heartbeating via IP aliases. See Heartbeating Over IP Aliases in the
HACMP for AIX, Version 5.4.1 Planning Guide.
Disadvantages
Probably the most significant disadvantages are that IPAT via IP replacement limits the
number of service IP labels per subnet per resource group on one communications
interface to one and makes it rather expensive (and complex) to support lots of
resource groups in a small cluster. In other words, you need more network adapters to
support more applications.
Also, IPAT via replacement usually takes more time than IPAT via aliasing.
Note that HACMP tries to keep the service IP Labels available by swapping IP
addresses with other communication interfaces (standby adapters in HACMP 4.x
terminology), even if there are no resource groups currently on the node that use IPAT
via IP replacement.
Notes:
Review
When using IPAT via aliasing, you can use AIX's gratuitous ARP feature to update
client and router ARP caches after a takeover. However, there may be issues.
to be causing the cluster and the cluster administrator far more serious problems than
the ARP cache issue involves.)
Suggestion:
Do not get involved with using either clinfo or HWAT to deal with
ARP cache issues until you've verified that there actually are ARP
issues which need to be dealt with.
Notes:
If gratuitous ARP is not supported
HACMP supports three alternatives to gratuitous ARP. The first two are discussed in
Unit 3. We will discuss the third option here.
Notes:
Hardware address takeover
Hardware Address Takeover (HWAT) is the most robust method of dealing with the ARP
cache issue as it ensures that the hardware address associated with the service IP
address does not change (which avoids the whole issue of whether the client system's
ARP cache is out-of-date).
The essence of HWAT is that the cluster configurator designates a hardware address
that is to be associated with a particular service IP address. HACMP then ensures that
whichever NIC the service IP address is on also has the designated hardware address.
HWAT considerations
There are a few points which must be kept in mind when contemplating HWAT:
- The hardware address that is associated with the service IP address must be unique
within the physical network that the service IP address is configured for.
- HWAT is not supported by IPAT via IP aliasing because each NIC can have more
than one IP address, but each NIC can only have one hardware address.
- HWAT is only supported for Ethernet, token ring, and FDDI networks (MCA FDDI
network cards do not support HWAT). ATM networks do not support HWAT.
- HWAT increases the takeover time (usually by just a few seconds).
- HWAT is an optional capability which must be configured into the HACMP cluster
(we will see how to do that in a few minutes).
- Cluster nodes using HWAT on token ring networks must be configured to reboot
after a system crash because the token ring card will continue to intercept packets
for its hardware address until the node starts to reboot.
[Figure: HWAT example on a token-ring network, before and after the resource group is
started. Before: bondar has bondar-if1 (tr0) 9.47.9.1 / 00:04:ac:48:22:f4 and
bondar-if2 (tr1) 9.47.5.3 / 00:04:ac:62:72:49; hudson has hudson-if1 (tr0) 9.47.9.2 /
00:04:ac:62:72:61 and hudson-if2 (tr1) 9.47.5.2 / 00:04:ac:48:22:f6. All netmasks are
255.255.255.0. After: bondar-if2 has been replaced by the service label xweb 9.47.5.1
with the locally administered address 40:04:ac:62:72:49.]
Notes:
Hardware address takeover Boot time
At boot time, the interfaces are assigned their normal hardware addresses.
[Figure: HWAT and failures. Interface failure: the service label xweb (9.47.5.1, LAA
40:04:ac:62:72:49) and its LAA move together to another available interface. Node
failure: xweb and its LAA move to an interface on the takeover node, replacing that
NIC's address.]
Notes:
HWAT: interface or node failure
If a NIC (with a service IP address that has an LAA) fails, HACMP moves the IP
address to a NIC on the takeover node. It also moves the LAA (alternative hardware
address) to the same NIC.
If a node fails, the service IP address, and its associated LAA, are moved to another
node.
The result, in both of these cases, is that the local clients' ARP caches are still up to
date because the HW address associated with the IP address has not changed.
[Figure: HWAT and node recovery. While bondar is down, hudson holds xweb (9.47.5.1,
LAA 40:04:ac:62:72:49) alongside hudson-if2 9.47.5.2 / 00:04:ac:48:22:f6. After HACMP
is started on the recovered node, the node reintegrates according to its resource
group parameters, and xweb (with its LAA) returns to bondar; the other interfaces keep
their factory addresses: bondar-if1 9.47.9.1 / 00:04:ac:48:22:f4 and hudson-if1
9.47.9.2 / 00:04:ac:62:72:61.]
Notes:
HWAT: node recovery
When the failed node reboots, AIX must be configured to leave the network card's
factory-defined hardware address in place. If AIX is configured to set the network card's
HW address to the alternate hardware address at boot time, then two NICs on the same
network have the same hardware address (weird things happen when you do this).
Notes:
Hardware address takeover
In this scenario, we will implement HWAT to support the new computers discussed in
the visual.
Just imagine how much money they have saved once they realize that these new
computers don't do what the summer students need done!
In the meantime, it looks like we need to implement hardware address takeover to
support these FOOL-97Xs.
Reality check
A side note is probably in order: although most TCP/IP-capable systems respect
gratuitous ARP, there are strange devices out there that do not. This scenario is phony
but it presents a real if rather unlikely problem. For example, the ATM network does not
support gratuitous ARP and so could be a candidate for the use of HWAT.
Notes:
Implementing HWAT
To use HWAT, we must use IPAT via replacement.
Stopping HACMP
# smit clstop
Stop Cluster Services
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Stop now, on system restart or both                     now              +
  Stop Cluster Services on these nodes                    [bondar,hudson]  +
  BROADCAST cluster shutdown?                             true             +
* Shutdown mode                                           graceful         +

F1=Help     F2=Refresh    F3=Cancel    F4=List
F5=Reset    F6=Command    F7=Edit      F8=Image
F9=Shell    F10=Exit      Enter=Do
Notes:
Stop HACMP
Make sure that HACMP is shut down gracefully, as we can't have the application
running while we are changing service IP addresses.
+--------------------------------------------------------------------------+
|                                                                          |
|   xweb                                                                   |
|   yweb                                                                   |
|   zweb                                                                   |
|                                                                          |
| F1=Help        F2=Refresh      F3=Cancel                                 |
| F7=Select      F8=Image        F10=Exit                                  |
| Enter=Do       /=Find          n=Find Next                               |
+--------------------------------------------------------------------------+
Notes:
Remove any service labels configured for IPAT via aliasing
An attempt to convert the network to IPAT via IP replacement fails if there are any
service IP labels that don't conform to the IPAT via IP replacement rules.
                                                        [Entry Fields]
* Network Name                                            net_ether_01
  New Network Name                                        []
* Network Type                                            [ether]          +
* Netmask                                                 [255.255.255.0]  +
* Enable IP Address Takeover via IP Aliases               [No]             +
  IP Address Offset for Heartbeating over IP Aliases      []

F1=Help     F2=Refresh    F3=Cancel    F4=List
F5=Reset    F6=Command    F7=Edit      F8=Image
F9=Shell    F10=Exit      Enter=Do
Notes:
Introduction
Here we change the net_ether_01 network to disable IPAT via aliasing.
[Slide: /etc/hosts entries for the cluster: bondar with bondar-if1 and bondar-if2,
hudson with hudson-if1 and hudson-if2, and the service labels xweb (192.168.15.70)
and yweb.]
Notes:
IPAT via replacement rules
Remember the rules for IP addresses for IPAT via IP replacement networks (slightly
reworded):
a. The service IP labels must all be on the same subnet.
b. There must be one NIC on each host that has an IP address on the same subnet as
the service IP labels (in HACMP 4.x terminology, these NICs are boot adapters).
c. The other NICs on each node must each be in a different subnet than the service IP
labels (in HACMP 4.x terminology, these NICs are standby adapters).
In a cluster with only two NICs per node, NIC IP addresses that conform to the IPAT via
IP aliasing rules also conform to the IPAT via replacement rules, so only the service IP
labels need to be changed.
Creating a locally administered address (LAA)
Each service IP label using HWAT will need an LAA
The LAA must be unique on the cluster's physical network
The MAC address based technologies (Ethernet, Token ring and
FDDI) use six byte hardware addresses of the form:
xx.xx.xx.xx.xx.xx
The factory-set MAC address of the NIC will start with 0, 1, 2 or 3
A MAC address that starts with 0, 1, 2 or 3 is called a Globally Administered Address
(GAA) because it is assigned to the NIC's vendor by a central authority
Notes:
Hardware addresses
Hardware addresses must be unique, at a minimum, on the local network to which they
are connected. The factory set hardware address for each network interface card (NIC)
is administered by a central authority and should be unique in the world. These
addresses are called Globally Administered Addresses (GAAs).
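One simple convention for creating an LAA, and the one this course's examples follow (00:04:ac:62:72:49 becomes 40:04:ac:62:72:49), is to keep the factory GAA but move its first hex digit from the 0-3 range into the 4-7 range. A rough sketch, illustrative only (the variable names are ours, and it assumes the first digit is numeric, as it is for a GAA):

```shell
# Derive an LAA from a factory GAA by adding 4 to the first hex digit,
# following the convention used in this course's examples.
gaa="00:04:ac:62:72:49"                  # factory (globally administered) address
first=$(printf '%s' "$gaa" | cut -c1)    # first hex digit, 0-3 for a GAA
rest=$(printf '%s' "$gaa" | cut -c2-)    # everything after the first digit
laa="$((first + 4))$rest"
echo "$laa"                              # 40:04:ac:62:72:49
```

Whatever convention you use, remember that the resulting LAA must still be unique on the cluster's physical network.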
Notes:
Issues
The main thing to remember is that you do NOT configure the ALTERNATE hardware
address field in the SMIT devices panel.
You must leave that blank and configure this using the SMIT HACMP menus.
                                                        [Entry Fields]
* IP Label/Address                                        [xweb]           +
* Network Name                                            net_ether_01
  Alternate HW Address to accompany IP Label/Address      [4004ac171964]

F1=Help     F2=Refresh    F3=Cancel    F4=List
F5=Reset    F6=Command    F7=Edit      F8=Image
F9=Shell    F10=Exit      Enter=Do
Don't forget to specify the second LAA for the second service IP label.
Notes:
Redefining the service IP labels
Define each of the service IP labels making sure to specify a different LAA address for
each one.
The Alternate HW Address to accompany IP Label/Address is specified as a series
of hexadecimal digits without intervening periods or any other punctuation.
If IPAT via IP replacement is specified for the network, which it is in this case, you get an
error or a warning from this screen if you try to define service IP labels which do not
conform to the rules for service IP labels on IPAT via IP replacement networks.
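Since SMIT wants the alternate hardware address as bare hex digits, a colon- or period-separated LAA must have its punctuation stripped before being entered. For example (an illustrative one-liner, not an HACMP command):

```shell
# Convert a separated LAA into the bare-digit form that the
# "Alternate HW Address to accompany IP Label/Address" field expects.
laa="40:04:ac:17:19:64"
smit_form=$(printf '%s' "$laa" | tr -d ':.')
echo "$smit_form"    # 4004ac171964
```

This produces exactly the value shown in the SMIT screen above.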
Notes:
Synchronize
Don't forget to synchronize.
Checkpoint
1. For IPAT via replacement (select all that apply)
a. Each service IP address must be in the same subnet as one of the
non-service addresses
b. Each service IP address must be in the same subnet
c. Each service IP address cannot be in any non-service address subnet
2.
True or False?
If the takeover node is not the home node for the resource group and
the resource group does not have a Startup policy of Online Using
Distribution Policy, the service IP address replaces the IP address of a
NIC with an IP address in the same subnet as the subnet of the
service IP address.
3.
True or False?
In order to use HWAT, you must enable and complete the
ALTERNATE ETHERNET address field in the SMIT devices menu.
4.
True or False?
You must stop the cluster in order to change from IPAT via aliasing to
IPAT via replacement.
Notes:
Unit summary
Key points from this unit:
IPAT via IP replacement:
May require fewer subnets than IPAT via aliasing
May require more NICs than IPAT via aliasing
Supports hardware address takeover
References
SC23-5209-01 HACMP for AIX, Version 5.4.1: Installation Guide
SC23-4864-10 HACMP for AIX, Version 5.4.1:
Concepts and Facilities Guide
SC23-4861-10 HACMP for AIX, Version 5.4.1: Planning Guide
SC23-4862-10 HACMP for AIX, Version 5.4.1: Administration Guide
SC23-5177-04 HACMP for AIX, Version 5.4.1: Troubleshooting Guide
SC23-4867-09 HACMP for AIX, Version 5.4.1: Master Glossary
http://www-03.ibm.com/systems/p/library/hacmp_docs.html
HACMP manuals
Unit objectives
After completing this unit, you should be able to:
Perform the steps necessary to configure Target Mode SSA
Notes:
Target mode SSA or heartbeat over disk networks
Sadly, the premise behind this scenario is all too real. The problem with RS232 non-IP networks is that if they become disconnected or otherwise disabled, it is entirely possible that nobody notices, even though HACMP logs the failure of the connection when it happens and reports it in the logs if the network is down at HACMP startup time. In contrast, a target mode SSA network or heartbeat on disk network won't fail until all paths between the two nodes fail. Since such a failure causes one or both nodes to lose access to some or all of the shared disks, it is much less likely to go unnoticed. We focus on SSA in this scenario because heartbeat over disk was discussed earlier in the course.
Use the smitty ssaa fastpath to get to AIX's SSA Adapters menu.
Notes:
Required software
Target mode SSA support requires that the devices.ssa.tm.rte fileset be installed on all cluster nodes.
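On AIX, the usual check is lslpp -l devices.ssa.tm.rte on each node, looking for a COMMITTED (or APPLIED) entry. The sketch below parses a captured sample line rather than calling lslpp directly; the sample text is an illustrative assumption about the output format, not output from a real node.

```shell
# Minimal sketch: confirm devices.ssa.tm.rte appears installed, given
# lslpp-style output. On a real cluster node you would pipe in:
#   lslpp -l devices.ssa.tm.rte
sample='  devices.ssa.tm.rte  5.3.0.0  COMMITTED  Target Mode SSA Support'
if printf '%s\n' "$sample" | grep -q 'devices\.ssa\.tm\.rte.*COMMITTED'; then
  echo "fileset installed"
else
  echo "fileset missing: install devices.ssa.tm.rte on every cluster node"
fi
```

Repeating the check on every node (for example, in a loop over the node names) avoids the common mistake of installing the fileset on only one side of the cluster.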
bondar now also knows that hudson supports tmssa and has created the tmssa devices (/dev/tmssa2.im and /dev/tmssa2.tm) which refer to hudson.
Notes:
Introduction
Once each node has a unique SSA node number, the AIX configuration manager must be used to define the tmssa devices. Each node must have tmssa devices that refer to each of the other nodes it can see via the SSA loops. When cfgmgr is run on a node, it sets up the node to accept tmssa packets, and then defines tmssa devices referring to any other nodes that respond to tmssa packets. For this to work, the other nodes must all already be set up to accept and respond to tmssa packets.
Procedure
The following procedure gets all the required tmssa devices defined:
1. Run cfgmgr on each cluster node in turn. This sets up each node to handle tmssa packets, and defines tmssa devices on each node referring to nodes that have already been set up for tmssa.
2. Run cfgmgr on each node in turn again. (Depending on the order you use, it is actually possible to skip running cfgmgr on one of the nodes, but it is probably not worth the trouble of being sure that the last cfgmgr run wasn't required.)
3. Verify that the tmssar device exists. Run
# lsdev -C | grep tmssa
on each node. There should be a tmssar device (which is actually a target mode SSA router acting as a pseudo-device) configured on each node.
4. Verify that the tmssa devices exist. Run
# ls /dev/tmssa*
on each node. Each node should have target mode SSA devices called /dev/tmssa#.im and /dev/tmssa#.tm, where # is the other node's node number.
5. Test the target mode connection. Enter the following command on the node with ID 1 (make sure you specify the tm suffix and not the im suffix):
# cat < /dev/tmssa2.tm
(This command should hang.)
On the node with ID 2, enter the following command (make sure you specify the im suffix and not the tm suffix):
# cat /etc/hosts > /dev/tmssa1.im
(The /etc/hosts file should be displayed on the first node.)
This validates that the target mode serial network is functional. Note that any text file may be substituted for /etc/hosts, and you must specify different tmssa device names if you configured different SSA node numbers for each node; this is simply an example.
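The naming convention in step 4 means each /dev/tmssa#.im or /dev/tmssa#.tm name encodes the SSA node number of the peer it refers to, which can be extracted with plain shell parameter expansion. In this sketch the device list is simulated; on a real node you would populate it from ls /dev/tmssa* instead.

```shell
# Extract the peer SSA node number from tmssa device names.
# Simulated list; on a cluster node: devs=$(ls /dev/tmssa*)
devs="/dev/tmssa2.im /dev/tmssa2.tm"
for d in $devs; do
  base=${d##*/}       # strip directory: tmssa2.im
  num=${base#tmssa}   # strip prefix:    2.im
  num=${num%%.*}      # strip suffix:    2
  echo "$d refers to SSA node $num"
done
```

Running this on bondar with the devices shown above prints that both devices refer to SSA node 2, that is, to hudson.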
Extended Configuration

Move cursor to desired item and press Enter.

  Discover
  Extended
  Extended
  Extended
  Extended
  Security
  Snapshot
Notes:
HACMP discovery
By discovering the new devices, they appear in SMIT pick lists when we configure the tmssa non-IP network. Strictly speaking, rerunning HACMP discovery is not necessary, as it is possible to configure tmssa networks by entering the tmssa device names explicitly. Because that is a rather error-prone process, it is probably best to let the HACMP discovery mechanism find the devices for us.
Notes:
Defining a non-IP tmssa network
The procedure for defining a non-IP tmssa network is essentially identical to the procedure used earlier to define the non-IP RS232 network.
Select a category

  Communication Interfaces
  Communication Devices
Move cursor to desired item and press F7. Use arrow keys to scroll.

  # Node     Device    Device Path    Pvid
  > hudson   tmssa1    /dev/tmssa1
  > bondar   tmssa2    /dev/tmssa2
    bondar   tty0      /dev/tty0
    hudson   tty0      /dev/tty0
    bondar   tty1      /dev/tty1
    hudson   tty1      /dev/tty1
Notes:
Final step
Select the tmssa devices on each node and press Enter to define the network.
Refer to Chapter 13 of the HACMP v5.3 Planning and Installation Guide for information
on configuring all supported types of non-IP networks.
Unit summary
Key points from this unit:
This unit showed the steps necessary to configure Target Mode SSA.