
IBM Americas Advanced Technical Support

DB2 and SAP Disaster Recovery using DS8300 Global Mirror

IBM Solutions Advanced Technical Support


Nasima Ahmad
Chris Eisenmann
Jean-Luc Degrenand
Mark Gordon
Mark Keimig
Damir Rubic

Version: 1.1
Date: December 8, 2009
Copyright IBM Corporation, 2009

1. Introduction
2. Trademarks
3. Acknowledgements
4. Feedback
5. Versions
6. Why configure HADR with Global Mirror
7. Configuration
8. One Time Preparation
    8.1. DR LPAR
9. VIO Servers Note
10. Preparation for DR Test
11. DR Test Execution
    11.1. Detach Practice Copy LUNs from VIOS
    11.2. Copy Global Mirror targets to practice copy volumes
    11.3. Attach Practice Copy LUNs to VIOS
    11.4. Configure the DR LPAR for DB2 and SAP restart
    11.5. DB2 Restart
    11.6. SAP Startup


1. Introduction
This paper documents a DR demonstration executed in a PoC for an IBM customer. The configuration we used was based on the customer's requirements:
- DR must use a disk-based data copy process, not an application or DB data copy process
- The production site has a DB2 HADR cluster
- The DR site is mirrored from one of the DB2 HADR databases, and the process must work whether the mirrored DB is currently the HADR Primary or Standby
- HA is automated, and DR is a declared event
- End-users must log in to the DR SAP system using the same SAPGUI configuration as used for the production SAP system
- An order entry transaction workload is running at the time of the simulated disaster
- RPO and RTO objectives

and also on the following project constraints:
- DR configuration and testing must be done quickly, without impacting other PoC demonstrations
- The two DS8300 disk systems for the Global Mirror source and target are in the same data center
- The production and DR LPARs run in the same CECs and share VIO servers

The configuration shown in this test is not offered as a best practice for DR in general, but was chosen to demonstrate DR using DS8300 Global Mirror with the configuration requirements and constraints listed above.

2. Trademarks
DB2, AIX, DS8000, Tivoli, and TotalStorage are registered trademarks of IBM Corporation.
SAP is a registered trademark of SAP AG in Germany and in other countries.

3. Acknowledgements
Thank you to Dale McInnis and Liwen Yeow for their guidance on configuring DR with DB2 HADR.

4. Feedback
Please send comments or suggestions to Mark Gordon ([email protected]).

5. Versions

Version 1.0: original version.
Version 1.1: added Sections 4, 5, and 6.

6. Why configure HADR with Global Mirror


IBM performed a proof of concept (PoC) project to demonstrate combining DB2 HADR with IBM Global Mirror. This configuration offers both a local high availability solution (DB2 HADR), which can be configured to mirror committed transactions from a primary DB to a standby DB, and a DR solution that can mirror the productive DB to a remote location.


Recovery with HADR can be automated with Tivoli System Automation for Multiplatforms (SA MP) to enable a recovery time of seconds. Using the HADR SYNC or NEARSYNC modes for synchronous propagation of committed changes to the standby, the HADR standby system can take over with no loss of committed data in the event of a software or hardware failure on the primary system. In addition, HADR can be used to allow concurrent software maintenance on one DB2 server while the other continues to support the productive workload. SAP application servers can be configured to reconnect automatically after a failure or planned DB change, and login sessions are not disconnected by a takeover.
Global Mirror asynchronously copies data from one DS8000 storage system to another, and automatically
creates consistency groups (where all target volumes are consistent at the same point in time) many times
per hour. These consistent sets of volumes can be used to restart the workload at the remote site in case
of a disaster. Since the consistency group is a point-in-time consistent copy of a running system, the DR
system is started by performing a crash recovery to roll back uncommitted changes. Recovery with Global Mirror can be automated, but it was not automated in our demonstration, due to customer requirements.
Together, HADR and Global Mirror support exceptionally high application availability. The application is protected (with very fast recovery) from hardware and software failures on the database server; HADR offers the opportunity to do concurrent maintenance (such as AIX or DB2 software fixes) while the system remains available to end-users; and Global Mirror enables recovery at a remote site in case of a disaster that prevents the workload from running at the production site.

7. Configuration
Figure 1 summarizes the HADR and DR configuration used in the test.

Figure 1: DR architecture with Global Mirror from DB in HADR cluster

Global Mirror with Practice Copy was configured to mirror the LUNs of a DB in an HADR cluster (hadrdba) to LUNs on a second DS8300 system. In addition, the SAP filesystems (/sapmnt, /usr/sap/<SID>/) were mirrored via Global Mirror with Practice Copy from the application server hadrappsrvra to LUNs on the second DS8300 system. All mirrored LUNs were in the same consistency group.
In our normal HADR configuration, hadrdba was the standby. We used NEARSYNC mode for HADR. Running Global Mirror from the standby reduces the bandwidth requirement for the Global Mirror session and removes any potential performance impact of Global Mirror on the production (Primary) DB. But since the primary role can move back and forth in an HADR cluster, we tested recovery both when hadrdba had been a standby HADR DB and when it had been a primary HADR DB.
The process in this document takes a FlashCopy while the HADR DBs are up, so that the Global Mirror target copy is an in-flight copy that looks to DB2, when it starts on the DR LPAR, like a crashed system.
In a real disaster, after the disaster had been declared, TPC-R would be used to suspend the global mirror
relationship for the target volumes, and then TPC-R would do a recover operation to copy the global
mirror target volumes to the practice volumes.
For our tests, we did not suspend the global mirror relationship, in order to preserve the mirror
synchronization between Production and global mirror target LUNs.

Figure 2: HADR cluster with SAP nodes

As shown in Figure 2, the SAP ASCS ran as a two-node Tivoli SA MP cluster (ASCS with enqueue replication) on hadrascsa and hadrascsb. The SAP central instance ran on hadrappsrvra. NFS file services were also provided by hadrappsrvra for the SAP shared filesystems and the DB2 shared archive log filesystem. Tivoli SA MP controlled the DB2 HADR cluster on hadrdba and hadrdbb.


As shown in Figure 3, the HADR and DR LPARs shared the same two CECs. Since each of the LPARs is functionally independent, this was suitable for our PoC, but of course in a real DR configuration the DR LPARs and DR storage systems would be in a remote location.

Figure 3: LPAR and CEC configuration

8. One Time Preparation


As mentioned previously, Global Mirror with Practice Copy was configured for all the LUNs attached to hadrdba and hadrappsrvra.
We used a local rootvg in the DR LPAR, rather than using the mirrored copy of rootvg from hadrdba. The advantages of this configuration are:
- One can boot the DR LPAR at any time.
- The DR LPAR has its own IP address and hostname.
- SA MP does not start automatically (as it would if we used a copy of the hadrdba LPAR, since SA MP started automatically on boot on the production DB servers).
- We kept a clean copy of rootvg and copied it for each DR test, so we had a known starting point.


The disadvantages of a local rootvg in the DR LPAR are:
- It is necessary to copy configuration files and user configuration from rootvg on the HADR DB servers to prepare the DR environment for testing.
- It is necessary to manually keep the AIX software in sync between the production HADR cluster and the DR LPAR.
In order to maintain a consistent copy of the database at the Global Mirror target location, there are DB2 configuration and DS8300 software requirements. Please see the DB2 documentation for the latest recommendations on DB2 configuration when using Global Mirror.

8.1. DR LPAR

As part of the preparation of the local rootvg in the DR LPAR, several steps were needed to copy configuration over from the production HADR cluster (an illustrative sketch of these commands follows the list):
- Create the sap and db2 users on the DR LPAR with the same UID and GID as on hadrdba/hadrdbb.
- Set passwords for the users, and remove the ADMCHG flag for the users from /etc/security/passwd.
- Copy the home directories for the sap and db2 users to the DR system.
- Copy /etc/services from hadrdba.
- Create the script import_for_dr.sh (see Figure 21), which uses PVIDs to import the volume groups needed for DB2 and SAP on DR.
- Create fsck.sh, which performs fsck on all imported filesystems before they are mounted.
- Create ifconfig_aliases.sh to set alias IP addresses for the source systems on the DR Ethernet interface.
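None of these preparation commands are shown in a figure. The following is a minimal sketch of the idea, assuming illustrative user names (db2prd, prdadm), group names, UIDs/GIDs, and home directories; the real values must match those on hadrdba/hadrdbb.

#!/bin/ksh
# Hypothetical sketch of the one-time DR LPAR user/configuration preparation.
# User names, group names, UIDs, GIDs, and paths are illustrative only.

# Create DB2 and SAP groups/users with the same IDs as on hadrdba/hadrdbb
mkgroup id=301 db2grp1
mkgroup id=302 sapsys
mkuser id=401 pgrp=db2grp1 home=/db2/db2prd db2prd
mkuser id=402 pgrp=sapsys  home=/home/prdadm prdadm

# Set passwords, then clear the ADMCHG flag so no password change is forced at first login
passwd db2prd
passwd prdadm
pwdadm -c db2prd
pwdadm -c prdadm

# Copy home directories and /etc/services from the production cluster
scp -rp hadrdba:/db2/db2prd   /db2/
scp -rp hadrdba:/home/prdadm  /home/
scp -p  hadrdba:/etc/services /etc/services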

9. VIO Servers Note


With our test configuration, where the VIO servers are shared by the DR LPARs and other LPARs and where FlashCopy is used repeatedly in testing the DR LPAR, we needed a process to ensure that there was no cached data in the VIOS that was out of sync with the newly flashed data on the DR LPAR LUNs. If we had either 1) used a VIOS dedicated to the DR LPAR, or 2) not used a VIOS, then we would not have needed the process described in Section 11 to detach and attach the DR LUNs to the VIOS. If using a VIOS dedicated to DR, we could have booted the dedicated VIOS after the FlashCopy to the DR volumes. Or, if using LUNs directly attached to the DR LPAR, we would have booted the DR LPAR after the FlashCopy to the DR volumes.

10. Preparation for DR Test


Before each test of the DR process, the boot disk of the DR LPAR is refreshed: hdisk0 is the reference copy, which is copied to hdisk1 for each test (an illustrative sketch of these steps follows the list).
- Boot off hdisk0 (bootlist -m normal hdisk0; reboot).
- exportvg altinst_rootvg (get rid of altinst_rootvg on hdisk1).
- Use smitty alt_clone to clone hdisk0 to hdisk1.
- Check that hdisk1 is now the boot disk (bootlist -m normal -o).
- Reboot the DR LPAR to boot off hdisk1.
- Verify that the HADR cluster, the SA MP cluster for ASCS, and the SAP Central Instance are operating.
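As a command-line sketch of this refresh sequence (the text uses smitty alt_clone; alt_disk_copy is assumed here as its command-line equivalent, and the disk names are as described above):

#!/bin/ksh
# Hypothetical sketch of the DR LPAR boot-disk refresh.
bootlist -m normal hdisk0      # boot from the reference rootvg on the next restart
shutdown -Fr                   # reboot now

# After the reboot from hdisk0:
exportvg altinst_rootvg        # remove the stale altinst_rootvg definition for hdisk1
alt_disk_copy -d hdisk1        # clone the running rootvg to hdisk1 (smitty alt_clone equivalent)
bootlist -m normal -o          # confirm hdisk1 is now first in the boot list
shutdown -Fr                   # reboot the DR LPAR from hdisk1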


Since DR of an active SAP system was a requirement, we start a LoadRunner workload running 2500 sales orders per hour that executes during the test. We also start a SAPGUI session, to verify that login sessions are not terminated while the application server reconnects to the Primary DB server.


11. DR Test Execution


11.1. Detach Practice Copy LUNs from VIOS


Make sure that the DR LPAR is not activated on p570A.
Log in to both VIO servers on p570A (viosa1 and viosa2). You will need to be on the system as root:
login: padmin
$ oem_setup_env    <-- this gets you to root
Change to the directory where the scripts are located and execute the script to remove the disks. Make sure you do this on both VIO servers.
# cd /poc_directory
# ./drdiskremoval.ksh    <-- detaches the DR disks from the VIO servers (see Figure 4 below)
Once the disks are removed on both servers, you are ready to do the FlashCopy of the disks.

Figure 4: drdiskremoval.sh
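Figure 4 is a screen shot that is not reproduced here. A minimal sketch of what such a detach script might look like, assuming vSCSI mappings with illustrative virtual target device (vtscsi) and hdisk names:

#!/bin/ksh
# Hypothetical sketch of drdiskremoval.ksh, run as root on each VIOS.
# The vtscsi and hdisk names are illustrative; the real script lists the
# actual devices backing the DR practice copy LUNs.

# Remove the virtual target devices that map the DR LUNs to the DR LPAR
for vtd in vtscsi10 vtscsi11 vtscsi12 vtscsi13; do
    rmdev -dl $vtd
done

# Remove the backing hdisk definitions so no stale cached data survives the next FlashCopy
for d in hdisk40 hdisk41 hdisk42 hdisk43; do
    rmdev -dl $d
done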


11.2. Copy Global Mirror targets to practice copy volumes

Again, this is a test process, used in order not to suspend the Global Mirror session between the source and target volumes.

Log in to the IBM TotalStorage Productivity Center for Replication (TPC-R) at the following address:
o https://tpcr.ip.add.ress:3443/CSM
o Login: Administrator
o If the logon was successful, you should be presented with the following screen, where the Sessions link should be selected.

Figure 5: TPC-R flashcopy preparation one


Select the CoGM Session by clicking the radio button, as shown in Figure 6.

Figure 6: TPC-R flashcopy preparation two


After selection, choose Flash activity from the pull-down menu as shown in Figure 7.

Copyright IBM Corporation, 2009


Page 10

IBM Americas Advanced Technical Support

Figure 7: TPC-R flashcopy preparation three

Figure 8: TPC-R flashcopy preparation four

Figure 9 shows how to initiate the FlashCopy process between I2 (Target volumes) and H2 (Practice volumes). Click Yes to continue.


Figure 9: TPC-R flashcopy preparation five


The FlashCopy process phases are shown in the following snapshots (Flashing -> Preparing -> Prepared).

Figure 10: TPC-R flashcopy preparation six

In Figure 10, the FlashCopy is in the Flashing state.


Figure 11: TPC-R flashcopy preparation seven

Figure 11 shows the next state, Preparing, and Figure 12 shows the Prepared state.

Figure 12: TPC-R flashcopy preparation eight



A successful FlashCopy can be confirmed as shown in the following snapshots.

Figure 13: TPC-R flashcopy check one

Figure 14: TPC-R flashcopy check two

The timestamp on the H2 <- I2 role pair is updated when the flash is done.


Figure 15: TPC-R flashcopy check three


If we check the volume pairs in H2<-I2, we see that they are all in the Target Available state. Once the process is finished, log out of TPC-R.

11.3. Attach Practice Copy LUNs to VIOS


After the FlashCopy has completed, you need to add the disks back to both VIO servers BEFORE you can bring up the DR LPAR.
Log in to both VIO servers on p570A (viosa1 and viosa2). You will need to be on the system as root:
login: padmin
$ oem_setup_env    <-- this gets you to root
BEFORE you can run the drdiskadd.ksh script below, you must run the configuration manager twice to get the disks back on the VIO server. Check to be sure you have all four paths for the disks. Make sure you do this on both VIO servers.
o # cfgmgr
o # cfgmgr
o # pcmpath query device    <-- you only need to check, at the end of the query output, that there are four paths.

Figure 16: pcmpath query
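Rather than scanning the pcmpath output by eye, a small check like the following could be used to confirm that each re-discovered DR disk reports four enabled paths (a sketch; the hdisk names are illustrative):

#!/bin/ksh
# Hypothetical path check after running cfgmgr twice; hdisk names are illustrative.
for d in hdisk40 hdisk41 hdisk42 hdisk43; do
    n=$(lspath -l $d | grep -c Enabled)
    echo "$d: $n enabled paths"
    [ "$n" -eq 4 ] || echo "WARNING: $d does not have 4 enabled paths" >&2
done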


Change to the directory where the scripts are located and execute the script to add the disks. Make sure you do this on both VIO servers.
# cd /poc_directory
# ./drdiskadd.ksh (shown in Figure 17)

Figure 17: drdiskadd.sh
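Figure 17 is a screen shot that is not reproduced here. A minimal sketch of what the re-attach script might look like, assuming the VIOS mkvdev command (run through ioscli from the root shell) is used to recreate the vSCSI mappings; hdisk, vhost, and vtd names are illustrative:

#!/bin/ksh
# Hypothetical sketch of drdiskadd.ksh, run as root on each VIOS after oem_setup_env.
IOSCLI=/usr/ios/cli/ioscli

# Recreate the virtual target devices mapping the practice copy LUNs to the DR LPAR
$IOSCLI mkvdev -vdev hdisk40 -vadapter vhost4 -dev vtscsi10
$IOSCLI mkvdev -vdev hdisk41 -vadapter vhost4 -dev vtscsi11
$IOSCLI mkvdev -vdev hdisk42 -vadapter vhost5 -dev vtscsi12
$IOSCLI mkvdev -vdev hdisk43 -vadapter vhost5 -dev vtscsi13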

BEFORE you bring up the DR LPAR, since we are simulating a disaster and will reuse the IP addresses of the production SAP and HADR systems in the DR LPAR, you must shut down all the other HADR LPARs (that is, the LPARs for the source HADR DB2 SAP system) on p570A. Shut down the production LPARs for the HADR primary, standby, ASCS, and SAP application server: hadrdba, hadrappsrvra, and hadrascs. From the HMC, get a console window for each and perform a graceful shutdown. hadrdba needs to go down first, then hadrascs, because they are NFS clients of hadrappsrvra.
# shutdown -F


11.4. Configure the DR LPAR for DB2 and SAP restart

After booting the DR LPAR, use lspv to display the disks. Note that initially none of the PVs copied from the source DB2 and SAP systems are in VGs, since those VGs are not configured in the baseline rootvg of the DR LPAR.

Figure 18: initial status of hdisks on DR LPAR

On the Global Mirror source system (hadrdba), we determined the VG associated with each PV. This information was used to create import_for_dr.sh, shown in Figure 21, which imports the volume groups so that the same volume group names are used on the DR system as on the source system, hadrdba.

Figure 19: hadrdba lspv



Likewise, we determined the PVIDs for the volume groups on the SAP application server, hadrappsrvra, since we will also import the sapvg and archvg VGs on the practice copy volumes and start the application server instance on the DR LPAR.

Figure 20: hadrappsrvra lspv

Figure 21 shows the script run on the DR LPAR to import the volume groups.

Figure 21: import_for_dr.sh
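Figure 21 is a screen shot that is not reproduced here. The following is a minimal sketch of the approach, assuming illustrative PVIDs and VG names; the real script maps the actual PVIDs recorded on hadrdba and hadrappsrvra (Figures 19 and 20) to their VG names.

#!/bin/ksh
# Hypothetical sketch of import_for_dr.sh; PVIDs and VG names are illustrative.
# For each PVID recorded on the source systems, find the matching hdisk on the
# DR LPAR and import its VG under the original VG name.

set -A PVIDS   00c1234500000001 00c1234500000002 00c1234500000003
set -A VGNAMES db2datavg        db2logvg         sapvg

i=0
while [ $i -lt ${#PVIDS[*]} ]; do
    hdisk=$(lspv | awk -v p=${PVIDS[$i]} '$2 == p {print $1}')
    if [ -n "$hdisk" ]; then
        importvg -y ${VGNAMES[$i]} $hdisk    # keep the source VG name on the DR LPAR
    else
        echo "PVID ${PVIDS[$i]} for ${VGNAMES[$i]} not found" >&2
    fi
    i=$((i + 1))
done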



On the DR LPAR as root, import the VGs - execute import_for_dr.sh:

Figure 22: Execution of import_for_dr.sh

Note the message that loglv00 on hdisk27 was renamed. As part of testing this process, we checked the LVs defined on hdisk27, and the /etc/filesystems file, to ensure that there are no filesystems that reference loglv00 on hdisk27. This check is not shown here.



After the import, we display the PVs and volume groups with lspv. Since the DR LPAR does not use all the PVs copied from hadrdba and hadrappsrvra (for example, we don't import the source rootvgs), some PVs remain unconfigured in any VG.

Figure 23: lspv after import of DB2 VGs



Since the copy represents a crashed (in-flight) source system, we run fsck on the filesystems before mounting them, to be on the safe side. The script fsck.sh was created for this.

Figure 24: fsck.sh
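Figure 24 is not reproduced here. A minimal sketch, assuming an illustrative filesystem list (the real script covers all imported DB2 and SAP filesystems) and the /tmp/fsck.out log checked below:

#!/bin/ksh
# Hypothetical sketch of fsck.sh; the filesystem list is illustrative.
LOG=/tmp/fsck.out
> $LOG
for fs in /db2 /db2/PRD/log_dir /db2/PRD/db2prd /sapmnt/PRD /usr/sap/PRD; do
    echo "=== fsck $fs ===" >> $LOG
    fsck -y $fs >> $LOG 2>&1     # -y answers yes to any repair prompts
done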

After running fsck.sh, check the log that it created - /tmp/fsck.out.

Figure 25: Check fsck results

A few of the mount points in the VGDAs were not correct for the DR system configuration, so we changed them before mounting the filesystems. Figure 26 shows the script that makes these changes.

Figure 26: chfs.sh
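Figure 26 is not reproduced here. A minimal sketch of the idea, with purely illustrative old and new mount points; chfs -m <new mount point> <current filesystem> updates /etc/filesystems and the LV control block:

#!/bin/ksh
# Hypothetical sketch of chfs.sh; the mount points are illustrative.
chfs -m /usr/sap/PRD /usr/sap/PRD_appsrv       # move the app server filesystem to the DR mount point
chfs -m /db2/PRD/log_archive /log_archive      # move the archive log filesystem to the DR mount point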

Run chfs.sh. Then mount /usr/sap/PRD and create the subdirectory needed as a mount point:

Figure 27: mount /usr/sap/PRD



Manually mount the /db2 filesystem before the rest of the DB2 filesystems, in case importvg put it out of order in /etc/filesystems.

Figure 28: mount /db2 filesystem

Use mount -a to mount the rest of the filesystems. Note that the 0506-324 error message is normal for filesystems that are already mounted.

Figure 29: mount -a

Use the df command to check that all the necessary filesystems are mounted.

Figure 30: df to check all filesystems are ready

Since this system uses a local rootvg and has its own hostname, we change the hostname from DR to be the same as the copied system, hadrdba. We did this to restart DB2 without making additional DB2 changes. The IP address associated with hadrdba will be added as an alias in a subsequent step. Use smitty hostname to change the hostname.

Figure 31: smitty hostname
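For reference, a command-line equivalent of smitty hostname (a sketch; inet0 is the standard AIX TCP/IP device, and hadrdba is the hostname from the text):

#!/bin/ksh
# Set the hostname for the running system and persistently in the ODM
hostname hadrdba
chdev -l inet0 -a hostname=hadrdba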

This DR demonstration takes functions that ran on three different LPARs (ASCS, application server, and DB server) and consolidates them into one LPAR. In order for these three functions to run without additional configuration changes, we set aliases for all the LPARs on the Ethernet interface. We created a script, ifconfig_aliases.sh, to do this. The hostnames hadrdba, hadrdb, ascs, and hadrascsa were already defined in the hosts file on the DR LPAR.

Figure 32: ifconfig_aliases.sh
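Figure 32 is not reproduced here. A minimal sketch, with illustrative IP addresses and netmask; the real script adds the addresses of hadrdba, hadrdb, ascs, and hadrascsa to en0:

#!/bin/ksh
# Hypothetical sketch of ifconfig_aliases.sh; addresses and netmask are illustrative.
# Each alias lets a function that originally ran on its own LPAR keep its IP address.
ifconfig en0 10.1.1.11 netmask 255.255.255.0 alias    # hadrdba
ifconfig en0 10.1.1.12 netmask 255.255.255.0 alias    # hadrdb
ifconfig en0 10.1.1.13 netmask 255.255.255.0 alias    # ascs
ifconfig en0 10.1.1.14 netmask 255.255.255.0 alias    # hadrascsa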

Execute ifconfig_aliases.sh, and display the aliases on the interface.

Figure 33: display en0 aliases

11.5. DB2 Restart

We su to db2prd to perform the steps to start up DB2. Start DB2 with db2start. Since this is a copy of half of an HADR cluster, we get the SQL5043N message, which is normal here.

Figure 34: db2start with SQL5043N message



After starting DB2, use db2haicu -delete to remove the SA MP resources and cluster configuration.

Figure 35: db2haicu -delete

Stop and start DB2. Note that the SQL5043N message is now gone.

Figure 36: restart DB2 after db2haicu -delete

Since hadrdba might be in either the Primary or Standby role at the time of the failure, and there are different restart actions depending on the role, we check the DB configuration for the PRD DB.

Figure 37: Check Role of DB and if Rollforward pending

In this example, hadrdba was the standby DB, so this copy is in rollforward pending state.
Since we have a single DR DB, we stop HADR (db2 stop hadr on db PRD). Then verify that the role has changed to STANDARD.

Figure 38: DB2 stop hadr



Since the role on the source DB was Standby, and the standby DB was therefore in rollforward pending state, two additional commands are needed:
db2 rollforward db prd to end of logs
db2 rollforward db prd complete

Figure 39: db2 rollforward for standby copy

Now, activate DB PRD so we can start SAP.
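The DB2 figures in this section are screen shots that are not reproduced here. The following sketch consolidates the command sequence described above for the case where the copied DB had been the HADR standby (database name prd, as in the text), run as the DB2 instance owner:

#!/bin/ksh
# Sketch of the DR DB2 restart sequence for a copy taken from the HADR standby.
db2start                        # SQL5043N is expected while the old cluster configuration is present
db2haicu -delete                # remove the SA MP resources and cluster configuration
db2stop
db2start                        # SQL5043N no longer appears

db2 get db cfg for prd | grep -i "HADR database role"   # confirm Primary or Standby role
db2 stop hadr on db prd         # role becomes STANDARD

# Only needed when the source copy had been the standby (rollforward pending):
db2 rollforward db prd to end of logs
db2 rollforward db prd complete

db2 activate db prd             # make the DB available before starting SAP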

11.6. SAP Startup

The next steps are done with the <sid>adm (here prdadm) user ID on the DR LPAR (which now has the hadrdba hostname).
Start the ABAP Central Services instance with startsap ascs. Note that ascs is the alias of the IP address on which the ASCS runs. The ifconfig_aliases.sh script shown in Figure 32 added the ascs alias to the en0 interface.


Figure 40: startsap ascs

Next, start the SAP Application Server instance with startsap hadrappsrvra DVEBMGS00. This will
start the DVEBMGS00 instance which originally ran on hadrappsrvra. The ifconfig_aliases.sh script
shown in Figure 32 added the hadrappsrvra alias to en0.

Figure 41: startsap hadrdba DVEBMGS00



Finally, we can log in with SAPGUI using the same SAPLOGON configuration as originally used to log in to PRD, and the DR test is done.

Figure 42: SAPGUI login to DR system



Figure 1: DR architecture with Global Mirror from DB in HADR cluster
Figure 2: HADR cluster with SAP nodes
Figure 3: LPAR and CEC configuration
Figure 4: drdiskremoval.sh
Figure 5: TPC-R flashcopy preparation one
Figure 6: TPC-R flashcopy preparation two
Figure 7: TPC-R flashcopy preparation three
Figure 8: TPC-R flashcopy preparation four
Figure 9: TPC-R flashcopy preparation five
Figure 10: TPC-R flashcopy preparation six
Figure 11: TPC-R flashcopy preparation seven
Figure 12: TPC-R flashcopy preparation eight
Figure 13: TPC-R flashcopy check one
Figure 14: TPC-R flashcopy check two
Figure 15: TPC-R flashcopy check three
Figure 16: pcmpath query
Figure 17: drdiskadd.sh
Figure 18: initial status of hdisks on DR LPAR
Figure 19: hadrdba lspv
Figure 20: hadrappsrvra lspv
Figure 21: import_for_dr.sh
Figure 22: Execution of import_for_dr.sh
Figure 23: lspv after import of DB2 VGs
Figure 24: fsck.sh
Figure 25: Check fsck results
Figure 26: chfs.sh
Figure 27: mount /usr/sap/PRD
Figure 28: mount /db2 filesystem
Figure 29: mount -a
Figure 30: df to check all filesystems are ready
Figure 31: smitty hostname
Figure 32: ifconfig_aliases.sh
Figure 33: display en0 aliases
Figure 34: db2start with SQL5043N message
Figure 35: db2haicu -delete
Figure 36: restart DB2 after db2haicu -delete
Figure 37: Check Role of DB and if Rollforward pending
Figure 38: DB2 stop hadr
Figure 39: db2 rollforward for standby copy
Figure 40: startsap ascs
Figure 41: startsap hadrdba DVEBMGS00
Figure 42: SAPGUI login to DR system

