8. Information System Controls: 8.3 The Disaster Recovery Plan


Disaster Recovery Plan (DRP)
• A Disaster Recovery Plan (DRP) is a documented process or set of
procedures to recover and protect an information system in the
event of a disaster.
• Such a plan, ordinarily documented in written form, specifies
procedures an organization is to follow in the event of a disaster.
• It is a comprehensive statement of consistent actions to be taken
before, during and after a disaster.
• The disaster could be natural or man-made.
• Given organizations' increasing dependency on information
systems to run their operations, a DRP, sometimes erroneously
called a Continuity of Operations Plan (COOP), is increasingly
associated with the recovery of information systems data, assets,
and facilities.
Objectives of a DRP
• Organizations cannot always avoid disasters, but with careful 
planning the effects of a disaster can be minimized. 
• The objective of a DRP is 
– to minimize downtime
– to minimize data loss.
• The primary objective is to protect the organization in the event that 
ALL or part of its operations and/or computer services are rendered
unusable. 
• The Disaster Recovery Plan
– Minimizes the disruption of operations
– Ensures that there is some level of organizational stability
– Ensures an orderly recovery after a disaster.
• Downtime and data loss are measured in terms of two
concepts: the Recovery Time Objective (RTO) and the Recovery Point
Objective (RPO).
Recovery Time Objective (RTO)

• The RTO is the time within which a business
process must be restored, after a major
incident (MI) has occurred, in order to avoid
unacceptable consequences associated with a
break in business continuity.
Recovery Point Objective (RPO)
• The RPO is the maximum age of files that must be recovered
from backup storage for normal operations to
resume if a computer, system, or network goes down
as a result of a major incident (MI).
• The RPO is expressed backwards in time (that is, into 
the past) starting from the instant at which the MI 
occurs, and can be specified in seconds, minutes,
hours, or days.
• The RPO is the maximum acceptable amount of data
loss measured in time. 
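As a minimal sketch of how the two objectives might be checked after an incident: data loss is the gap between the last good backup and the MI, and downtime is the gap between the MI and restoration. The function names and the sample timeline are hypothetical and are not part of the original slides.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    """Data loss is measured backwards in time from the incident."""
    return incident - last_backup <= rpo


def meets_rto(incident: datetime, restored: datetime, rto: timedelta) -> bool:
    """Downtime is measured forwards in time from the incident."""
    return restored - incident <= rto


# Hypothetical timeline, for illustration only.
incident = datetime(2024, 3, 1, 14, 0)
last_backup = datetime(2024, 3, 1, 13, 30)  # 30 minutes before the incident
restored = datetime(2024, 3, 1, 19, 0)      # 5 hours after the incident

print(meets_rpo(last_backup, incident, rpo=timedelta(hours=1)))  # True
print(meets_rto(incident, restored, rto=timedelta(hours=4)))     # False
```

In this made-up timeline the backup is recent enough for a one-hour RPO, but the five-hour restoration misses a four-hour RTO.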
Benefits of a DRP

• As with any insurance plan, there are benefits to be
gained from drafting a disaster recovery
plan. Some of these benefits are:
1. Providing a sense of security
2. Minimizing risk of delays
3. Guaranteeing the reliability of standby systems
4. Providing a standard for testing the plan
5. Minimizing decision-making during a disaster
6. Reducing potential legal liabilities
7. Reducing unnecessary stress in the work environment
Types of plans
• There is no one right type of disaster recovery 
plan, nor is there a one-size-fits-all disaster 
recovery plan. 
• However, there are three basic strategies that 
feature in all disaster recovery plans: 
1. Preventive Measures
2. Detective Measures
3. Corrective Measures.
1. Preventive Measures

• Preventive measures will try to prevent a disaster from
occurring.
• These measures seek to identify and reduce risks.
• They are designed to mitigate or prevent an event from
happening.
• These measures may include
– keeping data backed up and off site,
– using surge protectors,
– installing generators,
– conducting routine inspections.
2. Detective Measures
• Detective measures are taken to discover the presence
of any unwanted events within the IT infrastructure. 
• Their aim is to detect unwanted events and uncover new
potential threats.
• These measures include
– installing fire alarms,
– using up-to-date antivirus software,
– holding employee training sessions,
– installing server and network monitoring software.
3. Corrective Measures
• Corrective measures aim to restore a
system after a disaster or otherwise unwanted
event takes place.
• These measures focus on fixing or restoring the
systems after a disaster.
• Corrective measures may include
– keeping critical documents in the Disaster Recovery
Plan,
– securing proper insurance policies after a "lessons
learned" brainstorming session.
Developing a DRP
• A disaster recovery plan must answer at least
three basic questions:
1. What is its objective and purpose?
2. Which people or teams will be responsible
if a disruption happens?
3. What will these people do (the procedures to be
followed) when the disaster strikes?
Developing a DRP (2)
• The entire process involved in developing a Disaster Recovery 
Plan consists of 10 steps:
1. Obtaining top management commitment
2. Establishing a planning committee
3. Performing a risk assessment
4. Establishing priorities for processing and operations
5. Determining recovery strategies
6. Collecting data
7. Organizing and documenting a written plan
8. Developing testing criteria and procedures
9. Testing the plan
10. Obtaining plan approval
1. Obtaining Top Management Commitment

• For a DRP to be successful, the central responsibility for
the plan must reside with top management.
• Management is responsible for coordinating the DRP
and ensuring its effectiveness within the organization. 
• It is also responsible for allocating adequate time and 
resources required in the development of an effective 
plan. 
• Resources that management must allocate include
both funding and the effort of all
personnel involved.
2. Establishing a Planning Committee (PC)

• A PC is appointed to oversee the development 
and implementation of the plan.
•  The PC includes representatives from all 
functional areas of the organization. 
• Key PC members customarily include the 
operations manager and the data processing
manager. 
• The PC also defines the scope of the plan.
3. Performing a Risk Assessment
• The PC prepares a risk analysis and a business impact analysis (BIA) 
that includes a range of possible disasters, including natural, 
technical and human threats.
•  Each functional area of the organization is analyzed to determine the 
potential consequence and impact associated with several disaster
scenarios. 
• The risk assessment process also evaluates the safety of critical 
documents and vital records.
•  Traditionally, fire has posed the greatest threat to an organization. 
Intentional human destruction, however, should also be considered.
• A thorough plan provides for the “worst case” situation: destruction 
of the main building. It is important to assess the impacts and 
consequences resulting from loss of information and services. 
• The PC also analyzes the costs related to minimizing the potential 
exposures.
4. Establishing Priorities for Processing and Operations
• At this point, the critical needs of each department within the organization are evaluated
in order to prioritize them.
• Establishing priorities is important because no organization possesses infinite resources 
and criteria must be set as to where to allocate resources first. 
• Some of the areas often reviewed during the prioritization process are functional
operations, key personnel and their functions, information flow, processing systems 
used, services provided, existing documentation, historical records, and the 
department's policies and procedures.
• Processing and operations are analyzed to determine the maximum amount of time 
that the department and organization CAN OPERATE WITHOUT each critical system. This 
figure is later mapped to the Recovery Time Objective.
• A critical system is one that is part of a system or procedure necessary to
continue operations should a department, computer centre, main facility or a
combination of these be destroyed or become inaccessible.
• A method used to determine the critical needs of a department is to document all the
functions performed by each department. Once the primary functions have been 
identified, the operations and processes are then RANKED in order of priority: essential, 
important and non-essential.
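The ranking step described above can be sketched as follows. This is only an illustration: the `BusinessFunction` record, its field names, and the sample inventory are hypothetical, and the tie-break on shortest tolerable outage is an assumption rather than something the slides prescribe.

```python
from dataclasses import dataclass

# Priority classes named in the text, in ranking order.
PRIORITY_ORDER = {"essential": 0, "important": 1, "non-essential": 2}


@dataclass
class BusinessFunction:
    department: str
    name: str
    priority: str            # one of the keys in PRIORITY_ORDER
    max_outage_hours: float  # longest time the department can operate without it


def rank_functions(functions):
    """Sort by priority class, then by shortest tolerable outage (assumed tie-break)."""
    return sorted(
        functions,
        key=lambda f: (PRIORITY_ORDER[f.priority], f.max_outage_hours),
    )


# Hypothetical inventory, for illustration only.
inventory = [
    BusinessFunction("Marketing", "newsletter", "non-essential", 168),
    BusinessFunction("Finance", "payroll", "important", 72),
    BusinessFunction("Sales", "order entry", "essential", 4),
    BusinessFunction("IT", "email", "essential", 8),
]

for f in rank_functions(inventory):
    print(f.priority, f.department, f.name, f.max_outage_hours)
```

The `max_outage_hours` value recorded for each function is the figure that would later feed the Recovery Time Objective.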
5. Determining Recovery Strategies
• During this phase, the most practical alternatives for processing in case of a disaster 
are researched and evaluated.
•  ALL ASPECTS of the organization are considered, including physical
facilities, computer hardware and software, communications links, data
files and databases, customer services provided, user operations, the 
overall management information systems (MIS) structure, end-user systems, and 
any other processing operations.
• Alternatives, dependent upon the evaluation of the computer function, may include:
the provision of more than one data centre, the installation and deployment of
multiple computer systems, duplication of the service centre, consortium arrangements,
lease of equipment, and any combination of the above.
• Written agreements for the alternatives selected are prepared, specifying contract
duration, termination conditions, system testing, cost, any special security
procedures, procedure for the notification of system changes, hours of operation, 
the specific hardware and other equipment required for processing, personnel
requirements, definition of the circumstances constituting an emergency, priorities, 
and other contractual issues.
6. Collecting Data
• In this phase, data collection takes place. Recommended
data-gathering materials and documentation often include:
– lists (critical telephone numbers list, master call list,
master vendor list, notification checklist),
– inventories (communications equipment, documentation, office
equipment, forms, insurance policies, data centre computer
hardware, office supplies, off-site storage location equipment,
telephones, etc.),
– software and data file backup/retention schedules,
– temporary location specifications,
– any other such lists, materials, inventories and documentation.
• Pre-formatted forms are often used to facilitate the data 
gathering process.
7. Organizing and Documenting a Written Plan
• Next, an outline of the plan’s contents is prepared to guide the
development of the detailed procedures. Top management then
reviews and approves the proposed plan.
• It is during this phase that the actual written plan is developed in its
entirety, including all detailed procedures to be used before, during, and
after a disaster.
• The procedures include methods for maintaining and updating the plan 
to reflect any significant internal, external or systems changes.
• The procedures allow for a regular review of the plan by key personnel 
within the organization.
• The disaster recovery plan is structured using a team approach. Specific 
responsibilities are assigned to the appropriate team for each functional 
area of the organization. Teams responsible for administrative functions, 
logistics, user support, computer backup, restoration and other
important areas in the organization are identified.
8. Developing Testing Criteria and Procedures
• Best practices dictate that DRPs be thoroughly tested and evaluated on 
a regular basis (at least annually). 
• Thorough DR plans include documentation with the procedures for
testing the plan. 
• The tests will provide the organization with the assurance that all 
necessary steps are included in the plan. Other reasons for testing 
include:
– Determining the feasibility and compatibility of backup facilities and
procedures.
– Identifying areas in the plan that need modification.
– Providing training to team managers and team members.
– Demonstrating the ability of the organization to recover.
– Providing motivation for maintaining and updating the disaster recovery plan.
9. Testing the Plan

• After testing procedures have been completed, an initial "dry run"
of the plan is performed by conducting a structured walk-through
test.
• The test will provide additional information regarding any further
steps that may need to be included, changes in procedures that
are not effective, and other appropriate adjustments. These may 
not become evident unless an actual dry-run test is performed. 
• The plan is subsequently updated to correct any problems
identified during the test.
•  Initially, testing of the plan is done in sections and after normal
business hours to minimize disruptions to the overall operations of 
the organization. As the plan is further polished, future tests occur
during normal business hours.
10. Obtaining Plan Approval

• Once the DRP has been written and tested, 
the plan is then submitted to management for 
approval.
• Top management is ultimately responsible for
ensuring that the organization has a documented and
tested plan.
Common Mistakes Organizations Make in Coming up with DRPs
• Due to their high cost, disaster recovery plans are
not without critics. Cormac Foster has identified
five "common mistakes" organizations often
make related to disaster recovery planning:
1. Lack of buy-in
2. Incomplete RTOs and RPOs
3. Systems Oversight
4. Lax security
5. Outdated plans
1. Lack of buy-in
• The perception by executive management that DR
planning is "just another fake earthquake drill", and
CEOs who fail to make DR planning and preparation
a PRIORITY, are often significant contributors to
the failure of a DR plan.
2. Incomplete RTOs and RPOs

• Another critical point is failure to include each and every
important business process or block of data.
• "Every item in your DR plan requires a Recovery Time
Objective (RTO) defining maximum process downtime or a
Recovery Point Objective (RPO) noting an acceptable
restore point. Anything less creates ripples that can extend
the disaster's impact."
• As an example, "payroll, accounting and the
weekly customer newsletter may not be mission-critical in
the first 24 hours, but left alone for several days, they can
become more important than any of your initial problems."
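A minimal sketch of a completeness check for this mistake: it flags plan items that define neither an RTO nor an RPO. The dictionary layout, the key names `rto_hours` and `rpo_hours`, and the sample items are hypothetical.

```python
def items_missing_objectives(plan_items):
    """Return the names of plan items that define neither an RTO nor an RPO."""
    return [
        item["name"]
        for item in plan_items
        if item.get("rto_hours") is None and item.get("rpo_hours") is None
    ]


# Hypothetical plan contents, for illustration only.
plan_items = [
    {"name": "order database", "rto_hours": 4, "rpo_hours": 1},
    {"name": "payroll", "rto_hours": 72},
    {"name": "customer newsletter"},
]

print(items_missing_objectives(plan_items))  # ['customer newsletter']
```

A check like this could run whenever the plan is revised, so that less urgent processes such as the newsletter are not left without a recovery target.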
3. Systems Oversight

• A third point of failure involves focusing only on DR
without considering the larger business continuity needs:
"Data and systems restoration after a disaster are
essential, BUT every business process in your
organization will need IT support, and that support
requires planning and resources."
• As an example, corporate office space lost to a disaster
can result in an instant pool of teleworkers which, in turn,
can overload a company's VPN overnight, overwork the
IT support staff in the blink of an eye and cause serious
bottlenecks and monopolies with the dial-in PBX system.
4. Lax security

• When there is a disaster, an organization's data and business processes
become vulnerable. 
• As such, security can be more important than the raw speed involved in 
a disaster recovery plan's RTO.
•  The most critical consideration then becomes securing the new data
pipelines: from new VPNs to the connection from offsite backup
services.
• Another security concern is documenting every step of the
recovery process, which is especially important in highly
regulated industries, government agencies, or in disasters requiring
post-mortem forensics.
• Locking down or remotely wiping lost handheld devices is also an area 
that may require addressing.
5. Outdated Plans
• Another important aspect that is often overlooked involves 
the frequency with which DR Plans are updated. 
• Yearly updates are recommended but some industries or 
organizations require more frequent updates because 
business processes evolve or because of quicker data growth.
•  To stay relevant, disaster recovery plans should be an
integral part of all business analysis processes, and should 
be revisited at every major corporate acquisition, at every 
new product launch and at every new system development 
milestone.
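The update rule above (a yearly review, plus a revisit at certain business events) can be sketched as a simple check. The event labels, the 365-day approximation of "yearly", and the function name are assumptions made for illustration.

```python
from datetime import date, timedelta

# "Yearly updates" from the text, approximated here as 365 days.
REVIEW_INTERVAL = timedelta(days=365)

# Hypothetical labels for the events the text says should trigger a revisit.
TRIGGER_EVENTS = {"acquisition", "product_launch", "system_milestone"}


def review_needed(last_update: date, today: date, events_since_update) -> bool:
    """True if the plan is older than a year or a trigger event has occurred."""
    overdue = today - last_update > REVIEW_INTERVAL
    triggered = any(event in TRIGGER_EVENTS for event in events_since_update)
    return overdue or triggered


# Hypothetical dates, for illustration only.
print(review_needed(date(2023, 6, 1), date(2024, 3, 1), []))                  # False
print(review_needed(date(2023, 6, 1), date(2024, 3, 1), ["product_launch"]))  # True
print(review_needed(date(2022, 1, 1), date(2024, 3, 1), []))                  # True
```

Organizations with faster-changing processes or quicker data growth would shorten `REVIEW_INTERVAL` accordingly.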
