FNAL Capacity Management Policy Process Procedures
Client: Fermilab
Date : 03/04/2010
Version : 0.6
FNAL_Capacity_Management_Policy_Process_Procedures_Draft_V0_6
GENERAL
Description: This document establishes a Capacity Management process and procedures for Fermilab.
Purpose: The purpose of this process is to establish a Capacity Management process for the Fermilab
Computing Division. Adoption and implementation of this process provides a structured method to
ensure that the required capacity exists within the IT environment so that IT Services meet business
requirements as documented in Service Level Agreements, and that this is provided in a cost-effective
and timely manner.
Note: The Capacity process can be triggered by many other processes. In the normal course of
business, each service has a pre-determined capacity review cycle (usually annually, to coincide
with the budget cycle), and the process executes according to that cycle.
Supersedes: N/A
VERSION HISTORY
Version Date Author(s) Change Summary
0.1 12/28/2009 David Cole - Plexent Initial Draft Version
0.2 01/05/2010 David Cole - Plexent Added newly discovered information
0.3 01/26/2010 David Cole - Plexent Incorporated feedback from Workshop
0.4 02/09/2010 David Cole - Plexent Updates as a result of Core Team Review
0.5 02/17/2010 David Cole - Plexent Updates as a result of General CD Review
0.6 03/04/2010 David Cole - Plexent Further updates as a result of General CD Review
TABLE OF CONTENTS
General
Introduction
  General Introduction
  Document Organization
Capacity Management Policies
Capacity Management Process Flow
  Capacity Management General Notes
  Capacity Management Process Roles & Responsibilities
  Capacity Management Process Measurements
  Capacity Management Critical Success Factors
  Capacity Management Process Relationships
Manage Business Capacity Requirements Procedure Flow
  Manage Business Capacity Requirements Procedure Rules
  Manage Business Capacity Requirements Procedure Narrative
  Verification
  Management Review Criteria
  Escalation Criteria
  Risks
Manage Service Capacity Requirements Procedure Flow
  Manage Service Capacity Requirements Procedure Rules
  Manage Service Capacity Requirements Procedure Narrative
  Verification
  Review Criteria
  Escalation Criteria
  Risks
Manage Resource Capacity Requirements Procedure Flow
  Manage Resource Capacity Requirements Procedure Rules
  Manage Resource Capacity Requirements Procedure Narrative
  Verification
  Review Criteria
  Escalation Criteria
  Risks
Create & Distribute Capacity Reports Procedure Flow
  Create & Distribute Capacity Reports Procedure Rules
  Create & Distribute Capacity Reports Procedure Narrative
  Verification
  Review Criteria
  Escalation Criteria
  Risks
Appendix 1: Relationship to Other Documents
Appendix 2: Capacity RACI Chart
Appendix 3: Phase 1 Capacity Management Scope
Appendix 4: Communication Plan
Appendix 5: Forms, Templates
INTRODUCTION
GENERAL INTRODUCTION
Capacity management ensures that information technology processing and storage capacity is adequate to
meet the evolving requirements of the organization as a whole, in a timely and cost-justifiable manner.
DOCUMENT ORGANIZATION
This document is organized as follows:
Introduction
Capacity Management Policies
Capacity Management Process Flow
Process Measurements
Process Roles and Responsibilities
Process Critical Success Factors
Capacity Management Process Integration Points
1.0 – Manage Business Capacity Requirements Procedure
Manage Business Capacity Requirements Procedure Rules
Manage Business Capacity Requirements Procedure Narrative
Verification
Management Review Criteria
Escalation Criteria
Risks
2.0 – Manage Service Capacity Requirements Procedure
Manage Service Capacity Requirements Procedure Rules
Manage Service Capacity Requirements Procedure Narrative
Verification
Management Review Criteria
Escalation Criteria
Risks
3.0 – Manage Resource Capacity Requirements Procedure
Manage Resource Capacity Requirements Procedure Rules
Manage Resource Capacity Requirements Procedure Narrative
Verification
Management Review Criteria
Escalation Criteria
Risks
4.0 – Create & Distribute Capacity Reports Procedure
Create & Distribute Capacity Reports Procedure Rules
Create & Distribute Capacity Reports Procedure Narrative
Verification
Management Review Criteria
Escalation Criteria
Risks
Appendix 1: Relationship to Other Documents
Appendix 2: RACI Matrix
Appendix 3: Phase 1 Capacity Management Scope
Appendix 4: Communications Plan
Appendix 5: Forms, Templates
CAPACITY MANAGEMENT POLICIES
The Capacity Management process shall identify Capacity requirements on the basis of business
plans, business requirements, SLAs, MOUs and risk assessments, and shall be consulted in
the development and negotiation of SLAs and MOUs.
Capacity Plans will be kept on file for 18 months after their expiry date.
The Capacity Plans will be reviewed at least annually to ensure requirements reflect agreed-upon
changes required by the business.
Capacity Management will endeavor to ensure optimal integration with other ITSM processes.
The best available demand forecasts should be provided to Capacity Management as soon as
they are identified.
Monitoring, data gathering, analysis, reporting, and reviews will be undertaken consistently in a
defined manner, with the data being stored in the Capacity Management Database (CDB).
The contents of the CDB will be shared with other ITSM processes.
The necessary authority will be delegated to the Capacity Management process to initiate actions
which ensure required levels of IT Service Capacity and reliability.
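The two date-based policies above (18-month retention after expiry, review at least annually) can be checked mechanically. The following is a minimal sketch only; the function names and the month-end clamping rule are assumptions made for illustration and are not part of the policy.

```python
import calendar
from datetime import date

RETENTION_MONTHS = 18        # policy: keep plans on file 18 months after expiry
REVIEW_INTERVAL_MONTHS = 12  # policy: review at least annually


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the end of the target month."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def retention_deadline(expiry: date) -> date:
    """Earliest date on which an expired Capacity Plan may leave the files."""
    return add_months(expiry, RETENTION_MONTHS)


def review_overdue(last_review: date, today: date) -> bool:
    """True when more than one review interval has passed since the last review."""
    return today > add_months(last_review, REVIEW_INTERVAL_MONTHS)
```

For example, a plan that expired on 4 March 2010 would be retained until 4 September 2011 under this reading of the policy.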
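CAPACITY MANAGEMENT PROCESS FLOW
[Figure: Capacity Management process flow. Labels recovered from the diagram: 1.0 Manage Business
Capacity Requirements; 3.0 Manage Resource Capacity Requirements; 4.0; End.]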
CAPACITY MANAGEMENT GENERAL NOTES
The sub-processes normally run in sequence from 1.0 to 2.0 to 3.0, but information also flows back
from 3.0 to 2.0 and from 2.0 to 1.0.
This matters when a capacity event in 3.0 has a direct impact on a service: details of the event
must be fed to 2.0, and 2.0 will, in turn, feed the details to 1.0.
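The reverse flow described above can be pictured as an event being handed up the chain of sub-processes. The sketch below is illustrative only: the `CapacityEvent` class, the `UPSTREAM` mapping and the `escalate` function are hypothetical names, and the document does not prescribe any particular mechanism.

```python
from dataclasses import dataclass, field

# Each sub-process feeds event details to the one before it (3.0 -> 2.0 -> 1.0).
UPSTREAM = {"3.0": "2.0", "2.0": "1.0"}


@dataclass
class CapacityEvent:
    component: str
    detail: str
    seen_by: list = field(default_factory=list)  # sub-processes that received the details


def escalate(event: CapacityEvent, origin: str = "3.0") -> CapacityEvent:
    """Record the event at its origin, then pass it to each upstream sub-process in turn."""
    current = origin
    while current is not None:
        event.seen_by.append(current)
        current = UPSTREAM.get(current)
    return event
```

An event raised in 3.0 ends up recorded by 3.0, 2.0 and 1.0, in that order; one raised in 2.0 reaches only 2.0 and 1.0.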
CAPACITY MANAGEMENT PROCESS ROLES & RESPONSIBILITIES
Capacity Manager
A Capacity Manager has responsibility for ensuring that the aims of Capacity Management are met.
This includes such tasks as:
- Ensuring that there is adequate IT capacity to meet required levels of service, and that senior
  IT management is correctly advised on how to match capacity and demand and to ensure that use of
  existing capacity is optimized
- Identifying, with the Service Level Manager, capacity requirements through discussions with the
  business users
- Understanding the current usage of the infrastructure and IT services, and the maximum capacity
  of each component
- Performing sizing on all proposed new services and systems, possibly using modeling techniques,
  to ascertain capacity requirements
- Forecasting future capacity requirements based on business plans, usage trends, sizing of new
  services, etc.
- Production, regular review and revision of the Capacity Plan, in line with the organization’s
  business planning cycle, identifying current usage and forecast requirements during the period
  covered by the plan
- Ensuring that appropriate levels of monitoring of resources and system performance are set
- Analysis of usage and performance data, and reporting on performance against targets contained
  in SLAs
- Raising incidents and problems when breaches of capacity or performance thresholds are detected,
  and assisting with the investigation and diagnosis of capacity-related incidents and problems
- Identifying and initiating any tuning to be carried out to optimize and improve capacity or
  performance
- Identifying and implementing initiatives to improve resource usage – for example, demand
  management techniques
- Assessing new technology and its relevance to the organization in terms of performance and cost
- Being familiar with potential future demand for IT services and assessing its effect on
  performance service levels
- Ensuring that all changes are assessed for their impact on capacity and performance and
  attending CAB meetings when appropriate
Capacity Analyst
The Capacity Analyst performs or directs many of the day-to-day and strategic capacity activities
on behalf of the Capacity Manager.
- Reviews all Capacity reports with the Capacity Manager and publishes them after approval.
Up-time reports
Ability to plan and implement appropriate capacity to match current and future business needs
Creation of an integrated source of capacity data to allow analysis of the usage of all Configuration Items
in scope
Senior management commitment in terms of resources and budget for the process
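MANAGE BUSINESS CAPACITY REQUIREMENTS PROCEDURE FLOW
[Figure: Manage Business Capacity Requirements procedure flow. Labels recovered from the diagram:
1.2 Review SLAs; No (decision branch); Configuration Management: Design/Procure/Amend
Configuration; Change Management: Implement Configuration; Configuration Management: Update CMDB,
CDB; Return.]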
MANAGE BUSINESS CAPACITY REQUIREMENTS PROCEDURE NARRATIVE
1.1 Quantify Business Impacts
Role: Capacity Manager, Capacity Analyst
Receive the RFC for the new or changed service.
Review the details of the RFC.
Determine the impacts to the business of implementing this RFC.
Record the determined impacts.
Proceed to 1.2 – Review SLAs.

1.2 Review SLAs
Role: Capacity Manager, Capacity Analyst
Review all SLAs for the service which will be impacted.
Determine the impacts on the current SLAs of implementing the new or changed service.
Record the results of the analysis.
Proceed to 1.3 – Decision – Changes Required?

1.3 Decision – Changes Required?
Role: Capacity Manager
Will changes to the infrastructure be required in order to deliver the new or changed service as
well as to maintain the current SLAs?
If “yes”, proceed to Service Level Management, which will Negotiate, Obtain Agreement, and Sign
the amended SLA.
If “no”, proceed to Service Level Management, which will define and obtain agreement on Service
Level Requirements.
1.4 Develop & Agree on SLRs
Role: Capacity Analyst, Capacity Manager
In consultation with the teams impacted by this new or changed service, develop Service Level
Requirements.
Agree and document the SLRs.
Interface with the Configuration Management processes to design, amend or procure items for the
new configuration.
After that, invoke the Change Management processes to deploy the change.
Finally, interface with the Configuration Management processes to update the CMDB (Configuration
Management Database) and the CDB (Capacity Database).
Note: The Service Level Manager will be kept informed of the fact that new SLRs have been defined
and agreed upon, since the delivery of those requirements will have a direct impact on the
services for which there are SLAs.

Return
Exit this sub-process and return to the calling process (0.0).
Exit Criteria: Completion and validation that the change has executed according to the Change
Management Process. CMDB and CDB updated as appropriate.
Outputs: Closed RFC; Updated CMDB; Updated CDB.
VERIFICATION
Result Action
Closed RFC Proceed
Updated CMDB Proceed
Updated CDB. Proceed
ESCALATION CRITERIA
Event: Dispute between Capacity Management and Change Management on the criteria for success of
the change.
Action: Escalation through normal management chain.
Notification: The Change Manager is the ultimate decision maker.
RISKS
Risk: No approved RFC
Impact: Reduced chance that the change will be successfully applied.
Risk: Databases not updated
Impact: Risk that there will continue to be service delivery issues because the changes have not
been recorded, rendering impact analysis less than effective.
MANAGE SERVICE CAPACITY REQUIREMENTS PROCEDURE FLOW
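[Figure: Manage Service Capacity Requirements procedure flow. Labels recovered from the diagram:
2.2 Monitor, Evaluate & Report; 2.3 Identify Trends; 2.4 Establish Normal Service Operation
Levels; 2.5 Define Exception Levels; 2.6 Report Service Breaches & Near Misses.]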
MANAGE SERVICE CAPACITY REQUIREMENTS PROCEDURE NARRATIVE
2.2 Monitor, Evaluate & Report
Role: Capacity Analyst
Ensure that regular monitoring is being performed on those components that have been identified
as critical to the provision of services as defined in SLAs.
Perform regular evaluation of the capacity from the perspective of its general health for service
provision.
Produce regular reports on the findings, and distribute the reports to the Service Level Manager
and to the managers of the various infrastructure components.
Proceed to 2.3 – Identify Trends.

2.3 Identify Trends
Role: Capacity Manager
Identify any trends which emerge as a result of the regular monitoring and evaluation of the
components involved in the delivery of the services.
Document the trends so that they can be used in future analyses.
Proceed to 2.4 – Establish Normal Operation Levels.

2.4 Establish Normal Operation Levels
Role: Capacity Manager, Capacity Analyst
When there is sufficient data, establish the normal operation levels for the components involved
in the delivery of the services.
Document those normal operation levels and obtain agreement from the appropriate managers.
Proceed to 2.5 – Define Exception Levels.

2.5 Define Exception Levels
Role: Capacity Manager
Establish tolerance levels for each of the components, and obtain agreement from the appropriate
managers.
Document the agreed-upon tolerances so that appropriate responses can also be defined.
Proceed to 2.6 – Report Service Breaches and Near Misses.

2.6 Report Service Breaches & Near Misses
Role: Capacity Manager
Use a standard format for reporting service breaches as well as situations where the agreed-upon
tolerances have been approached.
As required, prepare this report for the Service Level Manager.
Proceed to Return.
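Steps 2.4 through 2.6 amount to three calculations: a baseline, a tolerance, and a classification of each reading against that tolerance. The sketch below shows one possible way to express them. The mean and standard deviation baseline, the 10% near-miss margin and all function names are assumptions chosen for illustration; the procedure leaves the actual levels to agreement with the appropriate managers.

```python
from statistics import mean, pstdev


def normal_level(samples):
    """Baseline for a component: (mean, population standard deviation) of its samples."""
    return mean(samples), pstdev(samples)


def classify(value, breach_at, near_miss_margin=0.10):
    """Classify one reading against an agreed tolerance.

    A reading at or above breach_at is a breach. A reading within
    near_miss_margin (a fraction of breach_at) below it is a near miss.
    """
    if value >= breach_at:
        return "breach"
    if value >= breach_at * (1 - near_miss_margin):
        return "near miss"
    return "normal"


def exception_report(readings, breach_at, near_miss_margin=0.10):
    """Return (component, value, status) for every reading that is not normal."""
    report = []
    for component, value in readings.items():
        status = classify(value, breach_at, near_miss_margin)
        if status != "normal":
            report.append((component, value, status))
    return report
```

With a tolerance of 90% utilization, a reading of 95 is reported as a breach and a reading of 85 as a near miss, while a reading of 40 is left out of the report.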
Exit Criteria: Normal operation levels as well as tolerances for each identified component have
been identified and agreed upon. Service breaches and near misses have been identified and
reported to the Service Level Manager.
Outputs: Normal performance level definitions; Exception level definitions; Performance Data;
Service Reports.
VERIFICATION
Result                                                      Action
Monitoring completed for defined timescale                  Proceed
Trending is being performed on a regular basis              Proceed
Normal component performance levels have been identified
  and documented for identified components                  Proceed
Exception levels have been identified and documented
  for identified components                                 Proceed
REVIEW CRITERIA
Result: Monitoring is not generating data to the level needed, e.g. to pinpoint the cause of a
specific Capacity-related event.
Action: Management decides whether more detailed monitoring is required by balancing the need for
the data against the potential impact that the generation of large amounts of data may have on
the IT environment.
ESCALATION CRITERIA
Event: Service Level Agreements have been breached.
Action: Notify Service Level Management.
Notification: Service Level Manager.
RISKS
Risk Impact
N/A
MANAGE RESOURCE CAPACITY REQUIREMENTS PROCEDURE FLOW
[Figure: Manage Resource Capacity Requirements procedure flow. Labels recovered from the diagram:
3.1 Monitor Individual Hardware & Software Components; 3.2 Collect Data; Configuration
Management: Conduct Audits & Reviews; Availability Management: Design Resilience into IT
Infrastructure; 3.4 Determine the Impacts of Change; 3.5 Plan & Budget HW & SW Upgrades & HR
Augmentation; 3.6 Balance Services to Use Existing Resources Efficiently & Effectively; 3.7
Evaluate New HW, SW & Personnel Capability; 3.8 Finalize & Agree on the Capacity Plan; Return.]
MANAGE RESOURCE CAPACITY REQUIREMENTS PROCEDURE RULES
Inputs: Normal performance level definitions; Exception level definitions; Performance Data;
Service Reports.
Entry Criteria: Sub-Processes 1.0 and 2.0 have been completed.
General Comments: The purpose of the Manage Resource Capacity Requirements procedure is to
monitor, guard, analyze and tune the performance of the various components of the IT
infrastructure.
MANAGE RESOURCE CAPACITY REQUIREMENTS PROCEDURE NARRATIVE
3.1 Monitor Individual Hardware & Software Components
Role: Capacity Analyst
Ensure that monitoring is functioning as intended for each of the components on which it is
installed and activated.
Proceed to 3.2 – Collect Data.

3.2 Collect Data
Role: Capacity Analyst
Collect the data for the components on which monitoring is installed and activated.
Organize and collate the gathered data so as to allow for analysis.
Pass this data to the Service Level Management Process, which will perform audits and reviews on
the components from the perspective of their current and future capabilities to deliver the
service within the parameters agreed upon in the SLAs.
After the results of the audits or reviews have been returned from Service Level Management,
proceed to 3.3 – Perform Preemptive and Reactive Problem Determination.

3.3 Perform Preemptive and Reactive Problem Determination
Role: Capacity Analyst
Review the results of the monitoring or the Reviews/Audits, as well as the details of any
Capacity Event if appropriate.
Determine the probable cause of any actual or potential capacity problems.
Identify potential solutions to the problems.
Record the details of this activity.
Proceed to 3.4 – Determine the Effects of the Change.

3.4 Determine the Effects of the Change
Role: Capacity Analyst
Decide which techniques are appropriate for determining the effects of a proposed change.
As appropriate, perform trending or modeling.
Determine training requirements for the proposed change.
Document the findings.
Proceed to 3.5 – Plan & Budget HW & SW Upgrades & HR Augmentation.

3.7 Evaluate New HW, SW & Personnel Capability
Role: Capacity Manager, Capacity Analyst
Evaluate the capabilities of any new hardware components which have been introduced into the
environment.
Evaluate the capabilities of any new software which has been introduced into the environment.
Evaluate the capabilities of personnel to manage the new hardware or software, especially if
those additions have increased the workload.
Document the results of the evaluations, and distribute them to the appropriate personnel.
Proceed to 3.8 – Finalize & Agree on the Capacity Plan.

3.8 Finalize & Agree on the Capacity Plan
Role: Capacity Manager
Collate all of the required elements for the new or updated Capacity Plan.
Create or update the Capacity Plan.
Obtain agreement for the new or updated plan from the appropriate support teams, as well as from
the Service Level Manager.
Record the current Capacity Plan in the CDB.
Proceed to Return.
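Step 3.4 calls for trending or modeling without naming a technique. As one illustration, a straight-line least-squares fit over equally spaced utilization samples gives a rough estimate of when a limit will be reached. The function names below are hypothetical, and a linear trend is an assumption made for this sketch; real workloads may need seasonal or non-linear models.

```python
def linear_trend(values):
    """Least-squares slope and intercept for equally spaced observations (x = 0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def periods_until(values, limit):
    """Periods after the last observation until the fitted line reaches limit.

    Returns None when the trend is flat or declining, and 0.0 when the
    fitted line has already reached the limit.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    x_hit = (limit - intercept) / slope
    return max(0.0, x_hit - (len(values) - 1))
```

For monthly samples of 10, 20, 30 and 40 units against a limit of 100, the fitted line reaches the limit six periods after the last sample, which is the kind of figure that could feed the planning and budgeting in step 3.5.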
Exit Criteria The Capacity Plan is completed and agreed upon by all appropriate parties.
REVIEW CRITERIA
Result: The Capacity Plan is not at the level required to adequately manage agreed-upon service
delivery levels.
Action: Management decides what should be included in the plan to provide adequate capacity
control, and initiates a change to incorporate those items.
ESCALATION CRITERIA
Event: Agreement cannot be reached on the Capacity Plan.
Action: Notify Service Level Management.
Notification: Service Level Manager.
RISKS
Risk: No agreed-upon Capacity Plan
Impact: Service breaches because of Capacity events.
CREATE & DISTRIBUTE CAPACITY REPORTS PROCEDURE FLOW
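[Figure: Create & Distribute Capacity Reports procedure flow. Labels recovered from the diagram:
4.2 Define or Validate Audience; 4.3 Identify Data Sources; 4.4 Gather & Analyze Data; 4.5
Produce Report; 4.6 Distribute Report; Return.]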
CREATE & DISTRIBUTE CAPACITY REPORTS PROCEDURE RULES
Inputs: Request for an ad hoc Capacity Report; Capacity data (varies, depending on the nature of
the report); Automated Tool CI Identification.
Entry Criteria: For ad hoc requests, a valid service request must be created.
General Comments:
CREATE & DISTRIBUTE CAPACITY REPORTS PROCEDURE NARRATIVE
4.1 Define or Validate Requirements
Role: Capacity Manager
For regularly scheduled reports, ensure that the requirements are still valid.
For ad hoc reports, define the reporting requirements.
Proceed to 4.2 – Define or Validate Audience.

4.2 Define or Validate Audience
Role: Capacity Manager
For regularly scheduled reports, ensure that the identified audience is still valid.
For ad hoc reports, define the appropriate audience for the report.
Proceed to 4.3 – Identify Data Sources.

4.3 Identify Data Sources
Role: Capacity Manager
Determine the sources for the data which will be used in the report.
This will probably require both the name of the data store, as well as the appropriate fields
within that store.
Proceed to 4.4 – Gather & Analyze Data.

Return
Exit the Create & Distribute Capacity Reports sub-process and return to the calling process.
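The outcome of steps 4.1 through 4.3 is effectively a report definition: requirements, an audience, and a list of data stores with the fields to read from each. A minimal sketch of such a record follows. The class names, field names and validation messages are hypothetical and serve only to show how the entry criterion for ad hoc requests and the three definition steps could be checked before data gathering starts.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DataSource:
    store: str               # name of the data store, e.g. the CDB
    fields: Tuple[str, ...]  # fields to read from that store


@dataclass
class ReportDefinition:
    title: str
    ad_hoc: bool
    requirements: str = ""
    audience: List[str] = field(default_factory=list)
    sources: List[DataSource] = field(default_factory=list)
    service_request_id: Optional[str] = None  # needed for ad hoc requests only

    def problems(self) -> List[str]:
        """List everything that still blocks data gathering for this report."""
        found = []
        if self.ad_hoc and not self.service_request_id:
            found.append("ad hoc report has no service request")
        if not self.requirements:
            found.append("requirements not defined")
        if not self.audience:
            found.append("audience not defined")
        if not self.sources:
            found.append("no data sources identified")
        for source in self.sources:
            if not source.store or not source.fields:
                found.append(f"data source {source.store!r} needs a store name and at least one field")
        return found
```

An empty list from `problems()` corresponds to the first three Verification results for this procedure (requirements, audience and data sources defined or validated).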
Exit Criteria Completed and distributed Capacity report
VERIFICATION
Result Action
Requirements defined or validated. Proceed
Audience defined or validated. Proceed
Data sources identified. Proceed
Report produced Proceed
REVIEW CRITERIA
Result: The Capacity Plan is not at the level required to adequately manage agreed-upon service
delivery levels.
Action: Management decides what should be included in the plan to provide adequate capacity
control, and initiates a change to incorporate those items.
ESCALATION CRITERIA
Event: Agreement cannot be reached on the Capacity Plan.
Action: Notify Service Level Management.
Notification: Service Level Manager.
RISKS
Risk: No agreed-upon Capacity Plan
Impact: Service breaches because of Capacity events.
APPENDIX 1: RELATIONSHIP TO OTHER DOCUMENTS
Document Name                                           Relationship
Capacity Management Business Process Requirements       Requirements
Service Improvement Process & Procedures                Process, Procedure
Service Level Management Process & Procedure            Process, Procedure
Capacity Management Plan Template                       Template
APPENDIX 2: CAPACITY RACI CHART
R - Responsible: Role responsible for getting the work done
A - Accountable: Only one role can be accountable for each activity
C - Consult: The roles that are consulted and whose opinions are sought
I - Inform: The roles that are kept up-to-date on progress
Role groupings: Primary Roles in Process, Primary Interactions, Secondary Roles
Roles: Capacity Manager, Capacity Analyst, Change Manager, Support Teams, Process Owner,
Configuration Management Manager, Service Level Manager
1.0 – Manage Business Capacity Requirements
1.1 Quantify Business Impacts A R C I I I I
1.2 Review SLA’s A I C I I I I
1.3 Decision - SLA Changes Required? A I C I I I I
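The legend's rule that only one role can be accountable for each activity lends itself to an automated check. The sketch below validates the three rows shown above. The `raci_problems` function is a hypothetical helper, and the check deliberately ignores which role each column belongs to.

```python
VALID_CODES = {"R", "A", "C", "I"}

# Activity rows as they appear in the chart above.
RACI_ROWS = {
    "1.1 Quantify Business Impacts": "A R C I I I I",
    "1.2 Review SLAs": "A I C I I I I",
    "1.3 Decision - SLA Changes Required?": "A I C I I I I",
}


def raci_problems(rows):
    """Return a message for each row that breaks the one-accountable-role rule or uses an unknown code."""
    problems = []
    for activity, codes in rows.items():
        cells = codes.split()
        unknown = [c for c in cells if c not in VALID_CODES]
        if unknown:
            problems.append(f"{activity}: unknown codes {unknown}")
        accountable = cells.count("A")
        if accountable != 1:
            problems.append(f"{activity}: expected exactly one A, found {accountable}")
    return problems
```

All three rows pass this check; a row with two A entries, or with none, would be reported.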
APPENDIX 3: PHASE 1 CAPACITY MANAGEMENT SCOPE
Processor Utilization
Memory Utilization
Network
  Capacity
    Port Availability
    Switches
  Bandwidth
    Mb/s
    Gb/s
Storage
  Capacity (GB, TB)
  Bandwidth (IOPS)
Database & Infrastructure Services
  Transaction Rates
    Peak Transactions / Second
    Mean time to complete
Tape
  Mounts / Hour
  Occupied Slots vs. Slots Available
Account/Password Services
Network Services
Print Services
APPENDIX 4: COMMUNICATION PLAN
Key messages:
The Capacity Management process is focused on ensuring that information technology processing and storage
capacity is adequate to meet the evolving requirements of the organization as a whole, in a timely and
cost-justifiable manner.
In order to achieve this, all stakeholders must be informed of the importance of each of them fulfilling his or her role
in the process. There must also be continued efforts to ensure that any pertinent changes are communicated to the
community.
Approach:
This plan details tasks that apply generally to all ITIL processes. The plan assumes that there will be a combination
of face-to-face training/meeting events and broadcast communications designed to both increase awareness of the
processes among stakeholders and to ensure high performance of the new processes among key service delivery
staff.
Each of the above types of communication can be delivered via one or more of the following mediums: