ITSM Process Description: Office of Information Technology Incident Management
Incident Management
The content within this general overview is based on the best practices of the ITIL® framework [1].
Incident Management is the process responsible for managing the lifecycle of all Incidents irrespective
of their origination.
To achieve this, the objectives of OIT’s Incident Management process are to:
Ensure that standardized methods and procedures are used for efficient and prompt response,
analysis, documentation, ongoing management and reporting of Incidents
Increase visibility and communication of Incidents to business and IT support staff
Enhance business perception of IT through use of a professional approach in quickly resolving
and communicating Incidents when they occur
Align Incident Management activities and priorities with those of the business
Maintain user satisfaction with the quality of IT services
Critical Success Factors (CSFs) identified for the Incident Management process, and their associated
Key Performance Indicators (KPIs), are:
CSF #1 - OIT commitment to the Incident Management process; all departments using the same process.
KPI 1.1 - Number of self-service tickets created via a customer portal versus tickets created by the
Service Desk.
1.1.1 - Review metrics via ITSM tool on all incident requests recorded and escalated within OIT.
KPI 1.2 - Management is known to review standardized reports produced by the Incident
Management process.
1.2.1 - ITSM tool, standardized/customized reports made available.
KPI 1.3 - Number of incidents in ITSM tool per department.
1.3.1 - Review metrics via ITSM tool on all incident requests recorded and escalated within OIT.
KPI 1.4 - Management is known to be a user of the Incident Management process.
1.4.1 - Review metrics via ITSM tool on all incident requests recorded and escalated within OIT.
CSF #2 - Consistent, positive experience for all customers
KPI 2.1 - Improved assignment, response and closure time.
2.1.1 - Review metrics via ITSM tool on all incident requests recorded and escalated within OIT,
specifically focusing on MTTR and customer satisfaction surveys.
KPI 2.2 - Customer use of self-service portal increases.
2.2.1 - Review metrics via ITSM tool on all incident requests recorded via self-service portal.
KPI 2.3 - Number of journal entries consistent with SLA.
2.3.1 - Review metrics via ITSM tool for services with SLA, specifically focusing on the quantity and
quality of updates in incident requests.
KPI 2.4 - Number of incidents reopened.
2.4.1 - Review metrics via ITSM tool, specifically looking at incidents that were reopened.
CSF #3 - Ability to track internal process performance and identify trends.
KPI 3.1 - Process performance meets established standards in OIT Baseline SLA, including
assignment time, response time, resolution time and closure time.
3.1.1 - Review metrics via ITSM tool on all incident requests recorded and escalated within OIT,
measuring MTTR and SLA requirements.
KPI 3.2 - Number of tickets reassigned between departments.
3.2.1 - Review metrics via ITSM tool on all incident requests recorded, specifically looking at
incidents that were reassigned.
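As an illustration only, several of the KPIs above could be computed from an export of incident records along the following lines. This is a minimal sketch: the record fields, the source labels and the function names are hypothetical and do not correspond to any particular ITSM tool, and MTTR is assumed here to mean the mean time from opening to resolution.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    # All field names are hypothetical; a real ITSM tool will use its own schema.
    department: str
    source: str                          # "portal" (self-service) or "service_desk"
    opened: datetime
    resolved: Optional[datetime] = None  # None while the incident is still open
    reopen_count: int = 0
    reassign_count: int = 0              # reassignments between departments


def self_service_ratio(incidents):
    """KPI 1.1 / 2.2: share of tickets created through the self-service portal."""
    if not incidents:
        return 0.0
    portal = sum(1 for i in incidents if i.source == "portal")
    return portal / len(incidents)


def incidents_per_department(incidents):
    """KPI 1.3: number of incidents recorded per department."""
    counts = {}
    for i in incidents:
        counts[i.department] = counts.get(i.department, 0) + 1
    return counts


def mean_time_to_resolve(incidents):
    """KPI 2.1 / 3.1: MTTR over resolved incidents, or None if none are resolved."""
    durations = [i.resolved - i.opened for i in incidents if i.resolved is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


def reopened_count(incidents):
    """KPI 2.4: number of incidents reopened at least once."""
    return sum(1 for i in incidents if i.reopen_count > 0)


def reassigned_count(incidents):
    """KPI 3.2: number of incidents reassigned between departments at least once."""
    return sum(1 for i in incidents if i.reassign_count > 0)
```

Reviewing these figures per reporting period, rather than as single totals, is what allows the trends named in CSF #3 to be identified.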
Incident Management encompasses all IT service providers, internal and third parties, reporting,
recording or working on an Incident.
All Incident Management activities should be implemented in full, operated as implemented, measured
and improved as necessary.
4. Benefits
There are several qualitative and quantitative benefits that can be achieved, for both the IT service
providers and users, by implementing an effective and efficient Incident Management process. The
Incident Management project team has agreed that the following benefits are important to OIT and will
be assessed for input to continuous process improvement throughout the Incident Management process
lifecycle:
Capturing accurate data across OIT to analyze the level of resources applied to the Incident
Management process
Informing business units of the services OIT provides and the level of support and maintenance
required for ongoing service levels
Minimizing impacts to business functions by resolving incidents in a timely manner
Providing the best quality service for all users
The following are key terms and Best Practice definitions used in Incident Management. The Incident
Management Project Team carefully read and agreed to each key term. Any changes and/or additional
key terms should be listed, defined and agreed in this section.
Note: Key terms and definitions must be verified and documented consistently across all ITIL processes
implemented in the organization.
Escalation: An Activity that obtains additional resources when these are needed to meet service level
targets or user expectations. Escalation may be needed within any IT service management process but is
most commonly associated with Incident Management, Problem Management and the management of
user complaints. There are two types of escalation: functional escalation and hierarchical escalation.
Event: Any change of state that has significance for the management of an IT service or other
configuration item. The term can also be used to mean an alert or notification created by any IT service,
Configuration Item or a Monitoring tool. Events typically require IT Operations personnel to take actions
and often lead to Incidents being logged.
Failure: Loss of ability to operate to specification, or to deliver the required output. The term Failure
may be used when referring to IT services, processes, activities and Configuration Items. A Failure often
causes an Incident.
Function: A team or group of people and the tools they use to carry out one or more Processes or
Activities; for example, the Service Desk.
Group: A number of people who are similar in some way; for example, people who perform similar
activities even though they may work in different departments within OIT.
Impact: A measure of the effect of an Incident, Problem, or Change on Business Processes. Impact is
often based on how Service Levels will be affected. Impact and urgency are used to assign priority.
Incident Management: The process responsible for managing the lifecycle of all Incidents. The primary
purpose of Incident Management is to restore normal IT service operation as quickly as possible.
Incident Record: A record containing the details of an Incident. Each Incident record documents the
lifecycle of a single Incident.
Normal Service Operation: The Service Operation defined within the Service Level Agreement (SLA)
limits.
Primary Technician: The technician who has responsibility for correcting the root cause issue and must
keep users informed of progress. They are also responsible for coordinating child records.
Priority: A category used to identify the relative importance of an Incident, Problem or Change. Priority
is based on impact and urgency and is used to identify required times for actions to be taken. For
example, the SLA may state that Priority 2 Incidents must be resolved within 12 hours.
Priority 1 Incident: The highest category of impact for an Incident which causes significant disruption to
the business. A separate procedure with shorter timescales and greater urgency should be used to
handle Major Incidents.
Quality Assurance (QA): Optional departmental process for ensuring a desired level of customer service.
This process is defined by the departments that choose to review tickets prior to closure.
RACI Matrix: A responsibility matrix showing who is Responsible, Accountable, Consulted and Informed
for each activity that is part of the Incident Management process.
Role: A set of responsibilities, activities and authorities granted to a person or team. A role is defined in
a process. One person or team may have multiple roles; for example, the roles of Configuration Manager
and Change Manager may be carried out by a single person.
Service Desk: The Single Point of Contact between the Service Provider and the users. A typical Service
Desk manages Incidents and Service Requests and also handles communication with the users.
Severity: A measure of how long it will be until an Incident, Problem or Change has a significant impact
on the business; also referred to as urgency. For example, a high Impact Incident may have low urgency
if the impact will not affect the business until the end of the financial year. Impact and urgency are
used to assign Priority.
Tier 2: More in-depth technical support than tier 1. Tier 2 support personnel may be more experienced
or knowledgeable on a particular product or service. Additionally, Tier 2 may be able to provide onsite
troubleshooting and/or resolution. Specialized departments (e.g., Networks, Servers, Video) will provide
Tier 2 Support in their respective areas of expertise.
User: Someone who uses the IT service on a day-to-day basis. Sometimes informally referred to as the
customer.
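The Priority definition above states that priority is derived from impact and urgency and determines the required times for action. A minimal sketch of that relationship follows. The three-level scales and the matrix values are illustrative assumptions, not OIT's actual values; the only resolution target taken from the text is the 12-hour example for Priority 2.

```python
from datetime import timedelta

# Hypothetical three-level scale; the source does not define the levels.
LEVELS = ("high", "medium", "low")

# Hypothetical impact x urgency matrix: outer key is impact, inner key is urgency.
PRIORITY_MATRIX = {
    "high":   {"high": 1, "medium": 2, "low": 3},
    "medium": {"high": 2, "medium": 3, "low": 4},
    "low":    {"high": 3, "medium": 4, "low": 5},
}

# Only the Priority 2 target (12 hours) comes from the example in the text;
# targets for other priorities are deliberately left undefined here.
RESOLUTION_TARGETS = {2: timedelta(hours=12)}


def assign_priority(impact, urgency):
    """Look up the priority for a given impact and urgency."""
    if impact not in LEVELS or urgency not in LEVELS:
        raise ValueError(f"impact and urgency must each be one of {LEVELS}")
    return PRIORITY_MATRIX[impact][urgency]


def within_target(priority, elapsed):
    """Return True or False against the target, or None if no target is defined."""
    target = RESOLUTION_TARGETS.get(priority)
    if target is None:
        return None
    return elapsed <= target
```

For example, under these assumed values `assign_priority("high", "medium")` yields Priority 2, and `within_target(2, timedelta(hours=13))` reports that the 12-hour target has been missed.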
Some process roles may be full-time jobs while others are a portion of a job. One person or team may
have multiple roles across multiple processes. Caution should be exercised when combining roles for a person, team or
group where separation of duties is required. For example, there is a conflict of interest when a software
developer is also the independent tester for his or her own work.
Regardless of the scope, role responsibilities should be agreed by management and included in yearly
objectives. Once roles are assigned, the assignees must be empowered to execute the role activities and
given the appropriate authority for holding other people accountable.
All roles and designated person(s), team(s), or group(s) should be clearly communicated across the
organization. This should encourage or improve collaboration and cooperation for cross-functional
process activities.
Profile The person fulfilling this role is responsible for ensuring that the process is
being performed according to the agreed and documented process and is
meeting the aims of the process definition.
There will be one, and only one, Incident Management Process Owner.
6.6 User
Profile Any person who reports an incident or requests a change. This person may
come from many of the ITSM roles including, but not limited to: User,
Service Owner, Service Provider, or Tier 1/Tier 2 Technician.
8.0 Incident Management Tier 1 Process Flow
D.1.3 Related to Open Incident?
Purpose Tier 1 combines similar service requests into one incident. The
purpose of relating records is to minimize the impact to Tier 2
resources.
Accountable
Consulted
Informed
Inputs Incident Record
Procedure or Work Instruction Steps
Open a new/related Incident record, classified and prioritized appropriately
o If the new/related Incident record is a high
priority, confirm that the necessary department
received the new/related Incident record.
Outputs Updated Incident Record
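The procedure steps above can be sketched as follows, for illustration only. The record fields, the rule that priorities 1 and 2 count as "high", and the `confirm_receipt` callback are all assumptions; the source does not specify how receipt by the department is confirmed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IncidentRecord:
    # Field names are hypothetical, not taken from any particular ITSM tool.
    record_id: int
    summary: str
    category: str
    priority: int                       # 1 is the highest priority
    department: str
    parent_id: Optional[int] = None     # set when related to an existing record
    journal: List[str] = field(default_factory=list)


HIGH_PRIORITY_THRESHOLD = 2  # assumption: priorities 1 and 2 count as "high"


def open_related_record(new_id, parent, summary, category, priority,
                        department, confirm_receipt):
    """Open a new or related record; for high priority, confirm receipt."""
    record = IncidentRecord(
        record_id=new_id,
        summary=summary,
        category=category,
        priority=priority,
        department=department,
        parent_id=parent.record_id if parent is not None else None,
    )
    if parent is not None:
        # Note the relationship on the existing record so it is updated too.
        parent.journal.append(f"Related record {new_id} opened")
    if priority <= HIGH_PRIORITY_THRESHOLD:
        # confirm_receipt is a hypothetical callback (e.g. a phone call or
        # an acknowledgement in the tool) that returns True when confirmed.
        if confirm_receipt(department, record):
            record.journal.append(f"Receipt confirmed by {department}")
        else:
            record.journal.append(f"Receipt NOT confirmed by {department}; follow up")
    return record
```

Recording the confirmation in the journal keeps the outcome of the high-priority check visible in the updated Incident record.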
D.3.4 Is Additional Incident Coordinator Review Required? A R R R
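A RACI row such as the one above can be checked mechanically. The sketch below applies the common RACI convention that each activity has exactly one Accountable role and at least one Responsible role. The role names are placeholders, since the column headings are not shown here, and the check assumes one code per role per activity.

```python
VALID_CODES = {"R", "A", "C", "I"}


def check_raci_row(activity, assignments):
    """Return a list of problems found in one RACI row (empty if none).

    `assignments` maps a role name to a single RACI code.
    """
    problems = []
    for role, code in assignments.items():
        if code not in VALID_CODES:
            problems.append(f"{activity}: unknown code {code!r} for {role}")
    accountable = [role for role, code in assignments.items() if code == "A"]
    if len(accountable) != 1:
        problems.append(
            f"{activity}: expected exactly one Accountable, found {len(accountable)}"
        )
    if not any(code == "R" for code in assignments.values()):
        problems.append(f"{activity}: no Responsible role assigned")
    return problems


# Placeholder role names; the codes mirror the "A R R R" row for D.3.4.
row = {"Role 1": "A", "Role 2": "R", "Role 3": "R", "Role 4": "R"}
issues = check_raci_row("D.3.4", row)
```

An empty `issues` list indicates that the row satisfies both conventions.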