ARMS


Jan-09

Presented by:

Jari NISULA
Mgr, Airline Safety Mgt Systems
Airbus

Operational Risk Assessment


Next Generation Methodology

On behalf of the ARMS Working Group

ARMS stands for Airline Risk Management Solutions.

Page 1
Presentation contents

1. Do we need a New Methodology?

2. The ARMS Mission

3. The two levels of ARMS Deliverables

4. The ARMS Methodology

5. Risk Management in the organizational context

2009 Page 2

Page 2
Central role of ”Risk” in the SMS framework

 Safety policy and objectives


1.1 – Management commitment and responsibility
1.2 – Safety accountabilities of managers
1.3 – Appointment of key safety personnel
1.4 – SMS implementation plan
1.5 – Coordination of emergency response planning
1.6 – Documentation
 Safety risk management
2.1 – Hazard identification processes
2.2 – Risk assessment and mitigation processes
 Safety assurance
3.1 – Safety performance monitoring and measurement
3.2 – The management of change
3.3 – Continuous improvement of the SMS
 Safety promotion
4.1 – Training and education
4.2 – Safety communication

2009 Page 3

Risk Management has a central role in the new SMS Framework introduced
by ICAO.

Component 2 of the SMS Framework, “Safety Risk Management”, is the part
where safety is concretely delivered: by identifying hazards, assessing their risks,
and taking action to manage those risks.

Risk-based information is also very useful for “Safety Assurance”: what better
way is there to monitor Safety Performance (point 3.1 on the slide) than to follow
the risk levels produced by the Risk Assessment process?

The Management of Change (point 3.2 on slide) process often requires making a
Risk Assessment (or a “Safety Assessment”) on the new planned activity; for
example, opening a new route or introducing a new aircraft type. This again calls
for a good practical method.

Let’s now look in more depth at component 2, “Safety Risk Management”…

Page 3
Risk Assessment within Risk Management

2009
(ICAO SMM)
Page 4

This chart comes from the ICAO SMM (edition 1). The Risk Management
process starts with Hazard Identification (HI). For an airline, this consists
typically of things like Flight Data Analysis, Safety Reporting…etc. This is
an area which has improved drastically in the last 10 years, and today, an
airline can have access to a large amount of very proactive safety data.

The second part (in red) is the Risk Assessment, in terms of severity,
probability and acceptability. This is the difficult bit, and this is what the
rest of the presentation will focus on.

Finally, the last part (in yellow) is Risk Mitigation. This is about
taking action to make sure that all risks remain at acceptable
levels. It involves many organizational issues and, although it has its
own challenges, they are not related to the Risk Assessment Methodology
itself. A typical arrangement is to use the Safety Review Board and Safety
Action Groups to take care of this part.

Page 4
Objectives for a Risk Assessment methodology

Hazard Identification data -> RA -> Operational Risk Profile
Planned changes -> RA -> Associated Risk

Inputs:
• Accepts all types of modern safety data.

Methodology:
• Simple and fast
• Conceptually solid.

Results:
• Coherent
• Useful
• Understandable by non-experts.

Aviation specific
New better method


2009 Page 5

Before we start discussing any methodology for Risk Assessment, we
should first be very clear about the overall objectives for Risk Assessment.

There are two main inputs. The first one is the operational Hazard
Identification data (produced by the sources listed in blue on the previous
slide). The Risk Assessment method should be able, based on that data,
to create a good overview of operational risks; we could call this the
Operational Risk Profile.

The second input is a planned change. This comes back to the
“Management of Change” process, where something new is started, so by
definition there is no in-house data available for risk assessment, but a
proactive “future risk assessment” is still necessary. The RA method
should help assess the Risk associated with the planned change.

We can now list objectives concerning the acceptable inputs, the method
itself and the results (see bullets).

These requirements lead to two main conclusions. First, due to the
specificity and quantity of the input data, the method needs to be
specifically adapted for aviation. Secondly, it is argued that none of the
existing methods fulfills the listed requirements on the methodology and the
results. Therefore, a new, better method needs to be developed.

Page 5
Problems with older methods – fictitious example

• You learn about an event which took place yesterday:
A single-aisle aircraft with 110 pax almost overran runway
end at landing
Actual outcome: a few blown tires
Cause: reduced braking capability due to maintenance error

Classic approach to
Risk Assessment :

2009 Page 6

Let’s look at some of the problems with older methods.


Typically, an incoming event is classified in terms of risk using a matrix
with two dimensions: severity and likelihood. The risk assessment
becomes an exercise of picking the “right” square in the matrix.

This may seem a simple task, but a closer study reveals
fundamental problems caused by a deficient underlying conceptual
framework.

Page 6
Fictitious example (cont’d)

• Severity of what?
Actual outcome: blown tires?
Most likely potential accident scenario: overshoot with some
injuries & few fatalities (if any)?
The worst-case scenario: overshoot with 100% fatalities?
Shall you consider bigger A/C? More pax? Critical airports?

• Likelihood of what?
The same maintenance error?
Near-overshoot events?
Actual overshoot events?
Any A/C type? Any location?

2009 Page 7

Severity…but severity of what? The actual concrete outcome? The most
probable accident scenario, or perhaps the worst-case scenario?* Should
we take into account that there could have been more passengers on
board, especially if bigger aircraft types are considered? Should we
consider that this could have happened at a more critical airport?

Likelihood…but of what? Even if we have thousands of events in a
database, we first need to define what type of events we are looking for, to
determine the statistical frequency and thereby the estimated likelihood.

All these questions are difficult to answer. Different analysts defend
different options, and often such questions are not even assessed
consciously. This type of conceptual confusion is a great source of
subjectivity in risk assessment.

Where do all these problems of assessment come from? After all, the
event is a historical fact – shouldn’t it therefore be easy to risk-assess it?

*The ICAO definition of risk refers to the “worst foreseeable situation”, which usually
means 100% fatalities. But this is not the same as the “most probable accident
outcome”, which in real life may be a more useful concept.

Page 7
Conceptual confusion on historical events

• When dealing with historical events, the only factual element
is the actual outcome
But that in itself is not very interesting
Focus is on a potential similar future event, which could
escalate into an accident.

• “Similar” is very subjective
Speculation, estimation

• Further question:
Should we assess events or Safety Issues?

2009 Page 8

It is important to realize two things concerning the risk analysis of
historical events: First, the only factual part is the event exactly as it
happened and its actual outcome. Secondly, the historical event in itself is
actually not the real interest of the risk assessment. Factually, if it ended
well, the risk is zero. Furthermore, there is nothing we can do about the
historical event.

Rather, we are interested in the capability of some hazard reflected by the
event to harm us in the future. Such an assessment is no longer based on
factual elements, but on subjective projections. And when one starts
projecting into the future, the whole assessment becomes completely
under-defined. For example, this event took place on an A320: shall we only
consider A320s, or also other aircraft types? This event took place in HKG:
shall we also consider other airports, and which? This took place on a
Sunday…in clear weather…with experienced crew…with one thrust
reverser inop…etc.

These questions are usually not addressed consciously and
systematically, which makes the object of the risk assessment unclear and
the whole assessment deficient. On the other hand, one cannot limit
the study to the event exactly as it happened, because it will never
recur in exactly the same way.

One can even ask the question: should risk assessment only be done on
Safety Issues, because they can be clearly scoped and defined?

Page 8
Further problems

• If your initial “likelihood” is LOW…
When more “similar” events occur, are you going to update
the likelihood of all previous “similar” events to “MEDIUM”?
Which events are “similar” enough?
If even more occur, update all again to “HIGH” likelihood??

• Are you going to sum these event risk values together?
(severity x frequency) x frequency ???
Frequency is counted twice

• How do you estimate the impact of potential extra barriers
(risk controls)?

2009 Page 9

Trying to assess the likelihood when dealing with individual events causes
other problems. Importantly, when an event type becomes more frequent,
one should re-assess the risk of previous events by revising their
likelihood upwards; otherwise their risk level does not reflect the increased
likelihood. Such continuous re-assessment is not feasible in a real-life
context.

Moreover, if one wants to estimate the total risk over an event type (e.g.
TCAS events during approach to LHR), the temptation is to sum together
the risk values of the individual events. If likelihood was one of the two axes
in the initial assessment, likelihood is now being taken into account twice
vs. severity only once. The answer is flawed.
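
The double counting can be made concrete with a small sketch. It assumes, purely for illustration, that the likelihood score given to each event is proportional to how often that event type occurs; the function name and the numbers are hypothetical.

```python
def summed_classic_risk(severity, events_per_year):
    # Assumption for illustration: the likelihood score given to each event
    # is proportional to how often that event type occurs.
    likelihood = events_per_year
    risk_per_event = severity * likelihood
    # Summing over all events of the type multiplies by the frequency again.
    return risk_per_event * events_per_year


print(summed_classic_risk(severity=10, events_per_year=2))  # 40
print(summed_classic_risk(severity=10, events_per_year=4))  # 160
```

Doubling the frequency quadruples the summed value, because frequency enters once through the matrix and once through the summation.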

Another problem is trying to assess the risk reduction impact of barriers
that are not in place, but could be put in place. There is no methodological
guidance for this step, which becomes an extremely subjective estimate.

Page 9
List of problems with older methods

1. Conceptual confusion on historical events
2. Confusion between events and Safety Issues
3. Should not limit thinking to actual outcomes
4. Potential outcomes are very subjective
5. Complexity of real world: makes situation worse
6. Complexity of barriers: difficult to estimate effectiveness
7. Guidance should not link with actual outcome only
8. Guidance should not be too vague either.

2009 Page 10

2. The whole concept of risk assessment of historical events is strange.
We come back to the concept of Safety Issue later.

5. The aviation system with its various actors, technology and variable
conditions is extremely complex.

6. The system of barriers (risk controls) is in itself usually very complex.
Some barriers are vital: their failure makes the whole system fail. Some
others are in series: their failure reduces the safety margin but does not
in itself cause an immediate impact if other barriers are still in place. There
are interactions between barriers, and humans can often bypass or
deactivate barriers.

8. Typically, words like “severe” and “occasional” mean different things to
different people. They are so vague that if they are used in a risk
assessment matrix (without further guidance), the results are not coherent.

Page 10
ARMS Methodology

1. Do we need a New Methodology?

2. The ARMS Mission

3. The two levels of ARMS Deliverables

4. The ARMS Methodology

5. Risk Management in the organizational context

2009 Page 11

Page 11
Airline Risk Mgt Solutions (ARMS) Working Group

• Aim: Significantly improved methodology
• Safety practitioners from airlines and other organizations
• Over 150 man-days of work since Jun-07
• Two levels of deliverables by the end of 2008:
Conceptual methodology: universal
Matrices etc.: customizable at company level

2009 Page 12

Due to the complexity of aviation and the nature of risk assessment, it will
never be 100% scientific and objective, but we are convinced it can be
done significantly better than with existing methods, and that is our aim.

The results are valuable only if they are actually useful in the real-life
operational context. We wanted this methodology to be developed by
operational practitioners, so that almost by definition the result is
pragmatic.

As you can see, we have people from airlines, maintenance organizations,
the ATC domain and other aviation organizations. The resulting
methodology is the fruit of excellent contributions by many people from
various organizations.

The ECAST SMS WG took ARMS as the reference for operational risk
assessment, not trying to duplicate the ARMS work in any way.

Page 12
ARMS Mission Statement

The Mission of the ARMS Working Group is to produce useful and cohesive Operational Risk
Assessment methods for airlines and other aviation organizations and to clarify the related
Risk Management processes.

The produced methods need to match the needs of users across the aviation domain in terms of
integrity of results and simplicity of use; and thereby effectively support the important role that Risk
Management has in Aviation Safety Management Systems.

Through its deliverables, the Working Group also aims at enhancing commonality of Risk
Management methodologies across organizations in the aviation industry, enabling increased
sharing and learning.

In its work, the Working Group seeks contribution from aviation safety experts having knowledge on
the user needs and practical applications of risk management in the operational setting.

The deliverables of the Working Group will be methodology definitions – not necessarily software
tools. The first results will be delivered before 1-Jan-09 after which the potential continuation of the
work will be reviewed.

The results of the Working Group will be available to the whole industry.

2009 Page 13

Page 13
ARMS Methodology

1. Do we need a New Methodology?

2. The ARMS Mission

3. The two levels of ARMS Deliverables

4. The ARMS Methodology

5. Risk Management in the organizational context

2009 Page 14

Page 14
Level 1 deliverable:
Conceptual methodology
On light blue background

2009 Page 15

Perhaps the most important deliverable of the ARMS working group is the
conceptual methodology for Operational Risk Assessment.

This covers the developed concepts, terminology definitions, explanation
of how risk assessment is carried out and the organizational aspects of
risk assessment.

This part of the deliverables should be universally applicable to all aviation
(and similar) organizations.

In this presentation, these deliverables are shown on a light blue
background.

Page 15
Level 2 deliverable:
Example application
On yellow/orange background
A little “C” in the corner reminds that this part may
sometimes be further customized for specific contexts.

2009 Page 16 C
In addition to the conceptual methodology, the ARMS group has developed a
concrete example application, including all necessary matrices and
guidance text.

Most aviation organizations should be able to use this detailed
methodology as such, but it should be expected that some customization may
be preferable or even necessary for some organizations. The working
group gives guidance on how such customization can be done without
compromising the overall methodology.

ARMS deliverables at this detailed level are presented on an orange
background with a “C” in the bottom right corner.

Page 16
ARMS Methodology

1. Do we need a New Methodology?

2. The ARMS Mission

3. The two levels of ARMS Deliverables

4. The ARMS Methodology

5. Risk Management in the organizational context

2009 Page 17

Page 17
Key points of the ARMS Methodology

• Full description of the Risk Assessment Process, step-by-step
Key focus on identifying Safety Issues and risk assessing them

• Initial Risk Classification of incoming safety events (Event
Risk Classification, ERC)
New conceptual instruments for dealing with Risk Assessment
related to historical events

• Safety Issue Risk Assessment (SIRA) method
Extended definition of Risk, incorporating the effect of barriers

• Safety Assessments of “future risks” can be made with the
same SIRA method.
2009 Page 18

Before going into the Terminology and the Methodology itself, here are the
key points of this new Methodology, summarized on one slide.

Page 18
Terminology

• Hazard – Condition, object or activity with the potential of
causing injuries to personnel, damage to equipment or
structures, loss of material, or reduction of ability to
perform a prescribed function. (ICAO)

• Safety Issue is a manifestation of a hazard or
combination of several hazards in a specific context. The
Safety Issue has been identified through the systematic
Hazard Identification process of the organization. A SI
could be a local implication of one hazard (e.g. de-icing
problems in one particular aircraft type) or a combination
of hazards in one part of the operation (e.g. operation to a
demanding airport). (ARMS)

2009 Page 19

In order to speak the same language, we have listed on the next few slides
the terminology definitions used by the ARMS group.

As far as possible, we use existing definitions and avoid making new
ones.

We used ICAO’s definition of Hazard as such.

Safety Issue is a very important concept for us. In everyday language, a
Safety Issue is a safety problem that you have identified as one in your
operation. It is usually the local, specific implication of a generic hazard
(e.g. windshear on approach to HKG) but it could also be a combination of
hazards present at once, e.g. landing at Quito (terrain, short runway,
displaced ILS, tailwind, wet runway, high altitude, etc.).

Why is Safety Issue such an important concept? Two reasons. First of all,
you can do something about Safety Issues. Managing Safety pretty much
equals managing your Safety Issues. Secondly, you can define a Safety
Issue very precisely and therefore carry out a good Risk Assessment
without much room for subjectivity.

Page 19
Terminology

• (Safety) Event
Any happening that had or could have had a safety impact,
irrespective of real or perceived severity (ARMS)

• Undesirable Event (UE): The stage in an accident
scenario where the scenario has escalated so far that
(excluding providence) the accident can be avoided only
if a recovery measure is available and activates. Risk
Controls prior to the UE are part of Avoidance and post-UE
are part of Recovery. (ARMS)

2009 Page 20

An event is basically anything that happened in the operation that at least
potentially could have had some kind of safety implication.

The Undesirable Event (UE) is closely related to the new conceptual
framework of Risk, based on four factors instead of the old two (severity x
likelihood).

The UE is the point at which things start “getting out of hand”. This is the
limit between prevention (prior to the UE) and recovery (after the UE). The
Undesirable Event is therefore more an abstract concept than a
real-life event. It helps to analyze the accident scenarios in a more
systematic manner and to assess the various barriers better.

Page 20
Terminology

RISK

• A state of uncertainty where some of the possibilities
involve a loss, catastrophe, or other undesirable outcome
(Doug Hubbard)

• Probability of an accident x losses per accident (classic
engineering definition)

• The predicted probability and severity, of the
consequence(s) of hazard(s) taking as reference the
potential outcomes. (adapted from ICAO by ARMS)

2009 Page 21

We started with the ICAO definition of Risk, but were forced to modify it a
little.

First of all, as risk is fundamentally “a state of uncertainty”, we did not like
saying, as ICAO does, that “risk is an assessment”.

Secondly, we have discovered that the “worst foreseeable situation” is not
necessarily what you should be looking at in an assessment, so we
replaced those words with “potential outcomes”, which captures the main
point that risk assessment should not be limited to the actual, real
outcomes.

Page 21
Preferred use related to “Risk Controls”

• Synonyms:
Risk Control
Barrier
Protection
Defense
Measures to avoid or to limit the bad outcome; through prevention, recovery,
mitigation. (SHELL)

• Used:
Risk Control
Barrier
Measures to address the potential hazard or to reduce the risk probability or
severity. (ICAO)

• Not used:
Safety Barrier (misleading)
Protection, defense (for harmonization reasons)

2009 Page 22

To harmonize the language, among the several synonyms for “risk
control”, we use “barrier” and “risk control”.

Page 22
Not used due to several meanings

• Threat
Another meaning in the TEM context
Usually the word scenario can be used instead

• Mitigation
Classic= post-accident risk controls
ICAO = all risk controls (prevention, recovery, mitigation)
Used: controlling risks or reducing risks (verbs)
Used: Risk Controls, Barriers (nouns)

2009 Page 23

“Threat” is a difficult word, because it is widely used in classic Risk
Management literature, but has another meaning in Threat and Error
Management. ICAO does not use “threat” in the Risk Mgt context. We
decided to avoid using it, and to try to use “scenario” instead.

“Mitigation” also has two meanings. We try to avoid the word altogether.

Page 23
Process summary – simplified schematic
[Figure: simplified process schematic. Elements shown: Safety Events; Event Risk Classification (risk index matrix); Urgent Actions?; Risk Reduction; Normal; Trend Analysis; Safety Issues; Risk Assessment of Safety Issues.]
2009 Page 24

Let’s now go into the methodology itself. It is important to start from the
overall process. This is a simplified summary.

The starting point is the safety data, which flows in from Hazard
Identification. The incoming elements are typically events. Due to this fact,
and due to the need to screen for item requiring urgent actions, the first
step has to be a quick screening of all incoming events. The purpose is
not a thorough analysis, but only a first-cut classification.

The data flows into a safety database, which is used for trend analysis.
This may lead to actions due to increasing trends, etc, sometimes without
a formal risk assessment. A key step here is to identify the Safety Issues.

The Safety Issues (SI) are then subject to a detailed Risk Assessment.
Safety Issues are no longer single events, but well-defined Issues,
typically highlighted by several events.

Page 24
[Figure: detailed process schematic. Elements shown: Safety Events; Event Risk Classification (ERC) with its risk index matrix; Investigations; All Data; all collected safety data (categorized, with ERC risk index values); Data Analysis (frequencies, trends, identification of Safety Issues); Safety Performance Monitoring; Safety Issue Risk Assessment “SIRA”; Actions to reduce risk.]

2009
2008-J.Nisula/Airbus Page 25

This is a more detailed presentation of the process. The same three
“loops” are visible: one going directly from the Event Risk Classification
(ERC) to investigation and action; a second going from the Database
through Data Analysis and Performance Monitoring to Action; and a third
going to Safety Issue Risk Assessment.

The ERC applies a specific risk assessment developed for historical
events to determine the urgency of associated action and whether the
event requires further analysis or investigation.

The Database has all the safety data in a structured format, enriched by
descriptors covering things like date, a/c type, location, flight phase. But
each event now also has a risk index value coming from the ERC. These
values can be used in statistics. Data Analysis is about looking at the data
with the help of the descriptors, statistical tools and graphs/charts in order
to detect Safety Issues. It is also the basis for monitoring the Safety
Performance.

Identified Safety Issues are risk assessed in the “Safety Issue Risk
Assessment” (SIRA). This will provide risk tolerability information on all
detected risks.

Finally, Risk values and related actions are monitored through the Risk
Register database.

Page 25
Event Risk Classification (ERC)

• All incoming data must be screened in a timely manner:
Urgent actions?
Further investigation / risk assessment necessary?
Just feed into the database?

• Historical Events: use “event-based risk”
Focus on one single event
Likelihood (“frequency”) not considered
Remaining Safety Margin
= Effectiveness of remaining risk controls

• Event-based risk:
How close did it get?
How bad would it have been? If this had escalated into
an accident, what would have been the most
probable accident type?
2009 Page 26

The beginning of the process is the Event Risk Classification.

We spent a lot of time working on how to deal with historical events, and
the concept of risk related to them. The first conclusion, which we hope
makes sense to everybody, is that when dealing with an individual event,
we should not try to estimate its frequency.

When you ask the question: “what really makes an event worrying,
concerning, frightening?”, you realize there are two main factors: how bad
could it have been (as an accident) and how close did it get (to the
accident). The Risk Assessment of historical events is based on these two
dimensions, which translate to more specific questions.

What we are measuring is the risk experienced in the event under study,
that day, in those conditions. This acknowledges that some barriers have
already been breached, and what really matters is the remaining set of
barriers and their effectiveness. This is the Risk we measure with the ERC
matrix, presented on the next slide. If you look at tomorrow, the risk would
be different, because now you would assume all the barriers to be in place,
a priori.

Is the ERC value really a “risk” or just the “severity”? It could be
considered either of the two, but we prefer dealing with it as risk, and this is
fully in line with the Risk definition presented earlier.

Page 26
Event Risk Classification (ERC)
Question 1 (rows): If this event had escalated into an accident, what would have
been the most probable accident outcome?
Question 2 (columns): What was the effectiveness of the remaining barriers between
this event and the most probable accident scenario?

Severity     | Effective | Limited | Minimal | Not effective | Most probable accident outcome
-------------+-----------+---------+---------+---------------+-------------------------------
Catastrophic |    50     |   100   |   500   |     2500      | Loss of aircraft or multiple fatalities (3 or more)
Major        |    10     |    20   |   100   |      500      | 1 or 2 fatalities, multiple serious injuries, major damage to the aircraft
Minor        |     2     |     4   |    20   |      100      | Minor injuries, minor damage to aircraft
Negligible   |                      1                        | No potential damage or injury could occur

• Risk index numbers developed based on accident loss data
• Long evolution of content, tested by several ARMS members
2009 Page 27 C
This is a concrete example of an ERC matrix.

We have guiding questions to take the user through the ERC assessment.
Having only 4 classes on each axis helps to make this assessment easy.
The guidance text for each class can be customized for specific
applications.

One has to keep in mind that the overall purpose is only to make an initial
estimate of the risk, so that the event is classified correctly. This is not the
final risk assessment. This classification should be possible even without
the guiding text, just based on the two questions.

Why is the bottom row just one block? Because if you say that this event
could not have escalated into an accident, then it makes no sense to
estimate the remaining safety margin.
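
The matrix lookup can be sketched in a few lines of Python. This is only an illustration of the example matrix on the slide, not an ARMS tool: the names `BARRIER_LEVELS`, `ERC_MATRIX` and `erc_index` are hypothetical, and the index values are the example ones, which each organization may customize.

```python
# Example ERC matrix from the slide; values are customizable per organization.
# Rows: Question 1 (most probable accident outcome).
# Columns: Question 2 (effectiveness of the remaining barriers).
BARRIER_LEVELS = ("effective", "limited", "minimal", "not effective")

ERC_MATRIX = {
    "catastrophic": (50, 100, 500, 2500),
    "major": (10, 20, 100, 500),
    "minor": (2, 4, 20, 100),
}


def erc_index(outcome, barriers=None):
    """Return the ERC risk index for one historical event."""
    outcome = outcome.lower()
    if outcome == "negligible":
        # Bottom row is a single block: no accident was possible,
        # so the remaining safety margin is not assessed.
        return 1
    if barriers is None:
        raise ValueError("barrier effectiveness is required for this outcome")
    column = BARRIER_LEVELS.index(barriers.lower())
    return ERC_MATRIX[outcome][column]


print(erc_index("Major", "Not effective"))  # 500
print(erc_index("Negligible"))              # 1
```

Keeping the values in a plain table like this makes it easy to swap in customized index values without touching the lookup logic.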

Page 27
Event Risk Classification (ERC) - example

• Maintenance error, reduced braking capability. A single-
aisle aircraft with 110 pax almost overran runway end at
landing. Blown tires.

Question 1 (rows): If this event had escalated into an accident, what would have
been the most probable accident outcome?
Question 2 (columns): What was the effectiveness of the remaining barriers between
this event and the most probable accident scenario?

Severity     | Effective | Limited | Minimal | Not effective | Most probable accident outcome
-------------+-----------+---------+---------+---------------+-------------------------------
Catastrophic |    50     |   100   |   500   |     2500      | Loss of aircraft or multiple fatalities (3 or more)
Major        |    10     |    20   |   100   |      500      | 1 or 2 fatalities, multiple serious injuries, major damage to the aircraft
Minor        |     2     |     4   |    20   |      100      | Minor injuries, minor damage to aircraft
Negligible   |                      1                        | No potential damage or injury could occur

2009 Page 28 C

Let’s use the earlier example.

The most probable accident outcome would have been a low-speed
overrun with injuries but without multiple fatalities. (This is a good example
of why we did not like the risk definition phrasing “worst foreseeable
situation”, which would often be too severe.)

There were no remaining barriers. It was pure luck (or favorable
conditions) that made the plane stop on the runway and not just beyond it. (A
physical net at the end of the runway would have been such an extra barrier,
though.)

This leads you to the red zone of the matrix, with risk index 500.

Page 28
Event Risk Classification (ERC) - RESULT

• Example of results’ meaning:

 Investigate immediately and take action.

 Investigate or carry out further Risk Assessment

 Use for continuous improvement (flows into the Database).

2009 Page 29 C
The first result is the color.
Typical examples of the color’s meaning are presented above. These are
naturally subject to adaptation in each organization.

Page 29
Event Risk Classification (ERC) - RESULT

• The ERC will also produce a numerical Risk Index value for
each event

• The Index is an estimated risk value
Can be used to quantify risk
Useful for summing up risks of similar events and making
statistics
Helps in identifying Safety Issues

• Examples:
Risk per airport
Risk per flight phase
Risk per time of year
Etc.

[Figure: ERC matrix thumbnail with the example index values 50/100/500/2500, 10/20/100/500, 2/4/20/100 and 1]

2009 Page 30 C

The second result is the risk index value.

These values can be used numerically in statistics to quantify risk.

The values (which can naturally be customized) have been derived
semi-scientifically by looking at insurance data on accidents. The data show
that the amount of loss in different categories of accidents was roughly
1:5:25. The objective is also to create roughly exponential scales on both
axes and make sure the ratio between the highest and lowest value
is at least about 1000.

You can ask yourself how many of your least severe events you would
need, to consider their cumulative risk as high as that of one of your most
severe events (fatal accident avoided by pure luck).
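
A quick way to sanity-check a customized matrix against these objectives is sketched below. It assumes the example index values from the ERC slide; the variable names are illustrative only.

```python
# Example index values, one row per severity class,
# ordered from "effective" to "not effective" barriers.
rows = {
    "minor": (2, 4, 20, 100),
    "major": (10, 20, 100, 500),
    "catastrophic": (50, 100, 500, 2500),
}
negligible = 1  # single block for events that could not have become an accident

# Severity step between adjacent rows (a 1:5:25 loss ratio gives a factor of 5).
severity_steps = [
    b / a for a, b in zip(rows["minor"], rows["major"])
] + [
    b / a for a, b in zip(rows["major"], rows["catastrophic"])
]

# Ratio between the highest and lowest index in the whole matrix.
span = max(rows["catastrophic"]) / negligible

# How many least severe events add up to one most severe event.
equivalent_events = max(rows["catastrophic"]) // negligible

print(set(severity_steps))  # {5.0}
print(span)                 # 2500.0
print(equivalent_events)    # 2500
```

With the example values, 2500 of the least severe events carry the same cumulative index as one event in the top right corner of the matrix.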

Page 30
Data Analysis - example

[Chart: Unstabilized approaches per airport. X axis: Airport (AAA, BBB, CCC, DDD, EEE). Left axis: event count and % (0 to 40). Right axis: accumulated ERC index (0 to 3500). Series: Number, Rate, Total ERC.]


2009 Page 31 C
This is an example of Data Analysis (see next slide) and the use of ERC
risk index values.

Just looking at the absolute numbers of events (in this case unstabilized
approaches) can be misleading. Using rates is better, because the data is
normalized based on the exposure data. But still, it is only looking at
frequency of events, not their severity or risk.

By summing the ERC values of the events (in this case per airport), one
gets an estimate of the cumulative risk of these events per airport. This
can give a completely different picture, as the example above illustrates.

Each graph tells a true but different story, and it is important to look at
each one of them.
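
The three views (count, rate, accumulated ERC index) can be computed side by side, as in the sketch below. All event records and exposure figures are invented for illustration; only the idea of summing ERC index values per airport comes from the slide.

```python
from collections import defaultdict

# Hypothetical data for illustration only: (airport, ERC index) per event.
events = [
    ("AAA", 2), ("AAA", 2), ("AAA", 4), ("AAA", 2),
    ("BBB", 100), ("BBB", 20),
    ("CCC", 4), ("CCC", 500),
]
# Hypothetical exposure data: number of approaches flown per airport.
approaches = {"AAA": 2000, "BBB": 500, "CCC": 4000}

count = defaultdict(int)
total_erc = defaultdict(int)
for airport, erc in events:
    count[airport] += 1
    total_erc[airport] += erc

for airport in sorted(approaches):
    # Rate: events per 1000 approaches, normalized by exposure.
    rate = 1000 * count[airport] / approaches[airport]
    print(airport, count[airport], round(rate, 2), total_erc[airport])
```

In this made-up data set, AAA has the most events, BBB the highest rate and CCC the highest accumulated ERC index, so each measure points to a different airport.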

Page 31
[Figure: detailed process schematic. In this version of the figure the first step is labelled Initial Risk Categorization (IRC), the database holds IRC risk index values, and the last step is labelled Safety Issue Risk Assessment - Global Risk Assessment. Other elements shown: Safety Events; Investigations; All Data; all collected safety data (categorized); Data Analysis (frequencies, trends, identification of Safety Issues); Safety Performance Monitoring; Actions to reduce risk.]

2009 Page 32

Data analysis should lead to the identification of Safety Issues, which
can then lead either directly to action or to a Risk Assessment of the
Safety Issue (SIRA).

Page 32
Events vs. Safety Issues

• Risk Management is about managing Safety Issues
You cannot manage (historical) events
A Safety Issue usually links with several events

• Examples (fictitious):
Windshear at approach to XXX
Quality of de-icing in YYY
Operation into ZZZ (high-altitude, short runway, …)
Fatigue on red-eye flights

• You can Risk Assess Safety Issues because you can define &
scope them precisely

2009 Page 33

Safety Issues are the specific implications of various hazards in your
operation, detected through systematic Hazard Identification methods.

They evolve over time: old ones disappear and new ones emerge. For
example, a high fuel price makes companies try to save fuel through new
procedures, which may introduce new Safety Issues.

Safety Issues can be precisely defined, which makes the eventual Risk
Assessment clear, transparent and credible.

Page 33
Conceptual framework for Risk Assessment
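[Diagram: HAZARDS, SI’s on the left (mx, ops, ground, atc, wx) pass through
barriers labelled PREVENT and AVOID to the Undesirable event in the centre,
and from there through barriers labelled RECOVER and MINIMIZE LOSSES to
ACCIDENTS on the right (Air collision, Rwy overrun, Ground collision, CFIT).
Along the bottom, the four factors: HAZARD FREQUENCY, AVOIDANCE BARRIERS,
RECOVERY BARRIERS, ACCIDENT SEVERITY.]

2009 Page 34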

Safety Issues are risk-assessed regularly through SIRA (Safety Issue Risk
Assessment). A vital starting point is a proper conceptual framework for
such an Assessment.

To address the problems of older methods (failure to account for the
impact of barriers, and confusion about what exactly the severity and
frequency refer to), an extended framework was adopted. This framework has
four factors: the frequency of the initial hazard, the effectiveness of the
avoidance barriers, the effectiveness of the recovery barriers, and finally
the accident severity.

For example, the initial hazard could be a maintenance error affecting the
braking system and the accident scenario a runway overrun. The
Undesirable Event is the point in time marking the transition from
Avoidance to Recovery, which in this case could be defined as landing on
a runway where the brakes are needed. Avoidance would be anything
allowing the detection of the maintenance error before the landing, and
recovery would be a safe landing despite the problem (which might be
impossible, i.e. sometimes there are no recovery or avoidance barriers).

In the assessment, each of the four factors is given a qualitative or
quantitative value and the result is compared to risk tolerability criteria.

Page 34
Safety Issue Risk Assessment (SIRA)

• A value is estimated for each of the 4 factors:
Frequency of the initial hazard
Avoidance barriers
Recovery barriers
Severity of the most probable accident outcome

• As a result, we get the acceptability of the risk.

• JAR/FAR 25.1309 is used in building the method, to define
the acceptable combinations of likelihood and accident
outcomes.

2009 Page 35

JAR/FAR-25.1309 says, for example, that catastrophic outcomes are
acceptable only at a probability of 10^-9. This can be used to calibrate
the colors in the matrix, i.e. to calibrate the tolerability of various
combinations of severity and probability.
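A tolerability check of this kind can be sketched as a simple lookup. Only the catastrophic limit (10^-9) is stated in these notes; the other three limits and the "hazardous" class are the figures commonly quoted from the 25.1309 advisory material and are included here as an assumption to be verified against the current text. Those figures are normally expressed per flight hour, whereas the SIRA matrices work per sector, so this is a calibration aid and not the SIRA tolerability matrix itself. The names are hypothetical.

```python
# Largest acceptable probability, as a power of ten, per severity class.
# Only the catastrophic figure (10^-9) comes from the notes above; the other
# three are commonly quoted 25.1309 advisory values, included as an
# assumption to be checked against the current text.
MAX_PROBABILITY_EXPONENT = {
    "minor": -3,
    "major": -5,
    "hazardous": -7,
    "catastrophic": -9,
}


def is_tolerable(severity, probability_exponent):
    """True if 10**probability_exponent does not exceed the severity limit."""
    return probability_exponent <= MAX_PROBABILITY_EXPONENT[severity]


print(is_tolerable("catastrophic", -9))
print(is_tolerable("catastrophic", -7))
```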

The actual method for SIRA can be constructed in many different ways.
The inputs are the values for the four factors, and the output is the risk
level. One can think of a simple Excel application, or an approach based
on two sub-matrices feeding into a final tolerability matrix, as presented
on the next slides.

Page 35
Safety Issue Risk Assessment (SIRA)
1. How frequent is the initial hazard (per sector)?
2. How often do barriers fail to AVOID the Undesirable Event?

          10^-3   10^-2   10^-1   1
10^-4       2       3       4     5
10^-5       1       2       3     4
10^-6       1       1       2     3
10^-7       1       1       1     2

3. How often do barriers fail to RECOVER from the Undesirable Event?
4. Most probable accident scenario

               10^-3   10^-2   10^-1   1
Catastrophic     B       C       D     E
Major            A       B       C     D
Minor            A       A       B     C
Negligible       A       A       A     B

[Final matrix: rows 1 to 5 (result of the first matrix), columns A to E
(result of the second matrix).]

2009 Page 36

The first matrix contains the first two factors (frequency & avoidance
barriers). Here the calculation is done per flight sector, but this aspect can
naturally be customized. The barriers are like filters, through which a
certain fraction of events pass. The task is to estimate whether that
fraction is closer to 1/1000, 1/100, 1/10 or virtually always.

The second matrix uses the same scale for recovery barriers and then
integrates the accident severity. Here the reference is again the most
probable accident scenario. Conceptually, this second matrix is actually
similar to the ERC matrix.

The results of the two matrices are fed to the final matrix, which gives the
tolerability of the risk.
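The two sub-matrices can be expressed as lookup tables. The sketch below uses the cell values shown on the slide; the function and variable names are hypothetical, inputs are given as powers of ten, and the final tolerability matrix is not reproduced because its cell colors are not given in these notes.

```python
# Column order shared by both sub-matrices: barrier failure fraction
# 10^-3, 10^-2, 10^-1, 1 (written as powers of ten).
FAILURE_EXPONENTS = [-3, -2, -1, 0]

# Sub-matrix 1: hazard frequency per sector (power of ten) -> results 1..5.
FREQUENCY_AVOIDANCE = {
    -4: [2, 3, 4, 5],
    -5: [1, 2, 3, 4],
    -6: [1, 1, 2, 3],
    -7: [1, 1, 1, 2],
}

# Sub-matrix 2: severity of the most probable accident -> results A..E.
RECOVERY_SEVERITY = {
    "Catastrophic": ["B", "C", "D", "E"],
    "Major": ["A", "B", "C", "D"],
    "Minor": ["A", "A", "B", "C"],
    "Negligible": ["A", "A", "A", "B"],
}


def sira_cell(freq_exp, avoid_fail_exp, recover_fail_exp, severity):
    """Return the (row, column) to look up in the final tolerability matrix."""
    row = FREQUENCY_AVOIDANCE[freq_exp][FAILURE_EXPONENTS.index(avoid_fail_exp)]
    col = RECOVERY_SEVERITY[severity][FAILURE_EXPONENTS.index(recover_fail_exp)]
    return row, col


print(sira_cell(-5, -2, -1, "Minor"))
```

Values outside the printed ranges raise an error here, which mirrors the limitation of the matrix form: the axes only cover the values shown.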

Page 36
SIRA - Example
Safety Issue:
• Risk of runway overrun at any airport in the current route network
including typical alternate airports
• Due to poor braking caused by maintenance error XYZ
• Applicable to A/C types A, B, C.
• Time period: winter operation 2008-2009.

[The slide shows the two SIRA sub-matrices again: (1) frequency of the
initial hazard per sector against (2) how often barriers fail to AVOID the
Undesirable Event, and (3) how often barriers fail to RECOVER from the
Undesirable Event against (4) the most probable accident scenario.]

2009 Page 37

This example illustrates using the SIRA for the Safety Issue described
above. It cannot be stressed enough that the basis of a good assessment
is a very precise definition of the Safety Issue. Such definitions allow the
assessment to be based on facts, rather than hazy assumptions. For
example, when the applicable airports have been defined, one can work
with actual runway length data.

In this example, it is imagined that the maintenance error happens roughly
once per 10000 sectors, and that there are no reliable avoidance barriers,
because this maintenance error only becomes visible when maximum
braking is used. Recovery is estimated to succeed 9 times out of 10 (1 case
out of 10 fails). The accident is estimated as “major”, i.e. injuries rather
than fatalities.
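The figures in this example can be combined by treating the barriers as filters, so that the three fractions multiply. Working in powers of ten, multiplication becomes addition of exponents. This is a sketch only: it assumes the three factors are independent, and the function name is hypothetical.

```python
def accident_exponent(hazard_exp, avoid_fail_exp, recover_fail_exp):
    """Multiply the three fractions by adding their powers of ten."""
    return hazard_exp + avoid_fail_exp + recover_fail_exp


# Values from the example: hazard once per 10000 sectors (10^-4),
# avoidance barriers always fail (10^0), recovery fails 1 time in 10 (10^-1).
exponent = accident_exponent(-4, 0, -1)
print(f"Estimated accident frequency: 10^{exponent} per sector")
```

Under these assumptions the estimate is roughly one “major” accident per 100000 sectors.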

Page 37
SIRA – Example (cont’d)

[Final tolerability matrix: rows 1 to 5, columns A to E. Color legend:
Stop, Improve, Secure, Monitor, Accept.]

Note:
• Another SIRA application uses Excel instead of the
intermediate matrices.

2009 Page 38

The result on the final matrix would fall in the red area, meaning that this
part of the operation needs to be stopped immediately. This risk cannot be
tolerated at all.

This way of applying the SIRA with matrices is very visual, but introduces
some limitations in the range of values on the axes. For example, how to
cover hazards more frequent than 10^-4? Calibrating the tolerability can
also become quite a demanding exercise, and the result can in some cases
be too conservative compared to JAR/FAR-25.1309.

An Excel-based (or similar) application can be even more flexible for
implementing the SIRA. An example file is under construction.

Page 38
Conceptual difference between ERC and SIRA
[Diagram: the same conceptual framework as on page 34 (hazards and Safety
Issues, PREVENT and AVOID barriers, Undesirable event, RECOVER and
MINIMIZE LOSSES barriers, accidents). ERC is marked at three individual
events in the framework, with the question “How concerning was this
event?”. SIRA covers the framework as a whole, with the question “What is
the risk of this Safety Issue (= these types of events) to our operation
(today, tomorrow)?”]

2009 Page 39

This slide illustrates the conceptual difference between ERC and SIRA.

In ERC, the historical event may (or may not) have reached the level of
the Undesirable Event, or escalated even further. Therefore, some
barriers have typically already been breached. They are history and we do
not care about them. What counts are the barriers that were still in place in
the historical event. We assess the risk present there and then.

In SIRA, we consider today and tomorrow. We assess how high a risk is
presented by a particular Safety Issue. Therefore, a priori, all the barriers
are still in place and not breached. We need to consider the hazards, the
barriers and the accident severity by estimating a value for all of the 4
factors in the SIRA.

Page 39
[Diagram: Hazard Identification data, through RA, gives the Operational
Risk Profile. Planned changes, through RA, give the Associated Risk.]

RA of Future Risks:
• Hazard Analysis: what could go wrong?
• Risk Assess identified threats as Safety Issues

2009 Page 40

Let’s come back to the objectives for the Risk Assessment
methodology for a while.

So far we have seen that the ARMS methodology can digest Hazard
Identification data and transfer it into an Operational Risk Profile. This is
done through plotting Safety Issues on a “risk map” using the SIRA
values, and also based on statistics using the ERC risk index values.

But what about the planned changes and the associated Future Risks?

The ARMS methodology addresses such Safety Assessments too.

Page 40
[Process chart as on page 32 (Safety Events, risk index matrix,
Investigations, All collected safety data with IRC values, Data Analysis,
Safety Issue Risk Assessment, Actions to reduce risk, Safety Performance
Monitoring), with two additions in the bottom left corner: “Plan to make a
significant change” and “Hazard Analysis”, which feeds the Safety Issue
Risk Assessment.]

2008-J.Nisula/Airbus
2009 Page 41

This type of Future Risk assessment starts in the bottom left corner of the
process chart.

The first step is to analyze the hazards associated with the change. There
are various systematic techniques for this. These are beyond the ARMS
process and are not discussed here.

Once the hazards have been identified, they can be formulated as Safety
Issues and fed into the same SIRA as used earlier.

Page 41
ARMS Methodology

1. Do we need a New Methodology?

2. The ARMS Mission

3. The two levels of ARMS Deliverables

4. The ARMS Methodology

5. Risk Management in the organizational context

2009 Page 42

Page 42
Safety Accountability and Safety Delivery
[Organization chart. ACCOUNTABILITY level: Board of Directors, CEO, COO,
Safety Review Board. DELIVERY level: Postholders & Mgt team, Corporate
SAG, Local SAG’s (?). The Safety Mgr and Qty Mgr appear alongside both
levels. Labels on the chart: Safety Assurance and Risk transparency
(accountability side); Safety Management and Risk Analysis (delivery side).]

2009 Page 43

This presentation has focused on the Risk Assessment Methodology. To
close the loop regarding Risk Management, let’s take a brief look into the
organizational aspects of Risk Management.

The fundamental split is between the Top Management and the rest of the
organization – the former having the Safety Accountability and the latter
being responsible for the Safety Delivery.

Risk review and action management at the different levels of the
organization are handled through the Safety Review Board (SRB) and
one or more Safety Action Groups (SAG), sometimes called Safety
Committees.

The Safety Manager is not accountable for the Safety Performance, but is
responsible for the Safety Management System itself.

Page 43
Roles and organization

• Top Management – SAFETY ACCOUNTABILITY
CEO, COO
Safety Review Board (SRB)

Monitoring Safety Performance
Demanding and contributing to high safety performance
Making decisions on what is acceptable in terms of risk and
signing them off
Providing necessary decision power when needed
Contributing to and deploying the Safety Plan (targets)
Participating in safety communications
Providing Safety visibility to the Regulator

2009 Page 44

The quality of Risk-based information greatly influences the ability of the
Top Management to form a reliable overall picture of Operational Risks
and make informed decisions on the acceptability of risks.

The quality of the Risk-based information relies on the data produced by
Hazard Identification and the Risk Assessment Methodology.

Page 44
Roles and organization

• Others – SAFETY MANAGEMENT & DELIVERY
Postholders / Directors:
– Safety responsibility at their level
– Participate in SAG and SRB
Safety Manager:
– Responsible for the Safety Management System
– Expert, gives advice
Quality managers

Hazard Identification
Tools, methods
Risk Assessment
Expertise
Ensuring safety actions
SMS quality and evolution
2009 Page 45

The operational management and other operational people need
information on risks that are present in their work and on risks that they
are responsible for.

Again, the methodology has a high impact on the capability to produce
useful and up-to-date risk information to guide operational people.

Page 45
Conclusion

• This presentation has covered the new Risk Assessment
Methodology created by the ARMS Working Group

• The Methodology has been created by safety practitioners
from various aviation organizations and aims to be pragmatic
and useful, while remaining conceptually robust.

• The Methodology is available to the whole industry and is
hoped to deliver a significant improvement compared to older
methods.

2009 Page 46

Page 46
