The Top 10 Worst-Performing Alarm Systems in Industry
Alarm Management is the current “hot topic” in the process industries. Overloaded and poorly performing DCS alarm systems are common and have been identified as contributing factors in several major accidents including those at BP Texas City in 2005 and Texaco Pembroke refinery in the UK in 1994.

To improve an alarm system, it is essential to perform an initial benchmark. Benchmarking a system has many benefits. It provides a basis to compare a system against industry best practices as well as a reference point to measure improvements at the end of an alarm management project. Other benefits include creating solid data-driven analysis to communicate the state of the alarm system to appropriate stakeholders at a site, and to justify further investment in alarm system improvement. Finally, a significant benefit of a benchmark study is the identification of bad acting alarms. As a standard practice, PAS identifies bad acting alarms in the initial benchmark report. Our experience indicates that breakthrough gains can be realized simply by resolving the bad acting alarms.

There are several different alarm problems to examine, with differing solutions. In performing these analyses, some amazing phenomena have been documented, presented here as examples of how bad things can get.

The solution to alarm problems can be achieved by following the seven-step process developed by PAS. The methodology and all other aspects of alarm management are detailed in The Alarm Management Handbook – A Comprehensive Guide, which is available at www.PAS.com and on www.Amazon.com. The implications of, and solutions to, these problems are presented there in much more detail than can be accomplished in this brief paper. The book is intentionally designed with a very “how-to” focus.

Here are the seven steps to a highly effective alarm management system:

Always-needed initial steps:

Steps to implement based on alarm system performance after the first 3 steps:

Step 4: Perform Alarm Documentation and Rationalization (D&R)
Step 5: Implement Alarm Audit and Enforcement Technology
Step 6: Implement Real Time Alarm Management
– and, of course –
Step 7: Control and Maintain Your Improved System

The following examples are taken from analyses from many different process industries, including Refining, Petrochemical, Power, and Pulp & Paper. The facilities are located all over the world. Of course, the specifics are omitted for confidentiality reasons. In every case, there are many “close runners-up” to the “winners” shown, as these are very common problems, spread throughout the biggest names in the processing industry.

All of the analyses shown are based upon the span of control of a single board operating position. Later, when we mention 3,517 alarms in ten minutes, we will be referring to a quantity of alarms presented to a single person, not multiple people.

And now, the top 10 worst-performing alarm systems.

Number 10: Worst Diagnostic Alarm Percentage

This measure is the extent to which alarms indicating malfunctioning instruments are a percentage of the overall alarm load. A high absolute count of such alarms indicates significant maintenance problems with the instruments. A high percentage of instrument diagnostic alarms indicates that important process alarms are likely to be “buried” in the alarms from the malfunctioning instruments. Our “winner” in this category has both high counts and high percentages – the alarm system is dominated by BAD VALUE Alarms.
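As an illustration of the Number 10 measure, the following minimal Python sketch computes the diagnostic alarm count, its percentage of the annunciated load, and its average per day. The record layout (a plain list of alarm type strings), the function name, and the choice of "BAD VALUE" as the only diagnostic type are assumptions made for this example, not a description of any particular DCS or of PAS software.

```python
# Hypothetical set of alarm types counted as instrument diagnostics.
# Real systems will have their own names for these alarm types.
DIAGNOSTIC_TYPES = {"BAD VALUE"}


def diagnostic_alarm_stats(alarm_types, analysis_days,
                           diagnostic_types=DIAGNOSTIC_TYPES):
    """Summarize instrument diagnostic alarms in an annunciated alarm load.

    alarm_types   -- one alarm type string per annunciated alarm event
    analysis_days -- length of the analysis period in days
    Returns (diagnostic_count, percent_of_load, average_per_day).
    """
    alarm_types = list(alarm_types)
    total = len(alarm_types)
    if total == 0 or analysis_days <= 0:
        return 0, 0.0, 0.0
    diagnostic = sum(1 for t in alarm_types if t in diagnostic_types)
    percent = 100.0 * diagnostic / total
    return diagnostic, percent, diagnostic / analysis_days


if __name__ == "__main__":
    # Made-up sample: 3 diagnostic alarms out of 4 events over 2 days.
    sample = ["BAD VALUE", "BAD VALUE", "PV HIGH", "BAD VALUE"]
    print(diagnostic_alarm_stats(sample, analysis_days=2))
```

A high count with a low percentage points to a maintenance backlog. A high percentage indicates that process alarms are likely being buried, which is the condition this category ranks.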
Worst Diagnostic Alarm Percentage

71% of the entire annunciated alarm system load is from instrument malfunction alarms. They averaged more than 600 such alarms per day during the 24-day analysis.

Figure #10: Worst Diagnostic Alarm Percentage

Commentary and Solutions:

It is surprising to see the number of “bad measurement” alarm events on most systems. These are often in the hundreds or thousands. If the best control engineers in the company had been specifically asked to design instruments that would have such poor performance, it is unlikely that they could have done it! Yet, we find these on almost every system we analyze.

Since no instrument was designed to be in such a state, every one of these situations can be fixed, and should not be “just tolerated” – as is often the case. They are misconfigured in range, in “measurement clamping,” or there is an installation problem (impulse leads filling up, etc.). The original justification for installing a flow meter probably did not include a specification that it was OK if it didn’t work half of the time! If that had been proposed, the money would never have been spent to buy it in the first place.

For example, a typical problem we see involves out-of-range alarms from transmitters. Long ago, the available instrument sensors had a significant tradeoff between accuracy (significant digits) and range; you could obtain high accuracy only over a small range, probably less than the possible variation of the process.

Modern sensors can generally provide all of the accuracy needed over the entire range that the process is likely to vary. But some engineers continue to follow the older configuration practices and do not consider the consequences of generating lots of Bad Value alarms during conditions such as startup and shutdown. The correct practice is to configure the instrument for the entire possible range of the process value under all conditions, then check the accuracy obtained and, if needed, specify a better sensor.

Generally the addition of a new instrument must follow a management-of-change methodology, to ensure it is done properly. So does the removal of an instrument, to ensure that it is truly not needed and the removal is done properly. And functionally, the indefinite toleration of a malfunctioning instrument is the same as removing it. If there is an incident, it will be difficult to explain how a relevant instrument was allowed to malfunction for months – to effectively be removed from service – without the appropriate level of review. This is the stuff of fines and lawsuits.

Number 9: Worst Nuisance Alarm Percentage

We usually find that only 10 to 20 different configured alarms make up from 20% to 80% or more of all alarm events in a system. Those most frequent alarms were not originally designed to annunciate hundreds (or thousands) of times per day, but they do. These alarms are called “nuisance” alarms, as they deliver no value to the operator. In fact, their high rate of occurrence becomes a hindrance to the operator’s ability to identify important alarms during a process upset. Addressing them and making them work properly will substantially improve an alarm system and provide immediate and much-needed relief to the operator, and it is not difficult or very time consuming to do. Our winner in this category:
EMPOWERING PEOPLE. DRIVING ASSETS.
Again, in this system, half of the top 10 alarms are related to instrument malfunction, previously discussed. The others are from a pressure switch, a flow meter, and command-disagree signals from several motor-operated valves. Commonly, we see every possible alarm type in the “top 10.”

Chattering and fleeting alarms are those that appear and clear faster than an appropriate operator action could be applied. The top 10 most frequent alarms usually contain several alarms that chatter or are fleeting. These can be addressed in a variety of ways. First, the requirement to have the alarm, and a proper alarm trip point (relative to the normal variation of the process), should be confirmed. Also needed and effective are proper deadband selection (for an analog signal, or proper “mechanical deadband adjustment” of field switches) and the proper application of alarm delay times. ON-Delay times prevent alarms from being annunciated to the operator until they have been in effect for a certain number of seconds, which can eliminate most chattering or “fleeting” alarms. OFF-Delay times do not delay the initial presentation, but instead turn a string of chattering alarms into a longer-duration single alarm event.

DCS alarm systems are notoriously easy to change, and inadequate control over such changes is common. Security settings in most DCSs are insufficiently granular to allow operators to make the kinds of changes that are needed, yet restrict them from making inappropriate alarm system changes. The response to a nuisance alarm is often to suppress it. We have seen critical alarms disabled for months, with no records, no approvals, no repair efforts, and no other actions taken. Paper-based management-of-change systems are rarely effective. The practice of uncontrolled alarm suppression is highly dangerous, and unfortunately common.

A very common DCS, for example, has a much-abused suppression setting called DISABLE. When this is used, the alarm event is still produced for the electronic journal, but not annunciated to the operator. Thus, you can analyze the count of alarm events that have been suppressed vs. those the operator sees. Our winner in this category is:
Worst Alarm Daily Rates Including Suppressed Alarms

• Worst Average Daily Rate: 26,665 alarms per day (recorded – 1 every 3 seconds)
• Worst Daily Total: 48,803 in one day (>1 every 2 seconds)
• Based on an 18-day analysis period

over a long analysis period. The intent is to show that these rates are not “aberrations” – they are sustained conditions.
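The daily-rate figures in the box above are simple arithmetic on a journal of recorded alarm events, counted whether or not each event was annunciated. The following Python sketch shows one way to derive an average daily rate, a worst daily total, and the corresponding interval between alarms. The input layout (one day label per recorded event) and the function names are hypothetical, chosen only for this example.

```python
from collections import Counter

SECONDS_PER_DAY = 24 * 60 * 60


def daily_rate_summary(event_days, analysis_days):
    """Return (average_per_day, worst_daily_total) for a set of events.

    event_days    -- one day label (e.g. a date) per recorded alarm event,
                     including suppressed events taken from the journal
    analysis_days -- length of the analysis period in days
    """
    per_day = Counter(event_days)
    total = sum(per_day.values())
    worst = max(per_day.values(), default=0)
    return total / analysis_days, worst


def seconds_between_alarms(alarms_per_day):
    """Average spacing, in seconds, between alarms at a given daily rate."""
    return SECONDS_PER_DAY / alarms_per_day


if __name__ == "__main__":
    # Rates quoted in the box above, converted to an interval in seconds.
    print(round(seconds_between_alarms(26_665), 1))  # about 3.2 s per alarm
    print(round(seconds_between_alarms(48_803), 1))  # about 1.8 s per alarm
```

Dividing by the full analysis period, rather than by the number of days that happened to contain events, keeps quiet days from inflating the average.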
Commentary:

Our winner very steadily produced an alarm rate averaging over 1,400 alarms per 10 minutes for a period of 41 hours (over 200,000 in a single day). This is 142 alarms per minute, or about 2.3 per second. In combination with three other nuisance alarms, this system produced a single-day peak of 208,311 alarms, the highest single-day total we have encountered (although not by much).

It was the installation of our PAS alarm analysis software, which successfully captured and analyzed this event data, that identified this problem and led to the proper diagnosis and repair.

• Average: 173,657 alarms per day
• Peak: 180,488 alarms per day (> 2 per second)

Summary: