Understanding Human Downtime
Author: Terry Vergon (Source: http://www.sapientservicesllc.com/Services.html)

In my thirty or so years of working in mission critical facilities, I have studied and investigated many incidents involving human-caused downtime. Most of these incidents fall into five major groupings, all preventable.

Communication Errors

Spoken communication is tough. If you don't believe it, just ask the people working on Siri, Dragon, or other speech recognition software. Local slang, vernacular, pronunciations, and meanings can add confusion and misunderstanding. When I reported to my first submarine in the Navy, announcements were made over the boat's PA system that I didn't understand for a couple of weeks. Abbreviations, local designations, and rapid-fire delivery made them difficult to follow. Another problem I noticed: for those who had been there for some time, the announcements actually faded into the background noise, another very dangerous situation, especially since these were important safety announcements.

Have you ever listened to a song on the radio and later realized the actual lyrics were entirely different from what you thought? Our minds can play tricks on us. Oftentimes, we hear what we want to hear or expect to hear ("Hearing What We Want to Hear," 4/1997, Chenausky). Add to that communications that are not clear, such as using similar-sounding letters like C, B, and D within spoken operational orders, and you start to appreciate the complexities that we inject into our communications. How we communicate can add risk to our operations.

How to prevent: Develop, train, implement, and enforce a formalized spoken communication protocol with mandatory repeat-backs. Eliminate confusing designations; use a phonetic alphabet for your site's letter designations. Provide all personnel with a list of authorized abbreviations for use at the site. Leaders must enforce this policy and practice and set the example themselves.
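As a small illustration of the phonetic-alphabet recommendation, here is a minimal sketch that turns a letter designation into its spoken form. It assumes the ICAO/NATO spelling alphabet (the article only says "a phonetic alphabet"). The function name `spoken_designation` and the equipment name "UPS" are hypothetical examples, not anything from a specific site.

```python
# ICAO/NATO spelling alphabet. Choosing this particular alphabet is an
# assumption; the article only recommends "a phonetic alphabet".
PHONETIC = {
    "A": "Alfa", "B": "Bravo", "C": "Charlie", "D": "Delta", "E": "Echo",
    "F": "Foxtrot", "G": "Golf", "H": "Hotel", "I": "India", "J": "Juliett",
    "K": "Kilo", "L": "Lima", "M": "Mike", "N": "November", "O": "Oscar",
    "P": "Papa", "Q": "Quebec", "R": "Romeo", "S": "Sierra", "T": "Tango",
    "U": "Uniform", "V": "Victor", "W": "Whiskey", "X": "X-ray",
    "Y": "Yankee", "Z": "Zulu",
}


def spoken_designation(equipment: str, letter: str) -> str:
    """Return the spoken form of a lettered unit, e.g. 'UPS' + 'B'."""
    # Only the single-letter suffix is expanded; the equipment name is
    # left as written. Raises KeyError if `letter` is not A-Z.
    return f"{equipment} {PHONETIC[letter.upper()]}"


# Hypothetical units: "UPS B", "UPS C", and "UPS D" sound alike when spoken.
for unit in ("B", "C", "D"):
    print(spoken_designation("UPS", unit))
```

Spoken aloud, "UPS Bravo," "UPS Charlie," and "UPS Delta" are much harder to confuse than the bare letters, which is the point of the recommendation.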
Inattention

Inattention can be caused by many things: fatigue, emotional state, intoxication, medical condition, or local distractions (sounds, sights, other personnel, etc.). Have you ever been driving late at night, felt your head snap up, and realized that you just took a nap at 70 mph? Has your mind ever wandered during a conversation to the point where you literally can't remember what was just said and have to ask the speaker to repeat it?

Our minds naturally wander. It's what we do. Our brains process information much faster than it is received. This gives the brain extra time to access memories, process relationships, and try to make sense of what it just received. Sometimes this processing activity can take control, and we daydream or lose focus. Whatever you call it, it can cause the brain to focus on something other than what is critical for proper operation, communication, or observation. However momentary it may be, it can create real problems.

Page 1 of 3 Date Issued: March 8, 2012 File:87650800.docx

How to prevent: Require the supervisor for each shift/period to assess each member of the operational staff for fitness for duty. This doesn't need to be a formal interview; a lot can be discovered by seeing each person and asking a few questions. Having each person provide a turnover status for their areas at a pre-shift turnover meeting could satisfy this. Send home anyone who isn't ready for the rigors of critical operations.

For activities that are critical to the safety and continued operation of the site, make it a practice that those activities must be performed by two people. This practice is used by the military, the nuclear industry, airlines, and other mission critical environments. Having a second person check actions and read the procedure can prevent mistakes and makes the activity interactive, reducing the risk of inattention.

Documentation Errors
Ever follow your vehicle's GPS to a dead end, or to someplace where you cannot reach your destination because the roads have changed? You have been the victim of a documentation error. It's the same for operations staff who use outdated or incorrect documentation for the activity being performed. Using a drawing that has not been updated since the last system upgrade, or using a procedure that is out of date or simply doesn't work, are examples of documentation errors.

How to prevent: Develop and implement a formalized documentation control program. Do not allow operation of your plant with anything other than controlled documents. Implement a formalized process to validate your procedures. Ensure that sufficient information, in the correct sequence, is provided for the operations team to successfully complete the activity. I recommend formal engineering development of all critical activity procedures and processes.

Incorrect/Missing Labeling

Imagine being in a large steel tank with pipes and literally hundreds of valves. Then imagine water gushing into that tank. Now imagine you have about 2 minutes to discover which valve shuts off the flow of water into the tank before you drown (some of you will recognize this scenario from submarine school training). Oh, by the way, none of the valves are labeled or color coded. It would have been nice to find the valve that said "water shut-off valve." In training, they never make it that simple.

In critical environments, when we are asked to perform an activity, it is vital to know that we are operating the correct valves and switches in accordance with the procedure. The procedures need to have valve/switch designations that exactly match the labeling in the plant. I have seen a simple cable labeling error shut down an operating nuclear power plant.
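The requirement that procedure designations exactly match plant labeling can be checked mechanically. The following is a minimal sketch, assuming both the procedure's component designations and the installed label text are available as plain lists of strings. The function name and the sample designations are hypothetical and serve only as illustration.

```python
def find_label_mismatches(procedure_designations, plant_labels):
    """Return designations used in a procedure that have no exact match
    among the labels installed in the plant (comparison is character
    for character, with no normalization)."""
    labels = set(plant_labels)
    return [d for d in procedure_designations if d not in labels]


# Hypothetical data: the second installed label uses the letter "O"
# where the procedure uses the digit "0".
procedure = ["CW-V-101", "CW-V-102", "SW-BKR-7"]
installed = ["CW-V-101", "CW-V-1O2", "SW-BKR-7"]

mismatches = find_label_mismatches(procedure, installed)
print(mismatches)
```

The comparison is deliberately strict. A near match is exactly the kind of discrepancy an operator under time pressure may not notice, so it is reported instead of being silently tolerated.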
How to prevent: Implement a plant-standardized labeling and color-coding program. If you have a procedure to operate it, it needs to be labeled or coded as it appears in the procedure. A great practice is to provide a schematic of electrical circuits or flow path on the equipment itself to aid in operator understanding. Properly done, every switch, valve, or operator will be uniquely labeled and easily understood.

Lack of System/Process Understanding
While working at a laboratory, I observed a lab technician recount a standard sample several times. She stated that she sometimes had to recount the standard as many as seven to ten times to get an acceptable reading from the radiation measuring device. It was about then that I administratively shut down the lab. The lab technician was invalidating a statistical process meant to verify that the radiation measuring device was operating correctly! Recounting until a reading happens to fall in the acceptable range defeats the check, because even a malfunctioning instrument will eventually produce a passing result by chance. The ramification was that the lab was potentially releasing radioactive materials to the general public. You can imagine the response that this caused. Every sample that this machine was used on had to be re-analyzed, the public was notified, and the incident literally made the evening's national news, all because of a technician's lack of understanding of the process.

You can have operators following procedures verbatim, but if they don't understand the expected system responses, they can misinterpret what is happening, with resultant downtime or worse. Incidents that to some degree fall into this category are Three Mile Island, Bhopal, and the Challenger disaster.

How to prevent: Training, training, training. This solution is not easy to implement with constrained budgets, limited training resources, and limited time, but there is no other way to fix this. There are some methods to stretch your training resources, but it must be done one way or another. The training program needs to include some form of refresher or re-certification process, along with lessons learned from plant operational experience.

I hope that this article provides some insight into human-caused downtime incidents. The prevention methods that I have listed have been used for years and proven in many mission critical environments to prevent human-caused incidents. I hope that you can use some of this in your facilities, and I'm always open to new ideas on how to prevent human-caused incidents.
Posted by Terry's Blog at 11/18/2011 3:00 AM
Categories: Incidents/Downtime
Tags: downtime Mission Critical communications human caused downtime incidents documentation labeling training