IBTM553 Lecture 5 2007 PDF
Course Program Fault Tree Basics Event Tree Methodology Integrated FT/ET Analysis Conclusions
Course Program
Lecture  Date    Room  Contents
1        Sep-8   2464  Overview of course contents and schedule; definition of hazard and risk; risk management principles
2        Sep-15  2464  Individual risk, collective risk and societal risk; risk management indices; background risk; risk acceptance criteria
3        Sep-22  2464  Risk Assessment I: Hazard analyses: techniques and applications, risk matrix
4        Sep-29  2464  Risk Assessment II: Concept of frequency, probability and uncertainty; Bayesian data update
5        Oct-6   2464  Risk Assessment III: Quantitative risk assessment, fault tree, data analysis, event tree
6        Oct-13  2464  Mid-term exam
7        Oct-20  2464  Cost/risk-benefit analysis I
Recap of Lecture-3
Objective
Understand the differences between hazard and risk
Be familiar with typical hazard identification tools
Be able to apply the concept of risk matrices
Application of Worksheets
Identify hazard scenarios systematically
Use systems, subsystems, job steps, etc.
Mitigation Measures in order of adoption
(a) eliminate the hazard (e.g., physically remove the hazard, design change);
(b) substitute the hazard with a safe alternative (e.g., replace a hazardous material with a safe material);
(c) prevent exposure of personnel to the hazard;
(d) use active and/or passive safeguards, and minimise failure of safeguards with redundancy (e.g., install safety barriers or warning devices) and/or special procedures and administrative control;
(e) use personal protective equipment;
(f) develop a response plan to reduce the consequence;
(g) conduct focused training to improve the competency of staff and reduce human errors (this should not be the only control measure for high-risk hazards); and
(h) accept the hazard and monitor it continuously (this should not be the only control measure for high-risk hazards).
Filling the Worksheet
Be creative but realistic and comprehensive
Each row is one scenario that gives one set of H/M/L or F/S/R; if you lump different scenarios together, it is hard to justify the H/M/L or F/S/R
Describe each scenario clearly enough that you and others can still understand it years later
Show existing and proposed control measures, and residual risks
Assigning Risk Matrices
You must tell people which risk matrix you are using
Be consistent:
If you use a simple H/M/L type matrix, do not use Frequency and Severity classes
If you use Frequency and Severity classes, you must show F/S/R explicitly for each scenario
If your tables have numeric scores, you must show the scores
Common Mistakes
Mixing up risk matrices; if you use F/S/R, you must show all 3 values
Not showing the residual risk
Mixing up potential causes and hazard scenarios; not following Notes 3B
Scenario description not concise, not comprehensive
Providing PPE is not the best bet

Worksheet columns: Hazard ID | Hazard Description | Potential Cause | Consequence | Existing Control Measure | Original Risk (F, C, R) | Proposed Control Measure | Residual Risk (F, C, R) | Comment

Example entry:
Scenario: The worker dropped the electrical drill into water due to carelessness, resulting in electrical shorts to the metal ladder
Existing Control Measure: none
Original Risk: F = 2, C = 4, R = A
Proposed Control Measure: Drain water before work; use a rechargeable drill if possible, or a GFCI-protected circuit
Residual Risk: F = 4, C = 2, R = C
Recap of Lecture-4
Objectives
Understand the basic concept of probability and frequency Understand the different types of uncertainties in risk analyses Be familiar with Bayes Theorem
Probability - Basic Concept
Sample Space
The set of all possible outcomes within the problem area
Everything else is irrelevant
Events
Outcomes, or trials
Combinations of decisions and outcomes
Throw a die: {1}, {2}, {3}, {4}, {5}, {6}
2 children in a family: {B1,G2}, {B1,B2}, {G1,B2}, {G1,G2}
The Laws of Probability
If A and B are not mutually exclusive: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
The Laws of Probability: Law of Multiplication
What is the probability that both A and B occur together?
$P(A \text{ and } B) = P(A \cap B) = P(B|A)\,P(A)$
where $P(B|A)$ is the probability of B conditioned on A; $\cap$ is "and", | is "given"
If A and B are statistically independent: $P(B|A) = P(B)$ and then $P(A \cap B) = P(A)\,P(B)$
Most people would only remember $P(A \cap B) = P(A)\,P(B)$, which holds only when A and B are independent
Conditional Probability and Bayes Theorem
Conditional probability is DEFINED as:
$P(A|B) = \dfrac{P(A \cap B)}{P(B)}$
Probability of Event A, given Event B, is the probability of Event (A and B) divided by the probability of Event B alone
Equivalently, $P(A \cap B) = P(A|B)\,P(B) = P(B|A)\,P(A)$
where $\cap$ is "and", | is "given"
Bayes Theorem for a Given Parameter
The prior is the probability of the parameter and represents what was thought before seeing the data
The likelihood is the probability of the data given the parameter and represents the data now available
The posterior represents what is thought given both prior information and the data just seen
Bayes' Theorem relates the conditional density of a parameter (posterior probability) with its unconditional density (prior, since it depends on information present before the experiment):
$p(\theta_j \mid \text{data}) = \dfrac{p(\text{data} \mid \theta_j)\,p(\theta_j)}{\sum_i p(\text{data} \mid \theta_i)\,p(\theta_i)}$
[Figure: two-children example. Surviving labels: B1, B2; "P(B | A) ="; 1/4; note "C D is the same as G1 G2"]
Problem Solving #3
In a bolt factory, machines A, B and C manufacture, respectively, 25, 35 and 40 percent of the total output. Of their output, 5, 4 and 2 percent, respectively, are defective bolts. A bolt is drawn at random from the output and is found to be defective. What are the probabilities that it was manufactured by machine A, B or C?
Machine  Output         Defective
A        P(A) = 0.25    P(d|A) = 0.05
B        P(B) = 0.35    P(d|B) = 0.04
C        P(C) = 0.40    P(d|C) = 0.02
$P(A|d) = \dfrac{P(d|A)\,P(A)}{P(d)}$
$P(d) = P(d,A) + P(d,B) + P(d,C)$
$= P(d|A)\,P(A) + P(d|B)\,P(B) + P(d|C)\,P(C)$
$= 0.05 \times 0.25 + 0.04 \times 0.35 + 0.02 \times 0.40 = 0.0345$
P(d|A) P(A) = 0.05 × 0.25 = 0.0125
P(A|d) = 0.0125/0.0345 = 0.36
P(B|d) = 0.014/0.0345 = 0.41
P(C|d) = 0.008/0.0345 = 0.23
P(A|d) + P(B|d) + P(C|d) = 1
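The bolt-factory calculation can be checked with a few lines of Python. This is a minimal sketch using the numbers from the problem statement; the function name `posterior` is just an illustrative choice.

```python
# Bayes' theorem for the bolt-factory problem (numbers from the problem statement).
priors = {"A": 0.25, "B": 0.35, "C": 0.40}        # P(machine)
defect_rates = {"A": 0.05, "B": 0.04, "C": 0.02}  # P(d | machine)

def posterior(priors, likelihoods):
    """Return (P(machine | d) for every machine, P(d))."""
    joint = {m: likelihoods[m] * priors[m] for m in priors}  # P(d, machine)
    p_d = sum(joint.values())                                # law of total probability
    return {m: joint[m] / p_d for m in joint}, p_d

post, p_defective = posterior(priors, defect_rates)
print(f"P(d) = {p_defective:.4f}")
for machine, p in post.items():
    print(f"P({machine}|d) = {p:.2f}")
```

Machine B is the most likely source of a defective bolt even though machine A has the highest defect rate, because B produces a larger share of the output.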
Law of Total Probability
$B_1, B_2, \dots, B_n$ = mutually exclusive and exhaustive events.
$P(A) = \sum_{i=1}^{n} P(A|B_i)\,P(B_i)$
Given: $P(B_i)$, $P(A|B_i)$, $i = 1, 2, \dots, n$
Want: $P(B_i|A)$
$P(B_i|A) = \dfrac{P(A|B_i)\,P(B_i)}{\sum_{j=1}^{n} P(A|B_j)\,P(B_j)}$
Lecture-5
Learning Objectives
Suggested Readings:
Lecture-5 Supplementary Notes
Series System
The system is operative if every component is operative.
$A_i$ = Component i is operative. A = The system is operative.
Assume: $A_1, \dots, A_n$ are mutually independent
$P(A) = P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$
Note: $P(A) < \min\{P(A_1), \dots, P(A_n)\}$ (for two or more components, each with $0 < P(A_i) < 1$), i.e., the system is weaker than its weakest link.
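Parallel System
The system is operative if any one component is operative.
$A_i$ = Component i is operative. A = System is operative.
$A_1, A_2, \dots, A_n$ are mutually independent.
$P(\bar{A}) = P(\bar{A}_1 \cap \bar{A}_2 \cap \dots \cap \bar{A}_n) = P(\bar{A}_1)\,P(\bar{A}_2) \cdots P(\bar{A}_n) = (1 - P(A_1))(1 - P(A_2)) \cdots (1 - P(A_n))$
$P(A) = 1 - \prod_{i=1}^{n} (1 - P(A_i))$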
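The series and parallel formulas can be sketched in Python as follows. The component reliabilities are hypothetical values chosen only for illustration, and independence of the components is assumed, as on the slides.

```python
from math import prod

def series_reliability(p):
    """P(system operative) when every independent component must operate."""
    return prod(p)

def parallel_reliability(p):
    """P(system operative) when any one independent component suffices."""
    return 1.0 - prod(1.0 - pi for pi in p)

# Hypothetical component reliabilities, not taken from the lecture.
components = [0.90, 0.95, 0.99]

r_series = series_reliability(components)
r_parallel = parallel_reliability(components)
print(f"series:   {r_series:.5f}")    # below the weakest component (0.90)
print(f"parallel: {r_parallel:.5f}")  # above the strongest component (0.99)
```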
[Figure: network of components between terminals a and b]
The system operates if there is at least one path of operating components from a to b.
$R_i$ = P{Component i operates}
$R$ = P{System operates} = $1 - (1 - R_1 R_4)(1 - R_3 R_5)$
Many symbols and styles, we stay with the simple ones here
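[Figure: fault tree with TOP event and basic events A (0.1), B (0.1), C (0.1), D (0.2)]
Calculations
Exact:
$P(\text{Top}) = P(A) \times P(B) \times [P(C) + P(D) - P(C) \times P(D)]$
$P(\text{Top}) = 0.1 \times 0.1 \times (0.1 + 0.2 - 0.1 \times 0.2) = 0.0028$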
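As a cross-check on the exact calculation, the top-event probability can be obtained by enumerating every combination of basic-event states. This sketch assumes the tree structure implied by the formula, TOP = A AND B AND (C OR D), and independent basic events; the function names are illustrative.

```python
from itertools import product

# Basic-event probabilities from the slide.
p = {"A": 0.1, "B": 0.1, "C": 0.1, "D": 0.2}

def top_event(s):
    """Assumed structure: TOP = A AND B AND (C OR D)."""
    return s["A"] and s["B"] and (s["C"] or s["D"])

def top_probability(p, structure):
    """Sum the probabilities of all basic-event states in which the top event occurs."""
    names = list(p)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        s = dict(zip(names, states))
        if structure(s):
            weight = 1.0
            for n in names:
                weight *= p[n] if s[n] else 1.0 - p[n]
            total += weight
    return total

p_top = top_probability(p, top_event)
print(f"P(Top) = {p_top:.4f}")
```

Enumeration scales as 2^n, so it is practical only for small trees; it is useful here mainly as an independent check on the closed-form result.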
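[Figure: fault tree reduced by Boolean algebra. Surviving event labels: pump fails; regulator fails; fire; process temp increases; Alarms Fail. Gate expressions, from the bottom up:]
(D + E)
F(D + E)
B + C
B + C + F(D + E)
(B + C + F(D + E))A
Expanded top event: AB + AC + AFD + AFE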
Measurement of Likelihood
Typically use generic frequencies or rates
Should use specific data (past failure records) with consideration of generic data
Can use expert judgment for rare events; must handle degree of belief, i.e., uncertainties
Can be a discrete value (like those in a risk matrix) or a continuous function
Frequency
Frequency is a measure of the rate of occurrence. E.g., the failure rate of a pump is 6.2 × 10^-3/hr
Frequency data are based on statistics with consideration of uncertainties (probability); e.g., the failure rate of a pump is 6.2 × 10^-3/hr, but it could be:

Frequency         Fraction   Product
1.0 × 10^-4/hr    0.2        2.0 × 10^-5/hr
2.0 × 10^-3/hr    0.5        1.0 × 10^-3/hr
3.2 × 10^-3/hr    0.2        6.4 × 10^-4/hr
4.5 × 10^-2/hr    0.1        4.5 × 10^-3/hr
Sum:                         6.2 × 10^-3/hr
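The pump failure rate quoted above is the probability-weighted mean of the candidate rates. A minimal sketch of that arithmetic, using the table values (variable names are illustrative):

```python
# Candidate failure rates (per hour) and the fraction (weight) assigned to each.
rates = [1.0e-4, 2.0e-3, 3.2e-3, 4.5e-2]
fractions = [0.2, 0.5, 0.2, 0.1]

# Each product is one row of the "Product" column; their sum is the mean rate.
products = [r * f for r, f in zip(rates, fractions)]
mean_rate = sum(products)

for r, f, prod_ in zip(rates, fractions, products):
    print(f"{r:.1e}/hr x {f:.1f} = {prod_:.1e}/hr")
print(f"mean failure rate = {mean_rate:.1e}/hr")
```

The unrounded sum is 6.16 × 10^-3/hr, which the slide reports as 6.2 × 10^-3/hr.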
Event Trees
Use inductive logic to postulate and quantify accident scenarios or accident sequences
Start with the initiating event and follow through the scenario to identify possible scenarios which need to be managed
Event trees should be used to display the progression of an accident
A typical event tree in a nuclear power plant risk analysis may generate millions of accident sequences
[Figure: event tree with end states SAFE and DAMAGE (Damage State)]
Event Tree
Event headings are usually states of systems, functions of safety barriers, or actions or events that can alter the course of the accident scenario
It is easier if you put key actions first
Event trees and fault trees are interchangeable in most cases
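Example
[Figure: event tree for a fire scenario; the branch layout did not survive extraction]
Event headings: Fire, Detector exists, Detector works, Detector noticed, Escape, Rescued, Consequence, Probability
Surviving branch probabilities: Y 0.2 / N 0.8; Y 0.99 / N 0.01; Y 0.9 / N 0.1; Y 0.5 / N 0.5; Y 0.2 / N 0.8
Surviving sequence probabilities: 0.1782 (= 0.2 × 0.99 × 0.9), 0.0099 and 0.0099 (= 0.2 × 0.99 × 0.1 × 0.5)
Surviving consequence label: OK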
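The sequence probabilities in the fire example are products of the conditional branch probabilities along each path. The sketch below reproduces the three surviving values; the pairing of branch values with headings is an assumption read off the garbled slide, and the sequence names are illustrative.

```python
from math import prod

# Branch probabilities along each path. The mapping to headings is assumed:
# detector exists (0.2), detector works (0.99), detector noticed (0.9 / 0.1),
# followed by a 0.5 / 0.5 branch on the "not noticed" path.
sequences = {
    "exists, works, noticed": [0.2, 0.99, 0.9],
    "exists, works, not noticed, next branch Y": [0.2, 0.99, 0.1, 0.5],
    "exists, works, not noticed, next branch N": [0.2, 0.99, 0.1, 0.5],
}

def sequence_probability(branch_probs):
    """Probability of one sequence: product of its conditional branch probabilities."""
    return prod(branch_probs)

for name, probs in sequences.items():
    print(f"{name}: {sequence_probability(probs):.4f}")
```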
Another example
Pressure Tank
Again, Not a good practice
Steps In A QRA
Typically, a PRA consists of four steps:
Risk identification
Risk evaluation
Risk management
Risk communication
The four steps are equally important and are often applied iteratively in phases
These generic steps are applicable to ALL risk assessments
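[Figure: integrated fault tree / event tree model of a PLANT. Fault tree labels: Valve Failure; Pump Failure; Test & Maintenance Unavailability; Failure to Start or STBY Failure Rate; Failure to Run; values shown: 0.1 and 0.3. Event tree headings: Event B ... Event N, with Success/Fail branches and End States S1, S2, ..., SN-1, SN]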
Event trees are used to postulate accident sequences and quantify the frequency of each sequence
$F_{S|IE_i}$ are conditional probabilities quantified by fault tree analysis or engineering calculations
Basic Event a
Basic Event b
Basic Event n
The Consequence is assessed by the consideration of the failure scenario. May not be as simple as Safe/Unsafe. Can be many states of failure
System Level Analysis (Event Tree and Fault Tree Analysis, etc.)
Event B
Event N
QRA Applications
Success Fail
System and Subsystem Level Analysis (Fault Tree Analysis, SCA, etc.)
Basic Event a
Basic Event b
Basic Event n
Data Analysis (initiating event frequency, component failure rate and consequence modelling)
Reliability Test Data Supplier Data KCRC Data HK Data Generic Railroad Data Expert Judgment
Readings
Supplementary Notes Practice problems
END
Practice Problems:
(1) Draw a fault tree with top event "No Output from Reactor". Assume that either Compressor I or II can supply an adequate amount of compressed air to the Drier.
(2) Draw a fault tree with top event "Latch does not Trip". Assume that the Latch will trip if the linkage can be driven by either hydraulic system. The linkage may break.
(5) Write down the probability value of each sequence, given that the failure probabilities of Systems B, C, and D are PB, PC and PD, respectively.
B
A
Success
D
1 2 3 4
Failure
(2)
(3)
No Output from System
(4)
(6). If you can finish Questions 1 to 5, you should know the answer for Question 6!
Fault Tree Analysis (FTA) is another technique for reliability and safety analysis. Bell Telephone Laboratories developed the concept in 1962 for the U.S. Air Force for use with the Minuteman system. It was later adopted and extensively applied by the Boeing Company. Fault tree analysis is one of many symbolic "analytical logic techniques" found in operations research and in system reliability. Other techniques include Reliability Block Diagrams (RBDs).
Fault tree diagrams (or negative analytical trees) are logic block diagrams that display the state of a system (top event) in terms of the states of its components (basic events). Like reliability block diagrams (RBDs), fault tree diagrams (FTDs) are also a graphical design technique, and as such provide an alternative methodology to RBDs. An FTD is built top-down and in terms of events rather than blocks. It uses a graphic "model" of the pathways within a system that can lead to a foreseeable, undesirable loss event (or a failure). The pathways interconnect contributory events and conditions, using standard logic symbols (AND, OR, etc.). The basic constructs in a fault tree diagram are gates and events, where the events have the same meaning as a block in an RBD and the gates are the conditions.
The most fundamental difference between FTDs and RBDs is that in an RBD one is working in the "success space", and thus looks at system success combinations, while in a fault tree one works in the "failure space" and looks at system failure combinations. Traditionally, fault trees have been used to assess fixed probabilities (i.e. each event that comprises the tree has a fixed probability of occurring) while RBDs may have included time-varying distributions for the success (reliability equation) and other properties, such as repair/restoration distributions.
Drawing Fault Trees: Gates and Events
Fault trees are built using gates and events (blocks). The two most commonly used gates in a fault tree are the AND and OR gates. As an example, consider two events (or blocks) comprising a Top Event (or a system). If occurrence of either event causes the top event to occur, then these events (blocks) are connected using an OR gate. Alternatively, if both events need to occur to cause the top event to occur, they are connected by an AND gate. As a visualization example, consider the simple case of a system comprised of two components, A and B, where a failure of either component causes system failure. The system RBD is made up of two blocks in series (see RBD configurations), as shown next:
The fault tree diagram for this system includes two basic events connected to an OR gate (which is the "Top Event"). For the "Top Event" to occur, either A or B must happen. In other words, failure of A OR B causes the system to fail.
Relationships Between Fault Trees and RBDs
In general (and with some specific exceptions), a fault tree can be easily converted to an RBD. However, it is generally more difficult to convert an RBD into a fault tree, especially if one allows for highly complex configurations. The following table shows gate symbols commonly used in fault tree diagrams and describes their relationship to an RBD. (The term "Classic Fault Tree" refers to the definitions as used in the Fault Tree Handbook (NUREG-0492) by the U.S. Nuclear Regulatory Commission.)
Table 1: Classic Fault Tree Gates and their Traditional RBD Equivalents

Name of Gate: AND
Description: The output event occurs if all input events occur.
RBD Equivalent: Simple Parallel Configuration

Name of Gate: OR
Description: The output event occurs if at least one of the input events occurs.
RBD Equivalent: Series Configuration

Name of Gate: Voting OR (k-out-of-n)
Description: The output event occurs if k or more of the input events occur.
RBD Equivalent: k-out-of-n Parallel Configuration

Name of Gate: Inhibit
Description: The output event occurs if all input events occur and an additional conditional event occurs.
RBD Equivalent: Simple Parallel Configuration of all the events plus the condition

Name of Gate: Priority AND
Description: The output event occurs if all input events occur in a specific sequence.
RBD Equivalent: Standby Parallel Configuration (without a quiescent failure distribution)

Name of Gate: Dependency AND
Description: The output event occurs if all input events occur, however the events are dependent, i.e. the occurrence of each event affects the probability of occurrence of the other events.

Name of Gate: XOR
RBD Equivalent: Cannot be represented and does not apply in terms of system reliability. In system reliability, this would imply that a two-component system would function even if both components have failed.
Table 2: RBD Constructs without a Traditional Fault Tree Equivalent

Function: Event dependency (load sharing)
Description: Allows for modeling event dependency (or load sharing). The output event occurs if all input events occur, however the events are dependent, i.e. the occurrence of each event affects the probability of occurrence of the other events.

Function: Standby redundancy
Description: Standby redundancy configurations consist of items that are inactive and available to be called into service when/if the active item fails (i.e. on standby). Items on standby can also fail (quiescent) while waiting to switch.
FTA Equivalent: A Priority AND gate can be used. However, this does not account for quiescent failure probabilities.
Table 3: Traditional Fault Tree Gates without an RBD Equivalent

Name of Gate: XOR
Description: The output event occurs if exactly one input event occurs. In a two-component system the event does not occur if both or none of the inputs occur. When modeling system reliability, this implies that the system is successful if none of the components fail or if all of the components fail.
RBD Equivalent: Cannot be represented and does not apply in terms of system reliability. In system reliability, this would imply that a two-component system would function even if both components have failed.
Events
The gates in a fault tree are the logic symbols that interconnect contributory events and conditions. An event (or a condition) block in a fault tree is the same as a standard block in an RBD, in that it can have a probability of occurrence (or a distribution function). However, unlike traditional RBDs, where a single graphical representation is utilized to represent the block (or event), fault trees use several graphical block representations. Table 4 discusses these graphical representations.
Table 4: Traditional Fault Tree Event Symbols and their RBD Equivalents

Primary Event Block: Basic Event
Description: A basic initiating fault (or failure event).
RBD Equivalent: Block

Primary Event Block: External Event (House Event)
Description: An event that is normally expected to occur. In general, these events can be set to occur or not occur, i.e. they have a fixed probability of 0 or 1.
RBD Equivalent: Block

Primary Event Block: Undeveloped Event
Description: An event which is not developed further. It is a basic event that does not need further resolution.
RBD Equivalent: Block

Primary Event Block: Conditioning Event
Description: A specific condition or restriction that can apply to any gate.
RBD Equivalent: Block. Placement of the block will vary depending on the gate it is applied to.
Table 5: Additional Fault Tree Constructs and their RBD Equivalents

Primary Event Block: Transfer
Description: Indicates a transfer continuation to a sub tree.
RBD Equivalent: Subdiagram Block
Example 1 A fault tree diagram with a Voting Gate and the RBD equivalent.
Example 2 Fault Trees and Complex RBDs: The best example of a complex reliability block diagram is the so called "bridge." The following RBD represents such a bridge.
Representation of this bridge as a fault tree diagram requires the utilization of duplicate events, since gates can only represent components in series and parallel. An inspection of this system reveals that any of the following failures will cause the system to fail:
Failure of components 1 and 2.
Failure of components 3 and 4.
Failure of components 1 and 5 and 4.
Failure of components 2 and 5 and 3.
In probability terminology, we have: (1 And 2) Or (3 And 4) Or (1 And 5 And 4) Or (2 And 5 And 3).
These sets of events are also called minimal cut sets. It can now be seen how the fault tree can be created by representing the above set of events in the following fault tree.
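Given the four minimal cut sets listed for the bridge, the top-event probability can be sketched with inclusion-exclusion over the cut sets. The component failure probability of 0.1 is a hypothetical value chosen for illustration, and the components are assumed independent; the function name is illustrative.

```python
from itertools import combinations
from math import prod

# Minimal cut sets of the bridge, as listed in the text.
cut_sets = [{1, 2}, {3, 4}, {1, 4, 5}, {2, 3, 5}]

# Hypothetical failure probability for each component (not from the source).
q = {i: 0.1 for i in range(1, 6)}

def top_event_probability(cut_sets, q):
    """Exact P(at least one cut set fully failed), by inclusion-exclusion."""
    total = 0.0
    for k in range(1, len(cut_sets) + 1):
        sign = 1.0 if k % 2 else -1.0
        for combo in combinations(cut_sets, k):
            events = set().union(*combo)  # components that must all fail
            total += sign * prod(q[e] for e in events)
    return total

p_fail = top_event_probability(cut_sets, q)
p_upper = sum(prod(q[e] for e in cs) for cs in cut_sets)  # first-order terms only
print(f"exact:       {p_fail:.5f}")
print(f"upper bound: {p_upper:.5f}")
```

Duplicate components across cut sets (for example component 1 in two of them) are handled by taking the set union before multiplying, which is why the cut-set probabilities cannot simply be combined as if they were independent.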
Conversion of the above fault tree to an RBD (note that components with same name are mirrored blocks).
Fault Tree Analysis (FTA) is well recognized worldwide as an important tool for evaluating safety and reliability in system design, development, and operation. For more than 40 years, FTA has been used in the aerospace, nuclear, and transportation industries to translate the failure behavior of a system into a visual diagram that displays system relationships and root cause failure paths. A fault tree provides a concise, visual representation of the various combinations of possible occurrences within a system that can result in a predefined and undesirable event. FTA is most often used for:
Identifying safety critical components.
Verifying product requirements.
Certifying product reliability.
Assessing product risk.
Investigating accidents/incidents.
Evaluating design changes.
Displaying the causes and consequences of events.
Identifying common-cause failures.
FTA is a deductive analysis method that begins with a general conclusion (a system-level undesirable event) and then attempts to determine the specific causes of this conclusion. Based on a set of rules and logic symbols from probability theory and Boolean algebra, FTA uses a top-down approach to generate a logic model that provides for both qualitative and quantitative evaluation of system reliability. The undesirable event at the system level is referred to as the top event. It generally represents a system failure mode or hazard for which predicted availability data is required. The lower level events in each branch of a fault tree are referred to as basic events. They represent hardware, software, and human failures for which the probability of failure is given based on historical data. Basic events are linked via logic symbols (gates) to one or more undesirable top events.
Computerized FTA
Small fault trees have fewer than 100 events, medium fault trees have from 100 to 1,000 events, and large fault trees have more than 1,000 events! Today, computerized FTA can be used to analyze very complex systems as well as very complex relationships between hardware, software, and humans. Using good FTA software, you can cut, copy, paste, rearrange, and delete events and gates to various fault tree branches to quickly and easily compare different hardware configurations. An example of a computer-generated fault tree follows.*
* Generated in Relex Fault Tree.
In the above figure, "Passenger Injury Occurs in Elevator" is defined as the top event. The reasons why passenger injury in an elevator could occur have been determined to be either that the box free falls or that the door is open at an inappropriate time. After determining all possible causes for each event identified, the events and gates for connecting them to higher-level events are added to the fault tree. Any faults that can be further developed to determine causes are then added as lower-level events and connected by the appropriate gates. The lowest-level events that terminate fault tree paths are called basic events or primary events. They are either component-level events that cannot be further resolved or external events. For example, in the first level of possible events for the free fall of the box, "Cable off Pulley" and "Broken Cable" are basic events. Because these events are primary faults, they are not developed any further in the fault tree.
Using too wide a scope for the top event, which results in a large, complex, and unfocused fault tree.
Using inconsistent nomenclature for the same events, which prevents you from finding events that occur in multiple branches of the fault tree.
Using the same nomenclature for similar but different components, thereby identifying the same failure for several scenarios when these failures are actually caused by different components.
Breaking the fault tree into branches by electrical, mechanical, and structural subsystems, thereby failing to take the interface and integration of the system into account.
Top Event Definition
Because the top event sets the tone for the series of questions that are considered when constructing the fault tree, the analyst should use the system definition to construct a clear and concise top event. If a top event is vaguely stated, the fault tree is likely to be large, complex, and unfocused. To generate a useful fault tree, the top event must be precisely stated and be narrow in scope. Specifying the mission phase or portion of the mission to which a top event applies in the description of the top event often helps to generate a very concise fault tree.

Event Nomenclature
During fault tree creation, consistently applying the appropriate nomenclature to events is critical to identifying the same event in multiple fault tree branches. If, for example, you give an event a different name in another branch of the fault tree, cutset analysis, which is described in the "Fault Tree Analysis" section, identifies multiple events leading to different failures (rather than the same event leading to different failures). If you do not realize that nomenclature errors exist, you may not recognize an event as a major contributor to the top event and thereby fail to recommend improvements or controls for it. Similarly, when two identical components are installed in different locations within a system, you must be sure to identify that they are physically different components by using reference designators in the nomenclature. Otherwise, cutset analysis identifies how the same component failure contributes to several scenarios when the failures are actually caused by different components.

Branch Arrangement
Because engineering groups so often function autonomously, fitting each piece of hardware together in a system tends to be an afterthought. Organizations that regularly categorize work by engineering disciplines tend to arrange the branches of a fault tree by subsystems.
However, such an arrangement limits FTA to considering only component failures. When engineering groups fail to properly coordinate and implement a design as a team, interfaces and interactions are most often the areas in which the system breaks down. When fault tree branches are arranged by subsystems, these areas are never even addressed. When scenarios that lead to the top event are used to arrange fault tree branches, the analyst can place faults under the cause for a component failure. Causes can include not only hardware failures but also interface and integration problems due to design flaws, software, human errors, operation and maintenance errors, and environmental influences on the system. Fault trees arranged by scenarios often uncover complex relationships and interactions of systems, components, and actions that are believed to be unrelated. For example, such an FTA can reveal a single-point component failure that can fail two supposedly redundant or independent systems.
After properly identifying all failures, events, and conditions that can lead to the occurrence of the top event, you can compute the probability of the top event and measure the relative impact of a design fix. The traditional analysis process is to generate the system minimal cut sets, apply the basic event probabilistic data, and then determine the probability of the top event.

The qualitative analysis of fault trees is based on determining the minimal cutsets for the top event. Cutsets identify the sets of events that cause the top event to occur. A cutset can be a single-point failure or event or can be a set of many events. Different cutsets can include different combinations of the same event. A minimal cutset is the smallest group of events that cause the top event to occur. In large trees, the events that cause the top event to occur are often buried deep within the system and are not easily discovered without performing cutset analysis. The basic events that belong to a cutset provide such information as single-point failures and the relative contributions of each cutset. Generally, the cutsets that have the highest probability of occurrence are the ones that have the fewest events.

Thus, the minimal cutset information obtained during qualitative analysis can be used for computing the unavailability and unreliability values of the system during quantitative analysis. (Unavailability and unreliability values are calculated by FTA because fault trees are organized around system failures rather than system successes.) For quantitative analysis, reliability and maintainability information such as failure probability or repair rate is used to determine or quantify the probability of occurrence of the top event.
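The traditional process described above (minimal cut sets, then basic event data, then top-event probability) can be sketched as follows. The event names, probabilities, and cut sets are illustrative assumptions for a tree of the form A AND B AND (C OR D); the sum over cut sets is the first-order (rare-event) approximation, not the exact value.

```python
from math import prod

# Hypothetical basic-event probabilities and minimal cut sets.
p = {"A": 0.1, "B": 0.1, "C": 0.1, "D": 0.2}
min_cut_sets = [("A", "B", "C"), ("A", "B", "D")]

# Probability that every event in a cut set occurs (independence assumed).
cut_probs = {cs: prod(p[e] for e in cs) for cs in min_cut_sets}

# Rare-event approximation: add the cut-set probabilities.
p_top_approx = sum(cut_probs.values())

# Relative contribution of each cut set to the approximate top-event probability.
contributions = {cs: v / p_top_approx for cs, v in cut_probs.items()}

for cs, v in cut_probs.items():
    print(f"{'.'.join(cs)}: P = {v:.4f}, share = {contributions[cs]:.0%}")
print(f"P(top) is approximately {p_top_approx:.4f}")
```

For this structure the exact value is 0.1 × 0.1 × (0.1 + 0.2 − 0.02) = 0.0028, so the approximation (0.0030) is slightly conservative, as expected when overlap between cut sets is ignored.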
Conclusion
Because FTA is an event-oriented analysis, it can identify more possible failure causes than structure-oriented FMEAs (Failure Modes and Effects Analyses) and RBDs (Reliability Block Diagrams), which allow only hardware failure considerations. When performed correctly, FTA often identifies system problems that other design and analytical methods would overlook.
Topics Covered
Fault Tree Definition
Developing the Fault Tree
Structural Significance of the Analysis
Quantitative Significance of the Analysis
Diagnostic Aids and Shortcuts
Finding and Interpreting Cut Sets and Path Sets
Success-Domain Counterpart Analysis
Assembling the Fault Tree Analysis Report
Fault Tree Analysis vs. Alternatives
Fault Tree Shortcomings/Pitfalls/Abuses
All fault trees appearing in this training module have been drawn, analyzed, and printed using FaultrEase(TM), a computer application available from: Arthur D. Little, Inc., Acorn Park, Cambridge, MA 02140-2390. Phone (617) 864-5770.
Origins
Fault tree analysis:
was developed in 1962 for the U.S. Air Force by Bell Telephone Laboratories for use with the Minuteman system
was later adopted and extensively applied by the Boeing Company
is one of many symbolic logic analytical techniques found in the operations research discipline
Some Definitions
FAULT: An abnormal undesirable state of a system or a system element* induced 1) by presence of an improper command or absence of a proper one, or 2) by a failure (see below). All failures cause faults; not all faults are caused by failures. A system which has been shut down by safety features has not faulted.
FAILURE: Loss, by a system or system element*, of functional integrity to perform as intended, e.g., relay contacts corrode and will not pass rated current when closed, or the relay coil has burned out and will not close the contacts when commanded: the relay has failed; a pressure vessel bursts: the vessel fails. A protective device which functions as intended has not failed, e.g., a blown fuse.
Definitions
PRIMARY (OR BASIC) FAILURE: The failed element has seen no exposure to environmental or service stresses exceeding its ratings to perform. E.g., fatigue failure of a relay spring within its rated lifetime; leakage of a valve seal within its pressure rating.
SECONDARY FAILURE: Failure induced by exposure of the failed element to environmental and/or service stresses exceeding its intended ratings. E.g., the failed element has been improperly designed, or selected, or installed, or calibrated for the application; the failed element is overstressed/underqualified for its burden.
Non-repairable system.
No sabotage.
Markov: Fault rates are constant ($\lambda = 1/\text{MTBF} = K$). The future is independent of the past, i.e., future states available to the system depend only upon its present state and pathways now available to it, not upon how it got where it is.
Bernoulli: Each system element analyzed has two mutually exclusive states.
[Figure: fault tree construction flowchart. Surviving labels: OR, AND, Repeat/continue, NO, YES]
Don't let gates feed gates.
?
Air Escapes From Casing Tire Pressure Drops Tire Deflates
MAYBE:
A gust of wind will come along and correct the skid.
A sudden cloudburst will extinguish the ignition source.
There'll be a power outage when the worker's hand contacts the high-voltage conductor.
No miracles!
TOP events represent potential high-penalty losses (i.e., high risk). Either severity of the outcome or frequency of occurrence can produce high risk.
Scoping reduces effort spent in the analysis by confining it to relevant considerations. To scope, describe the level of penalty or the circumstances for which the event becomes intolerable; use modifiers to narrow the event description.
CAUSE (3) and, each element must be an immediate contributor to the level above
NOTE: As a group under an AND gate, and individually under an OR gate, contributing elements must be both necessary and sufficient to serve as immediate cause for the output event.
Example Fault Tree Development Constructing the logic Spotting/correcting some common errors Adding quantitative data
Transport Failures
* Partitioned aspects of system function, subdivided as the purpose, physical arrangement, or sequence of operation
No Start Pulse
Natural Apathy
Biorhythm Fails
Verifying Logic
Oversleep
No Start Pulse
Biorhythm Fails
?
Wakeup Succeeds
motivation
No Start Pulse
Failure Domain
Natural Apathy
Success Domain
BioRhythm Fails
BioRhythm Fails
?
Nocturnal Deafness
Power Outage
Faulty Innards
Forget to Set
Faulty Mechanism
Forget to Set
Forget to Wind
Electrical Fault
Mechanical Fault
What does the tree tell us about system vulnerability at this point?
S = Successes, F = Failures
Reliability: $R = \frac{S}{S+F}$
Failure Probability: $P_F = \frac{F}{S+F}$
$R + P_F = \frac{S}{S+F} + \frac{F}{S+F} \equiv 1$
Fault Rate: $\lambda = \frac{1}{\mathrm{MTBF}}$
Significance of PF
Fault probability is modeled acceptably well as a function of exposure interval (T) by the exponential, $P_F = 1 - e^{-\lambda T}$, where $\lambda = 1/\mathrm{MTBF}$. For exposure intervals that are brief ($T < 0.2\,\mathrm{MTBF}$), $P_F$ is approximated within 2% by $\lambda T$: $P_F \approx \lambda T$ (within 2%, for $\lambda T \le 20\%$).
Most system elements have fault rates ($\lambda = 1/\mathrm{MTBF}$) that are constant ($\lambda_0$) over long periods of useful life. During these periods, faults occur at random times.
[Figure: fault rate versus time, showing BURN IN (Infant Mortality), Random Failure at constant $\lambda_0$, and BURN OUT; and $P_F$ versus T, rising toward 1.0 and passing 0.63 at T = 1 MTBF.]
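As a rough numerical check of the exponential model and its linear approximation, the following sketch compares $1 - e^{-\lambda T}$ with $\lambda T$. The MTBF value is an assumption chosen only for illustration. The output suggests that the "within 2%" figure is an absolute difference in probability, not a relative error.

```python
import math

def failure_probability(fault_rate, exposure):
    """Exponential model: PF = 1 - exp(-lambda * T)."""
    return 1.0 - math.exp(-fault_rate * exposure)

MTBF = 1000.0            # hours; assumed value, for illustration only
FAULT_RATE = 1.0 / MTBF  # lambda = 1 / MTBF

for fraction in (0.01, 0.1, 0.2, 1.0):
    exposure = fraction * MTBF
    exact = failure_probability(FAULT_RATE, exposure)
    linear = FAULT_RATE * exposure  # the lambda*T approximation
    print(f"T = {fraction:4.2f} MTBF  exact = {exact:.4f}  "
          f"lambda*T = {linear:.4f}  difference = {linear - exact:.4f}")
```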
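For 2 Inputs
AND Gate [Intersection / ∩]
$R_T = R_A + R_B - R_A R_B$
$P_F = 1 - R_T$
$P_F = 1 - (R_A + R_B - R_A R_B)$
$P_F = 1 - [(1 - P_A) + (1 - P_B) - (1 - P_A)(1 - P_B)]$
$P_F = P_A P_B$
OR Gate [Union / ∪]
$P_F = P_A + P_B - P_A P_B$
For $P_A, P_B \le 0.2$: $P_F \approx P_A + P_B$, with error $\le 11\%$
For 3 Inputs
AND Gate: $P_F = P_A P_B P_C$
OR Gate: $P_F = P_A + P_B + P_C - P_A P_B - P_A P_C - P_B P_C + P_A P_B P_C$ (omit the product terms for approximation)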
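The gate formulas above can be checked numerically. This is a minimal sketch assuming independent inputs; the function names are illustrative and not part of the lecture material.

```python
from math import prod

def and_gate(*p):
    """Intersection of independent events: product of input probabilities."""
    return prod(p)

def or_gate_exact(*p):
    """Union of independent events: 1 minus the product of complements."""
    return 1.0 - prod(1.0 - x for x in p)

def or_gate_rare_event(*p):
    """Rare-event approximation: simple sum of input probabilities."""
    return sum(p)

# Worst case quoted on the slide: both inputs at 0.2.
exact = or_gate_exact(0.2, 0.2)
approx = or_gate_rare_event(0.2, 0.2)
relative_error = (approx - exact) / exact
print(f"exact = {exact:.4f}, approx = {approx:.4f}, error = {relative_error:.1%}")

# Three-input OR gate: compare with the expanded inclusion-exclusion form.
a, b, c = 0.05, 0.1, 0.2
expanded = a + b + c - a * b - a * c - b * c + a * b * c
print(abs(or_gate_exact(a, b, c) - expanded) < 1e-12)
```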
Events 1 and 2 are INDEPENDENT events.
AND Gate [Intersection / ∩]: $P_T = \prod P_e$, i.e., $P_T = P_1 P_2$
OR Gate [Union / ∪]: $P_T \approx \sum P_e$, i.e., $P_T \approx P_1 + P_2$
Exact OR Gate: $P_T = P_1 + P_2 - P_1 P_2$ (the product term is usually negligible)
Exact solution for the OR gate
Failure domain: TOP, $P_T = ?$, with inputs 1, 2, 3 having probabilities $P_1$, $P_2$, $P_3$.
Success domain: $\bar{P}_1 = (1 - P_1)$, $\bar{P}_2 = (1 - P_2)$, $\bar{P}_3 = (1 - P_3)$, and $\bar{P}_T = \prod (1 - P_e)$.
$P_T = \coprod P_e = 1 - \prod (1 - P_e)$
$P_T = 1 - [(1 - P_1)(1 - P_2)(1 - P_3) \cdots (1 - P_n)]$
The ip operator ($\coprod$) is the co-function of pi ($\prod$). It provides an exact solution for propagating probabilities through the OR gate. Its use is rarely justifiable.
Log Average of probability bounds (example: lower bound $P_L = 0.01$, upper bound $P_U = 0.1$)
$\text{Log Average} = \operatorname{Antilog}\frac{\log P_L + \log P_U}{2} = \operatorname{Antilog}\frac{(-2) + (-1)}{2} = 10^{-1.5} = 0.0316228$
Note that, for the example shown, the arithmetic average would be $\frac{0.01 + 0.1}{2} = 0.055$, i.e., 5.5 times the lower bound and 0.55 times the upper bound.
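The log average is the geometric mean of the two bounds. A minimal sketch, with an illustrative function name:

```python
import math

def log_average(lower, upper):
    """Antilog of the mean of the base-10 logs (the geometric mean)."""
    return 10 ** ((math.log10(lower) + math.log10(upper)) / 2)

p_lower, p_upper = 0.01, 0.1
p_log = log_average(p_lower, p_upper)
p_arith = (p_lower + p_upper) / 2

print(f"log average        = {p_log:.7f}")
print(f"arithmetic average = {p_arith:.3f}")
# The log average sits the same factor above the lower bound
# as it sits below the upper bound.
print(f"ratios: {p_log / p_lower:.3f} and {p_upper / p_log:.3f}")
```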
* Reference: Briscoe, Glen J.; Risk Management Guide; System Safety Development Center; SSDC-11; DOE 76-45/11; September 1982.
Source: Willie Hammer, Handbook of System and Product Safety, Prentice Hall
Sources:
* WASH-1400 (NUREG-75/014); Reactor Safety Study: An Assessment of Accident Risks in U.S. Commercial Nuclear Power Plants, 1975
** NUREG/CR-1278; Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications, 1980
Power Outage: $3 \times 10^{-4}$
Forget to Set: $8 \times 10^{-3}$ (2/1)
Faulty Mechanism: $4 \times 10^{-4}$ (1/10)
Forget to Set: $8 \times 10^{-3}$ (2/1)
Forget to Wind: $1 \times 10^{-2}$ (3/1)
$4 \times 10^{-4}$ (1/10)
Browning, R.L., The Loss Rate Concept in Safety Engineering * National Safety Council, Accident Facts Kopecek, J.T., Analytical Methods Applicable to Risk Assessment & Prevention, Tenth International System Safety Conference
Apply Scoping
What power outages are of concern?
Power Outage: $1 \times 10^{-2}$ (3/1)
Not all of them! Only those that:
- Are undetected/uncompensated
- Occur during the hours of sleep
- Have sufficient duration to fault the system
This probability must reflect these conditions!
Single-Point Failure A failure of one independent element of a system which causes an immediate hazard to occur and/or causes the whole system to fail.
Professional Safety March 1980
Freedom from single point failure: Redundancy ensures that either 1 or 2 may fail without inducing TOP.
Do
Independent
Hand Falls Off Hand Jams Works Elect. Fault Hand Falls/ Jams Works Gearing Fails Other Mech. Fault
Alarm Failure
Alarm Failure
True Contributors
Alarm Clock Fails Toast Burns Backup Clock Fails Alarm Clock Fails Backup Clock Fails
Microwave
ElectroOptical
Seismic Footfall
Acoustic
DETECTOR/ALARM FAILURES
Four, wholly independent alarm systems are provided to detect and annunciate intrusion. No two of them share a common operating principle. Redundancy appears to be absolute. The AND gate to the TOP event seems appropriate. But, suppose the four systems share a single source of operating power, and that source fails, and there are no backup sources?
Detector/Alarm Failure
Here, power source failure has been recognized as an event which, if it occurs, will disable all four alarm systems. Power failure has been accounted for as a common cause event, leading to the TOP event through an OR gate. OTHER COMMON CAUSES SHOULD ALSO BE SEARCHED FOR.
- Dust/Grit
- Temperature Effects (Freezing/Overheat)
- Electromagnetic Disturbance
- Single Operator Oversight
- Many Others
Missing Elements?
Contributing elements must combine to satisfy all conditions essential to the TOP event. The logic criteria of necessity and sufficiency must be satisfied.
Unannunciated Intrusion by Burglar SYSTEM CHALLENGE
Detector/Alarm Failure
Intrusion By Burglar
Burglar Present
Barriers Fail
10% of returnees are infected; 90% are not infected.
1% of infected cases test falsely negative, receive no treatment, succumb to disease.
2% of uninfected cases test falsely positive, receive treatment, succumb to side effects.
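The two fatal paths can be combined as an OR of two mutually exclusive branches. The sketch below uses only the percentages quoted above and assumes, as the slide states, that every such case succumbs.

```python
P_INFECTED = 0.10
P_FALSE_NEGATIVE = 0.01   # given infected
P_FALSE_POSITIVE = 0.02   # given not infected

# Branch 1: infected, tests falsely negative, untreated.
p_disease_death = P_INFECTED * P_FALSE_NEGATIVE
# Branch 2: not infected, tests falsely positive, treated needlessly.
p_side_effect_death = (1 - P_INFECTED) * P_FALSE_POSITIVE

# The branches are mutually exclusive, so the probabilities add exactly.
p_death = p_disease_death + p_side_effect_death
print(f"P(death) = {p_death:.3f}")
print(f"share due to false positives = {p_side_effect_death / p_death:.1%}")
```

With these figures, most fatalities come from the false-positive branch, even though its test error rate looks small.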
Cut Sets
AIDS TO System Diagnosis Reducing Vulnerability Linking to Success Domain
Cut Sets
A CUT SET is any group of fault tree initiators which, if all occur, will cause the TOP event to occur. A MINIMAL CUT SET is a least group of fault tree initiators which, if all occur, will cause the TOP event to occur.
Ignore all tree elements except the initiators (leaves/basics).
Starting immediately below the TOP event, assign a unique letter to each gate, and assign a unique number to each initiator.
Proceeding stepwise from the TOP event downward, construct a matrix using the letters and numbers. The letter representing the TOP event gate becomes the initial matrix entry. As the construction progresses:
- Replace the letter for each AND gate by the letter(s)/number(s) for all gates/initiators which are its inputs. Display these horizontally, in matrix rows.
- Replace the letter for each OR gate by the letter(s)/number(s) for all gates/initiators which are its inputs. Display these vertically, in matrix columns. Each newly formed OR gate replacement row must also contain all other entries found in the original parent row.
A final matrix results, displaying only numbers representing initiators. Each row of this matrix is a Boolean-Indicated Cut Set. By inspection, eliminate any row that contains all elements found in a lesser row. Also eliminate redundant elements within rows and rows that duplicate other rows. The rows that remain are Minimal Cut Sets.
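The matrix procedure above can be sketched in a few lines. The tree used here (A = AND of B and D, B = OR of 1 and C, C = AND of 2 and 3, D = OR of 2 and 4) is an assumed example, and the function names are illustrative.

```python
# Gates are letters, initiators are numbers; structure assumed for illustration.
GATES = {
    "A": ("AND", ["B", "D"]),
    "B": ("OR", [1, "C"]),
    "C": ("AND", [2, 3]),
    "D": ("OR", [2, 4]),
}

def boolean_indicated_cut_sets(gates, top):
    """Expand the matrix until every row holds only initiators."""
    rows = [[top]]
    while any(item in gates for row in rows for item in row):
        new_rows = []
        for row in rows:
            gate = next((x for x in row if x in gates), None)
            if gate is None:
                new_rows.append(row)
                continue
            kind, inputs = gates[gate]
            rest = [x for x in row if x != gate]
            if kind == "AND":      # inputs replace the gate horizontally
                new_rows.append(rest + list(inputs))
            else:                  # OR: each input gets its own row
                new_rows.extend(rest + [inp] for inp in inputs)
        rows = new_rows
    return rows

def minimal_cut_sets(rows):
    """Drop repeated elements, duplicate rows, and rows containing a lesser row."""
    candidates = sorted({frozenset(r) for r in rows},
                        key=lambda s: (len(s), sorted(s)))
    kept = []
    for s in candidates:
        if not any(k <= s for k in kept):
            kept.append(s)
    return kept

rows = boolean_indicated_cut_sets(GATES, "A")
result = [sorted(s) for s in minimal_cut_sets(rows)]
print(result)
```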
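TOP gate A is an AND gate; B & D, its inputs, replace it horizontally: B D
B is an OR gate; 1 & C, its inputs, replace it vertically. Each requires a new row: 1 D / C D
C is an AND gate; 2 & 3, its inputs, replace it horizontally: 1 D / 2 D 3
D (top row) is an OR gate; 2 & 4, its inputs, replace it vertically. Each requires a new row: 1 2 / 2 D 3 / 1 4
D (second row) is replaced in the same way: 1 2 / 2 2 3 / 1 4 / 2 4 3
These Boolean-Indicated Cut Sets reduce to the Minimal Cut Sets: 1 2 / 2 3 / 1 4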
Minimal Cut Set rows are least groups of initiators which will induce TOP.
The Minimal Cut Sets represent this Fault Tree, and this Fault Tree is a Logic Equivalent of the original, for which the Minimal Cut Sets were derived.
TOP
Minimal cut sets: 1/3/5, 1/3/6, 1/4/5, 1/4/6, 2/3/5, 2/3/6, 2/4/5, 2/4/6
TOP A
6
D F
3
E
5
G
Note that there are four Minimal Cut Sets. Co-existence of all of the initiators in any one of them will precipitate the TOP event.
1
Blocks represent functions of system elements. Paths through them represent success.
Barring terms ($\bar{n}$) denotes consideration of their success properties.
The tree models a system fault, in failure domain. Let that fault be "System Fails to Function as Intended." Its opposite, "System Succeeds to Function as Intended," can be represented by a Reliability Block Diagram in which success flows through system element functions from left to right. Any path through the block diagram, not interrupted by a fault of an element, results in system success.
Minimal Cut Sets: 1/2, 1/3, 1/4, 3/4/5/6
Each Cut Set (horizontal rows in the matrix) interrupts all left-to-right paths through the Reliability Block Diagram
Note that 3/5/1/6 is a Cut Set, but not a Minimal Cut Set. (It contains 1/3, a true Minimal Cut Set.)
$P_T \approx \sum P_k = P_1 P_2 + P_1 P_3 + P_1 P_4 + P_3 P_4 P_5 P_6$
Note that propagating probabilities through an unpruned tree, i.e., using Boolean-Indicated Cut Sets rather than Minimal Cut Sets, would produce a falsely high $P_T$.
Cut Set Probability ($P_k$), the product of probabilities for events within the Cut Set, is the probability that the Cut Set being considered will induce TOP. $P_k = \prod P_e = P_1 \times P_2 \times P_3 \times \dots \times P_n$
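As a sketch of how cut set probabilities combine into a TOP estimate, the code below uses the minimal cut sets 1/2, 1/3, 1/4 and 3/4/5/6 from this example. The initiator probabilities are assumed values chosen only for illustration. It also shows the inflation caused by keeping the non-minimal cut set 1/3/5/6.

```python
from math import prod

# Assumed initiator probabilities, for illustration only.
P = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.1}
MINIMAL_CUT_SETS = [(1, 2), (1, 3), (1, 4), (3, 4, 5, 6)]

def cut_set_probability(cut_set, p):
    """Pk: product of the probabilities of the initiators in the cut set."""
    return prod(p[e] for e in cut_set)

def top_probability(cut_sets, p):
    """Rare-event estimate of PT: sum of the cut set probabilities."""
    return sum(cut_set_probability(k, p) for k in cut_sets)

p_top = top_probability(MINIMAL_CUT_SETS, P)
# Keeping a non-minimal row (1/3/5/6 contains 1/3) overstates PT.
p_top_unpruned = top_probability(MINIMAL_CUT_SETS + [(1, 3, 5, 6)], P)
print(f"PT (minimal cut sets)  = {p_top:.4f}")
print(f"PT (unpruned, too high) = {p_top_unpruned:.4f}")
```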
Uniquely subscript initiators, using letter indicators of common cause susceptibility, e.g.:
l = location (code where)
m = moisture
h = human operator
q = heat
f = cold
v = vibration
etc.
Minimal Cut Sets: 1v/2h, 1v/3m, 1v/4m, 3m/4m/5m/6m
All Initiators in the Cut Set 3m/4m/5m/6m are vulnerable to moisture. Moisture is a Common Cause and can induce TOP. ADVICE: Moisture proof one or more items.
Some Initiators may be vulnerable to several Common Causes and receive several corresponding subscript designators. Some may have no Common Cause vulnerability and receive no subscripts.
System Fault
These must be OR
Analyze as usual
Introduce each Common Cause identified as a Cut Set Killer at its individual probability level of both (1) occurring, and (2) inducing all terms within the affected cut set.
All other things being equal:
- A LONG Cut Set signals low vulnerability.
- A SHORT Cut Set signals higher vulnerability.
- Presence of NUMEROUS Cut Sets signals high vulnerability.
- A singlet Cut Set signals a Potential Single-Point Failure.
The quantitative importance of a Cut Set ($I_k$) is the numerical probability that, given that TOP has occurred, that Cut Set has induced it: $I_k = \frac{P_k}{P_T}$, where $P_k = \prod P_e$; e.g., for the Cut Set 3/4/5/6, $P_k = P_3 \times P_4 \times P_5 \times P_6$.
Minimal Cut Sets: 1/2, 1/3, 1/4, 3/4/5/6
Analyzing Quantitative Importance enables numerical ranking of contributions to System Failure. To reduce system vulnerability most effectively, attack Cut Sets having greater Importance. Generally, short Cut Sets have greater Importance, long Cut Sets have lesser Importance.
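A minimal sketch of ranking cut sets by quantitative importance, assuming the example minimal cut sets 1/2, 1/3, 1/4 and 3/4/5/6, a uniform initiator probability of 0.1 (an illustrative value, not from the lecture), and the rare-event sum for PT:

```python
from math import prod

P = {e: 0.1 for e in range(1, 7)}   # assumed initiator probabilities
CUT_SETS = [(1, 2), (1, 3), (1, 4), (3, 4, 5, 6)]

p_k = {k: prod(P[e] for e in k) for k in CUT_SETS}   # cut set probabilities
p_t = sum(p_k.values())                              # rare-event PT
importance = {k: pk / p_t for k, pk in p_k.items()}  # Ik = Pk / PT

for k, i_k in sorted(importance.items(), key=lambda kv: -kv[1]):
    print("/".join(map(str, k)), f"Ik = {i_k:.4f}")
```

With equal initiator probabilities, the short cut sets dominate and the four-element cut set contributes well under 1%.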
Item Importance
The quantitative Importance of an item ($I_e$) is the numerical probability that, given that TOP has occurred, that item has contributed to it.
$I_e \approx \sum^{N_e} I_{ke}$
$N_e$ = Number of Minimal Cut Sets containing Item e
$I_{ke}$ = Importance of the Minimal Cut Sets containing Item e
Minimal Cut Sets: 1/2, 1/3, 1/4, 3/4/5/6
Example, Importance of item 1: $I_1 \approx \frac{P_1 P_2 + P_1 P_3 + P_1 P_4}{P_T}$
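Item importance can be sketched as a sum over the minimal cut sets that contain the item. The initiator probabilities below are assumed values for illustration only, and PT uses the rare-event sum.

```python
from math import prod

P = {e: 0.1 for e in range(1, 7)}   # assumed initiator probabilities
CUT_SETS = [(1, 2), (1, 3), (1, 4), (3, 4, 5, 6)]

def item_importance(item, cut_sets, p):
    """Ie: summed importance of the minimal cut sets containing the item."""
    p_t = sum(prod(p[e] for e in k) for k in cut_sets)
    return sum(prod(p[e] for e in k) for k in cut_sets if item in k) / p_t

ranking = {e: item_importance(e, CUT_SETS, P) for e in P}
for e, i_e in sorted(ranking.items(), key=lambda kv: -kv[1]):
    print(f"item {e}: Ie = {i_e:.4f}")
```

Item importances need not sum to 1, because each cut set is counted once for every item it contains.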
Path Sets
Aids to Further Diagnostic Measures Linking to Success Domain Trade/Cost Studies
Path Sets
A PATH SET is a group of fault tree initiators which, if none of them occurs, will guarantee that the TOP event cannot occur. TO FIND PATH SETS* change all AND gates to OR gates and all OR gates to AND. Then proceed using matrix construction as for Cut Sets. Path Sets will be the result.
*This Cut Set-to-Path-Set conversion takes advantage of De Morgan's duality theorem. Path Sets are complements of Cut Sets.
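One way to sketch the AND/OR swap is to apply it to the logic-equivalent tree formed by the minimal cut sets: the TOP OR gate over AND gates becomes an AND gate over OR gates, which is then expanded and pruned. This is an illustrative shortcut under that assumption, not the matrix construction itself; the cut sets are those of the six-initiator example.

```python
from itertools import product

MINIMAL_CUT_SETS = [(1, 2), (1, 3), (1, 4), (3, 4, 5, 6)]

def minimize(sets):
    """Keep only sets that contain no smaller kept set."""
    kept = []
    for s in sorted(set(sets), key=lambda s: (len(s), sorted(s))):
        if not any(k <= s for k in kept):
            kept.append(s)
    return kept

def path_sets(cut_sets):
    """Dual tree: AND over cut sets of OR over their members."""
    # Pick one initiator from every cut set; blocking all picks blocks TOP.
    candidates = (frozenset(choice) for choice in product(*cut_sets))
    return minimize(candidates)

result = [sorted(s) for s in path_sets(MINIMAL_CUT_SETS)]
print(result)
```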
Path Sets are least groups of initiators which, if they cannot occur, guarantee against TOP occurring.
Minimal Cut Sets: 1/2, 1/3, 1/4, 3/4/5/6
Path Sets: 1/3, 1/4, 1/5, 1/6, 2/3/4
Each Path Set (horizontal rows in the matrix) represents a left-to-right path through the Reliability Block Diagram.
$P_p \approx \sum P_e$
Path Set Probability ($P_p$) is the probability that the system will suffer a fault at one or more points along the operational route modeled by the path. To minimize failure probability, minimize path set probability.
Path Sets (a through e): 1/3, 1/4, 1/5, 1/6, 2/3/4
Sprinkle countermeasure resources amongst the Path Sets. Compute the probability decrement for each newly adjusted Path Set option. Pick the countermeasure ensemble(s) giving the most favorable $\Delta P_p / \Delta \$$. (Selection results can be verified by computing $\Delta P_T / \Delta \$$ for competing candidates.)
For all new countermeasures, THINK COST EFFECTIVENESS FEASIBILITY (incl. schedule)
AND
Does the new countermeasure Introduce new HAZARDS? Cripple the system?
Imagine a roulette wheel representing each initiator. The peg count ratio for each wheel is determined by probability for that initiator. Spin all initiator wheels once for each system exposure interval. Wheels winning in gate-opening combinations provide a path to the TOP.
[Figure: fault tree with elements numbered 8 through 34.]
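The roulette-wheel picture corresponds to a simple Monte Carlo simulation. The sketch below does not model the tree in the figure; it assumes a small stand-in system described by the minimal cut sets 1/2, 1/3, 1/4 and 3/4/5/6, with initiator probabilities of 0.1 chosen for illustration.

```python
import random

P = {e: 0.1 for e in range(1, 7)}   # assumed initiator probabilities
MINIMAL_CUT_SETS = [(1, 2), (1, 3), (1, 4), (3, 4, 5, 6)]

def spin_all_wheels(p, rng):
    """One exposure interval: the set of initiators that occurred."""
    return {e for e, pe in p.items() if rng.random() < pe}

def estimate_top(cut_sets, p, trials, seed=0):
    """Fraction of trials in which some cut set is fully satisfied."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        occurred = spin_all_wheels(p, rng)
        if any(occurred.issuperset(k) for k in cut_sets):
            hits += 1
    return hits / trials

estimate = estimate_top(MINIMAL_CUT_SETS, P, trials=100_000)
print(f"simulated PT = {estimate:.4f}")
```

For these assumed inputs the exact value is 0.02719, slightly below the rare-event sum of 0.0301, and the simulated estimate should scatter around the exact value.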
Embedded within the tree, there's a bothersome initiator with an uncertain $P_e$ (in the figure, $P_{10} = ?$). Perform a crude sensitivity test to obtain quick relief from worry or to justify the urgency of need for more exact input data:
1. Compute $P_T$ for a nominal value of $P_e$. Then, recompute $P_T$ for a new $P_e' = P_e + \Delta P_e$. Now, compute the Sensitivity of $P_e = \frac{\Delta P_T}{\Delta P_e}$. If this sensitivity exceeds about 0.1 in a large tree, work to find a value for $P_e$ having less uncertainty, or
2. Compute $P_T$ for a value of $P_e$ at its upper credible limit. Is the corresponding $P_T$ acceptable? If not, get a better $P_e$.
[Figure: fault tree with elements numbered 8 through 34; initiator 10 has the uncertain probability.]
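The crude sensitivity test can be sketched as a finite difference on PT. The system here is a stand-in defined by the minimal cut sets 1/2, 1/3, 1/4 and 3/4/5/6, with assumed probabilities of 0.1 and the rare-event sum for PT; none of these numbers come from the figure.

```python
from math import prod

CUT_SETS = [(1, 2), (1, 3), (1, 4), (3, 4, 5, 6)]
NOMINAL = {e: 0.1 for e in range(1, 7)}   # assumed nominal probabilities

def top_probability(p):
    """Rare-event PT: sum of minimal cut set probabilities."""
    return sum(prod(p[e] for e in k) for k in CUT_SETS)

def sensitivity(item, p, delta):
    """Delta PT / Delta Pe for a perturbation of one initiator."""
    perturbed = dict(p)
    perturbed[item] = p[item] + delta
    return (top_probability(perturbed) - top_probability(p)) / delta

s1 = sensitivity(1, NOMINAL, delta=0.01)
s5 = sensitivity(5, NOMINAL, delta=0.01)
print(f"sensitivity of P1 = {s1:.3f}")   # above the 0.1 threshold
print(f"sensitivity of P5 = {s5:.3f}")   # far below it
```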
Probability
Where do you stop the analysis?
The analysis is a Risk Management enterprise. The TOP statement gives severity. The tree analysis provides probability. ANALYZE NO FURTHER DOWN THAN IS NECESSARY TO ENTER PROBABILITY DATA WITH CONFIDENCE. Is risk acceptable? If YES, stop. If NO, use the tree to guide risk reduction.
SOME EXCEPTIONS
1.) An event within the tree has alarmingly high probability. Dig deeper beneath it to find the source(s) of the high probability.
2.) Mishap autopsies must sometimes analyze down to the cotter-pin level to produce a credible cause list.
State-of-Component Method
Relay K-28 Contacts Fail Closed
WHEN Analysis has proceeded to the device level i.e., valves, pumps, switches, relays, etc. HOW Show device fault/failure in the mode needed for upward propagation.
Relay K-28 Secondary Fault
Install an OR gate. Place these three events beneath the OR. This represents faults from environmental and service stresses for which the device is not qualified e.g., component struck by foreign object, wrong component selection/installation. (Omit, if negligible.)
This represents internal self failures under normal environmental and service stresses e.g., coil burnout, spring failure, contacts drop off
Analyze further to find the source of the fault condition, induced by presence/absence of external command signals. (Omit for most passive devices e.g., piping.)
Executive Summary (Abstract of complete report) Scope of the analysis Say what is analyzed
Brief system description and TOP Description/Severity Bounding what is not analyzed. Analysis Boundaries Interfaces Treated Physical Boundaries Resolution Limit Operational Boundaries Exposure Interval Operational Phases Others Human Operator In/out
The Analysis
Show Tree as Figure. Discussion of Method (Cite Refs.) Include Data Sources, Software Used Cut Sets, Path Sets, etc. Presentation/Discussion of the Tree as Tables. Source(s) of Probability Data (If quantified) Common Cause Search (If done) Sensitivity Test(s) (If conducted) Cut Sets (Structural and/or Quantitative Importance, if analyzed) Path Sets (If analyzed) Trade Studies (If Done)
Findings
TOP Probability (Give Confidence Limits) Comments on System Vulnerability Chief Contributors Candidate Reduction Approaches (If appropriate) Risk Comparisons (Bootstrapping data, if appropriate) Is further analysis needed? By what method(s)?
*Adapted from Fault Tree Analysis Application Guide, Reliability Analysis Center, Rome Air Development Center.
Closing Caveats
Be wary of the ILLUSION of SAFETY. Low probability does not mean that a mishap won't happen! THERE IS NO ABSOLUTE SAFETY! An enterprise is safe only to the degree that its risks are tolerable! Apply broad confidence limits to probabilities representing human performance! A large number of systems having low probabilities of failure means that A MISHAP WILL HAPPEN somewhere among them! $P_1 + P_2 + P_3 + P_4 + \dots + P_n \approx 1$
Caveats Do you REALLY have enough data to justify QUANTITATIVE ANALYSIS? For 95% confidence
We must have no failures in
Assumptions:
I Stochastic
to give PF
and
System Behavior
I Constant I Constant
I Constant
Environmental Stresses
Operation/ Outcome
3 1
An Example Problem
P Pump
Klaxon K
Background/Problem
A subgrade compartment containing important control equipment is protected against flooding by the system shown. Rising flood waters close float switch S, powering pump P from an uninterruptible power supply. A klaxon K is also sounded, alerting operators to perform manual bailing, B, should the pump fail. Either pumping or bailing will dewater the compartment effectively.
Assume flooding has commenced, and analyze responses available to the dewatering system.
- Develop an event tree representing system responses.
- Develop a reliability block diagram for the system.
- Develop a fault tree for the TOP event "Failure to Dewater."
Simplifying Assumptions: Power is available full time. Treat only the four system components S, P, K, and B. Consider operator error as included within the bailing function, B.
S
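Example Problem: Event Tree
Float Switch Fails ($P_S$): Failure, $[P_S]$
Float Switch Succeeds ($1 - P_S$):
  Pump Succeeds ($1 - P_P$): Success, $[1 - P_S - P_P + P_P P_S]$
  Pump Fails ($P_P$): $[P_P - P_P P_S]$
    Klaxon Fails ($P_K$): Failure, $[P_K P_P - P_K P_P P_S]$
    Klaxon Succeeds ($1 - P_K$): $[P_P - P_P P_S - P_K P_P + P_K P_P P_S]$
      Bailing Succeeds ($1 - P_B$): Success, $[P_P - P_P P_S - P_K P_P + P_K P_P P_S - P_B P_P + P_B P_P P_S + P_B P_K P_P - P_B P_K P_P P_S]$
      Bailing Fails ($P_B$): Failure, $[P_B P_P - P_B P_P P_S - P_B P_K P_P + P_B P_K P_P P_S]$
$P_{Success} = 1 - P_S - P_K P_P + P_K P_P P_S - P_B P_P + P_B P_P P_S + P_B P_K P_P - P_B P_K P_P P_S$
$P_{Failure} = P_S + P_K P_P - P_K P_P P_S + P_B P_P - P_B P_P P_S - P_B P_K P_P + P_B P_K P_P P_S$
$P_{Success} + P_{Failure} = 1$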
Klaxon K
Bailing B
Fault Tree
Exact solution: $P_{TOP} = P_S + P_P P_K - P_P P_K P_S + P_B P_P - P_B P_P P_S - P_B P_K P_P + P_B P_K P_P P_S$
Rare event approximation: $P_{TOP} \approx P_S + P_P P_K + P_P P_B$
Cut Sets: S, P/K, P/B
Path Sets: S/P, S/K/B
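The event tree, the exact fault tree solution and the rare-event approximation can be cross-checked numerically. The component failure probabilities below are assumed values chosen only for illustration.

```python
# Assumed failure probabilities for switch, pump, klaxon, bailing.
PS, PP, PK, PB = 0.01, 0.05, 0.02, 0.10

# Event tree: walk the branches and accumulate outcome probabilities.
p_success = (1 - PS) * (1 - PP)                    # pump dewaters
p_success += (1 - PS) * PP * (1 - PK) * (1 - PB)   # bailing dewaters
p_failure = PS                                     # switch never closes
p_failure += (1 - PS) * PP * PK                    # pump fails, no alert
p_failure += (1 - PS) * PP * (1 - PK) * PB         # pump and bailing fail

# Fault tree: exact expansion and rare-event approximation.
p_top_exact = (PS + PP * PK - PP * PK * PS + PB * PP - PB * PP * PS
               - PB * PK * PP + PB * PK * PP * PS)
p_top_rare = PS + PP * PK + PP * PB

print(f"event tree:  success = {p_success:.6f}, failure = {p_failure:.6f}")
print(f"fault tree:  exact = {p_top_exact:.6f}, rare event = {p_top_rare:.6f}")
```

The rare-event figure is slightly conservative, as expected when the dropped product terms are small.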
K Klaxon Fails
Command Failure
Failure To Dewater
Response Failure
Bailing Fails
Success Failure A1 Success Failure B1 Success Failure B2 Success Failure C Success Failure B3 Success Failure A2 Success Failure D Success Failure
Failure A1-2
Failure A1
Failure A2
16
7*
3*
1*
26
12
5*
13 6 14
Bibliography
Selected references for further study
Center for Chemical Process Safety; Guidelines for Hazard Evaluation Procedures; 2nd Edition with Worked Examples; 1992 (461 pp); American Institute of Chemical Engineers
Lees, Frank P.; Loss Prevention in the Process Industries; 1996 (1,316 pp; second edition; three volumes)
Henley, Ernest J. and Hiromitsu Kumamoto; Reliability Engineering and Risk Assessment; 1981 (568 pp)
Once the system failure and success states have been properly defined, the states are then combined through the tree branching logic to obtain the various accident sequences that are associated with the given initiating event. Figure 4.3 shows a graphical example of a system event tree: the initiating event is depicted by the initial horizontal line and the system states are then connected in a stepwise, branching fashion; system success and failure states have been denoted by S and F, respectively. The accident sequences that result from the tree structure are shown in the last column. Each branch yields one particular accident sequence; for example, IS1F2 denotes the accident sequence in which the initiating event (I) occurs, system 1 is called upon and succeeds (S1), and system 2 is called upon but fails to perform its defined function (F2). For larger event trees, this stepwise branching would simply be continued.
Note that the system states on a given branch of the event tree are conditional on the previous system states having occurred. With reference to the previous example, the success and failure of system 1 must be defined under the condition that the initiating event has occurred; likewise, in the upper branch of the tree corresponding to system 1 success, the success and failure of system 2 must be defined under the conditions that the initiating event has occurred and system 1 has succeeded.
4.3. Event tree evaluation
Once the final event tree has been constructed, the final task is to compute the probabilities of system failure. Each event (branch) in the tree can be interpreted as the top event of a fault tree, which allows the probability of occurrence of that event to be evaluated; the value thus computed represents the conditional probability of the occurrence of the event, given that the events which precede it on that sequence have occurred.
In the case of independent events, multiplication of the conditional probabilities for each branch in a sequence gives the probability of that sequence (Figure 4.4). In the case of structural dependencies, two approaches to accident sequence modelling are available.
The first approach is called event tree with boundary conditions. It consists of decomposing the system so as to identify the supporting parts or functions upon which some components and systems are simultaneously dependent. The supporting parts thereby identified appear explicitly as system event tree headings, preceding the dependent protection systems and components. Since dependent parts are extracted and explicitly treated as boundary conditions in the event tree, this approach leads to large fault trees and relatively small event trees. For example, consider an initiating event which requires two systems, S1 and S2, to intervene, and suppose that S1 needs the pumps of S2 to operate. Then, one could extract the common part and consider three systems: S1; S2*, which is the S2 system without the pumps common to S1; and S3, which is the pumps used by both S1 and S2 (Figure 4.5). The dependencies are then explicitly represented in the tree, and the branching associated with S1 and S2* is eliminated when S3 is not functioning. Thus, all the conditional probabilities are independent and the probability of the accident sequences can be computed by simple multiplication. This way of proceeding considerably simplifies the computations, but it requires a great deal of expertise from the analyst. In fact, since system interactions and dependencies are treated primarily within the inductive logic of the event tree, those dependencies not recognized by the analyst may not be incorporated into the analysis.
The second approach is called fault tree link. In this method, the dependencies from support systems or common parts are modeled in the fault trees and thus, at the level of the event trees, the systems are inserted without regard to their structural dependencies. For each sequence of the event tree, the fault trees of the composing events are then linked in one large fault tree which follows the logic depicted in the event tree, and the large fault tree is solved with the usual techniques to compute the probability of occurrence of that sequence. Figure 4.6 shows the previous example of Figure 4.5. Only systems S1 and S2 are shown explicitly on the event tree, without particular care for their dependence. If we now want to evaluate the probability of the sequence $I\bar{S}_1\bar{S}_2$, we build a fault tree whose top event occurs when the initiating event I and the failure of both systems S1 and S2 occur. In place of the events $\bar{S}_1$ and $\bar{S}_2$ we can substitute their corresponding system fault trees, thus obtaining a large fault tree which can be logically simplified (accounting for the existing dependencies) and evaluated so as to give the probability of the top event, i.e., the probability of the sequence of interest. With this method, the dependencies are properly treated even if the analyst was, a priori, unaware that the dependency existed. On the other hand, the resulting fault tree for an accident sequence may be rather large.
In summary, in the event trees with boundary conditions all the significant dependencies among systems are explicitly represented in the event tree; the fault trees for the individual events are then simple and independent; the analyst must take great care in identifying all the existing dependencies. In the fault tree link approach, dependencies are included in the fault trees for the various systems, which are therefore not independent; the linked fault tree for an accident sequence is rather large and complex, but all dependencies are treated automatically.
Finally, in Figures 4.7 and 4.8 we report a simplified version of functional and system event trees for the case of a large break of a pipe in the primary cooling circuit of a nuclear reactor: it can easily be seen that for realistic systems the trees can become quite complicated.
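For the event tree with boundary conditions, sequence frequencies follow by simple multiplication. The sketch below uses the headings S3, S1 and S2* from the example; the initiating-event frequency and failure probabilities are assumed values, and the sequence labels are illustrative.

```python
F_INIT = 0.1       # initiating events per year (assumed)
Q_S3 = 0.01        # failure probability of the shared pumps S3 (assumed)
Q_S1 = 0.05        # failure probability of S1, given S3 works (assumed)
Q_S2STAR = 0.02    # failure probability of S2*, given S3 works (assumed)

def sequence_frequencies(f, q3, q1, q2):
    """Multiply branch probabilities along each path of the tree."""
    ok3 = f * (1 - q3)
    return {
        "S3 ok, S1 ok, S2* ok": ok3 * (1 - q1) * (1 - q2),
        "S3 ok, S1 ok, S2* fails": ok3 * (1 - q1) * q2,
        "S3 ok, S1 fails, S2* ok": ok3 * q1 * (1 - q2),
        "S3 ok, S1 fails, S2* fails": ok3 * q1 * q2,
        # With S3 failed, branching on S1 and S2* is eliminated.
        "S3 fails": f * q3,
    }

freqs = sequence_frequencies(F_INIT, Q_S3, Q_S1, Q_S2STAR)
for label, value in freqs.items():
    print(f"{label:28s} {value:.3e} per year")
```

The sequence frequencies add up to the initiating-event frequency, which is a quick consistency check on any event tree quantified this way.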
Flow interception
Tanks cooling
S1
S2
Seq 1 Seq 2 IS1S 2 IS1 S 2 IS1S 2 IS1 S 2
Seq 3 Seq 4
S2
S1
Seq 1 Seq 3
IS1S 2 IS1S 2
I
Seq 5 Figure 4.2: Functional dependences IS 2
Figure 4.3: Illustration of event tree branching [From Reactor Safety Study. U.S. Nuclear Regulatory Commission Rep. WASH-1400, NUREG 75/014 (October 1975)].
Initiating event
Failure state F1
Figure 4.4: Schematic of event tree shown with fault trees used to evaluate probabilities of different events
S3
S1
S2 *
Freq(Seq1)=f(EI)Pr(S3)Pr(S1)Pr(S2*)
EI
S2 S1 S2
Seq4
AND
S2 S1 S2 OR AND AND OR
S1
S2
Pump 1 fault
OR
Pump 1 fault
Pump 2 fault
Human error
Seq. RS No. 1 2 3 4 5 6 7 8 9 10
CO I
ECl
COR ECR
Remarks Core cooled Slow melt Core cooled Slow melt Melt Core cooled Slow melt Melt Melt Melt
f f f f f f f f f NA NA f f NA NA NA NA NA NA f NA f NA NA NA
Figure 4.7: Function event tree for a large break LOCA (Loss of Coolant Accident)
Figure 4.8: System event tree for a large LOCA (Loss of Coolant Accident)