MonteCarlo Simulation

Download as pdf or txt
Download as pdf or txt
You are on page 1of 87

Comparision of methods and software

tools for availability assessment of


production systems

Astrid Hetland Vesteraas

Master of Science in Physics and Mathematics


Submission date: June 2008
Supervisor: Arvid Næss, MATH
Co-supervisor: Marit Saastad, Safetec
Ragnar Aarø, Safetec

Norwegian University of Science and Technology


Department of Mathematical Sciences
Problem Description
Availability and production availability are two central concepts when analysing if a production
system is dependable and optimized for profit. Such analyses are important when decisions are to
be made regarding implementation of new facilities or modifications on existing facilities.
Assesment of production availability can be done in several ways, through classical statistical
methods or through simulations by use of software tools.

All availability predictions involve uncertainty, regardless of method for assessment. Estimation of
the uncertainty related to the results, will give the results a more useful value for the decision
makers.

There exists several computer programs designed for availability or production availability
assesment. Two of these are MIRIAM Regina, distributed by CognIt, and Relex Architect,
distributed by Relex Scandinavia AB. There are also a number of statistical methods available,
including renewal and quasi-renewal processes.

The candidate shall


- Compare the functionality, methods and results of the two computer programs MIRIAM Regina
and Relex,
- Consider the uncertainty of the results from the two programs and compare these,
- Compare the results obtained by use of the computer programs with results
obtained through other statistical methods.

Assignment given: 28. January 2008


Supervisor: Arvid Næss, MATH
Abstract
This thesis presents and considers several dierent methods for computa-
tion of availability and production availability for production system. It is
assumed that the system handles ow of a uid. The thesis presents two
software programs for computation of reliability measures, MIRIAM Regina
and Relex Reliability Studio, and several analytical methods, among them
one especially adapted to computation of production availability. For the
methods not able to compute production availability, a method is presented
which makes it possible to estimate production availability from computa-
tion of availability.
Among the methods, Relex and three of the analytical computation
methods are made to compute availability of the system. The analyti-
cal methods considered are standard availability computation based on the
structure function of the system and the denitions of availability and com-
putation based on renewal and quasi renewal processes. Relex makes it
possible to compute availability both by simulation and, if the system is
simple enough, by analytical methods. The usefulness of the analytical
methods is to an extent limited by the assumptions laid on the system.
Relex makes it possible to take into account more features one would ex-
pect to have in a real life system, but for analytical methods to be employed
in the computations, the system must be quite simple.
Two methods especially made for computing production availability are
presented. These are the software program MIRIAM Regina, which com-
bines a sophisticated ow algorithm with Monte Carlo simulation, and a
method based on using Markov chains for computing the probability dis-
tribution for ow through subsystems of the system under consideration,
and then employing simple merging rules to compute the ow through the
entire system. These methods are both very exible, and makes it possible
to take into account many dierent aspects of a real life system.
The most important source of uncertainty in the results form a computa-
tion, lies in the relation between the real life system and the model system
computations are made on. A model system will always be signicantly
simplied. When choosing a computation method and interpreting results,
it is important to keep in mind all assumptions made regarding the system,
both explicitly when making the model, and implicit in the computation
method. Another source of uncertainty is uncertainty in the input data. A
method for propagation of uncertainty through computations is presented
and employed on some of the methods. For simulation, one will in addition
have the uncertainty due to simulation being an way of making a statis-
tical sample. The size of the sample, given by the number of simulation
iterations done, will decide the accuracy of the result.
ii
Contents
1 Introduction 1
1.1 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Thanks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 Modeling of repairable systems 3


2.1 Denitions and concepts . . . . . . . . . . . . . . . . . . . . 3
2.1.1 Measures for availability . . . . . . . . . . . . . . . . 3
2.1.2 Production availability . . . . . . . . . . . . . . . . . 5
2.1.3 Other measures . . . . . . . . . . . . . . . . . . . . . 5
2.1.4 Modeling of repair . . . . . . . . . . . . . . . . . . . 6
2.2 Uncertainty in the results . . . . . . . . . . . . . . . . . . . 6
2.2.1 Model selection . . . . . . . . . . . . . . . . . . . . . 6
2.2.2 Propagation of uncertainty . . . . . . . . . . . . . . 8
2.2.3 Uncertainty due to the procedure . . . . . . . . . . . 11

3 Methods for availability calculations 13


3.1 Calculation of availability . . . . . . . . . . . . . . . . . . . 13
3.2 Calculation of production availability . . . . . . . . . . . . . 14
3.3 An analytical method for calculating production availability 15
3.3.1 The method used on an example . . . . . . . . . . . 17
3.4 %availability . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.5 Computing production availability when availability is known 26
3.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5.2 Computation of variability . . . . . . . . . . . . . . . 31
3.6 Renewal processes and their application in reliability calcu-
lations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.7 Renewal processes and availability . . . . . . . . . . . . . . 37
3.7.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.8 Quasi-renewal processes . . . . . . . . . . . . . . . . . . . . 41
3.9 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

iii
4 Software programs for availability calculations 49
4.1 The Relex software . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.1 The principles of Relex . . . . . . . . . . . . . . . . . 49
4.1.2 Implementation of models in Relex RBD or OpSim . 50
4.1.3 Calculations available in Relex RBD and OpSim . . 51
4.1.4 Availability calculations . . . . . . . . . . . . . . . . 52
4.1.5 Algorithm for Monte Carlo simulation . . . . . . . . 53
4.1.6 Method for calculating steady state availability, fail-
ure frequency and MTBF . . . . . . . . . . . . . . . 54
4.1.7 Capacity calculations . . . . . . . . . . . . . . . . . . 54
4.1.8 Computation of condence intervals . . . . . . . . . 54
4.1.9 Comparison of Relex and other reliability softwares . 55
4.1.10 The example in Relex . . . . . . . . . . . . . . . . . 55
4.2 Description of the reliability software MIRIAM Regina . . . 59
4.2.1 General principles . . . . . . . . . . . . . . . . . . . 59
4.2.2 Implementation of a system in MIRIAM Regina . . . 59
4.2.3 The simulation algorithm . . . . . . . . . . . . . . . 61
4.2.4 Calculation of production availability . . . . . . . . . 61
4.2.5 Available calculations . . . . . . . . . . . . . . . . . . 61
4.2.6 Uncertainty measures . . . . . . . . . . . . . . . . . 62
4.2.7 Implementation of the example in MIRIAM Regina . 63

5 Comparison of methods and results 65


5.1 Comparison of Relex and MIRIAM Regina . . . . . . . . . . 65
5.1.1 Implementation of models . . . . . . . . . . . . . . . 65
5.1.2 Advantages and disadvantages of the two softwares . 66
5.1.3 Uncertainty in the computations . . . . . . . . . . . 67
5.2 Comparison of the analytical methods . . . . . . . . . . . . 68
5.3 Comparision of analytical and simulated results . . . . . . . 69

6 Conclusion 73
References 78

iv
Chapter 1

Introduction
This thesis concerns dierent methods for calculating availability and pro-
duction availability for production systems. Both analytical methods and
softwares for simulation will be considered.
The system type considered in this thesis is a production system where a
ow of some kind enters the system at one point, goes through some process
and leaves the system at another point. The starting point for this thesis
was considerations for production systems in the petroleum industries, with
liquid ows of oil, gas or water. In many cases, similar considerations may be
used for other kinds of ow, for example parts being produced in a factory.
What kind of ow goes through the system is not considered in this thesis,
but the underlying assumption is that of a liquid ow. In the whole thesis,
the calculated production availabilities are calculated as percent of full ow.
The focus of the thesis is on the presentation and comparison of the dif-
ferent methods for computation of availability and production availability.
The background for the problem formulation is the fact that several dier-
ent softwares for availability and production availability calculations exist
and are employed in the petroleum industries. It is of interest to be able to
compare results obtained from dierent softwares, and in some cases to be
able to compute measurements which are not automatically computed by
the software.
The softwares considered in this thesis are MIRIAM Regina, a software
distributed by CognIT and widely used in Norway, and the software package
Relex Reliability Studio 2007, distributed by Relex Scandinavia AB. In the
thesis, the softwares will be referd to as respectively MIRIAM and Relex.
The experience of the author of this thesis with both these softwares is
limited, and getting to know especially Relex, has been a part of the task
in the preparation of this thesis. The presentation of the softwares is as far
as possible based on documentation from the distributors of the software.
In addition to the softwares, several analytical methods for computing
availability and production availability are presented, as it may be of interest

1
to compare the results of analytical computations with the results from the
softwares, which are primarily based on simulation. A part of the thesis is
also a method for estimating production availability from computations of
availability. This is of interest primarily in the case when one does not have
access to a software or method meant for computing production availability.
In some cases, it might be easier to apply this method instead of employing
time and eort in implementing the system in another software or using
another method. In such cases it must be taken into consideration whether
the estimates are suciently accurate for their uses.

1.1 Abbreviations
DFR Decreasing Failure Rate
DFRA Decreasing Failure Rate in Average
FMECA Failure Modes, Eects and Criticality Analysis
IFR Increasing Failure Rate
IFRA Increasing Failure Rate in Average
iid independent, identically distributed
LCL Lower condence limit
MDT Mean Downtime
MooN M out of N (components)
MTBF Mean time between failures
MTTF Mean Time To Failure
MTTR Mean Time To Repair
NBU New Better than Used
NBUE New Better than Used in Expectation
NWU New Worse than Used
NWUE New Worse than Used in Expectation
RBD Reliability block diagram
UCL Upper condence limit

1.2 Thanks
The problem formulation for this thesis was formulated in cooperation with
Safetec Nordic A/S, and all software used during the process was made
available by them.
I wish to thank all the sta at the Trondheim oce of Safetec, and
especially Marit Saastad and Ragnar Aarø, who have helped and supported
me in many ways during the writing of the thesis. I also wish to thank
Arvid Næss at NTNU.

2
Chapter 2

Modeling of repairable systems


This thesis focuses on repairable systems, as availability primarily is a rele-
vant measure for systems subject to downtime and repair. If a system can-
not be repaired, only the reliability of the system is relevant unless there
are other causes for downtime of the system which can not be quantied
in advance. If the amount of downtime is known, the calculations are very
simple.

2.1 Denitions and concepts


2.1.1 Measures for availability
There are several dierent concepts that are covered by the term availability.
The basic denition used in the standard NORSOK Z-016 [16] is

The ability of an item to be in a state to perform a required func-


tion under given conditions at a given instant of time or during
a given time interval, assuming that the required external re-
sources are provided. This ability is expressed as the proportion
of time(s) the item is in the functioning state.
Note 1: This ability depends on the combined aspects of the
reliability, the maintainability and the maintenance supporta-
bility.
Note 2: Required external resources, other than maintenance
resources do not aect the availability of the item.

The basic mathematical denition of availability is [11]

A(t) = P (X(t) = 1) (2.1)

3
where X(t) is the structure function of the system in question. The struc-
ture function is such that
(
1 if the system is functioning
X(t) = (2.2)
0 otherwise

This expression for the availability gives the availability at a specic time.
However, in many cases it will not be possible to determine this function.
Three other ways of calculating availability are of interest. These are

• interval or mission availability

• long run average availability

• limiting availability

The interval or mission availability is the average availability over a given


interval of time, often adjusted to the mission time for the system being
considered. This is calculated as
Z t2
1
Aav (t1 , t2 ) = A(t)dt (2.3)
t2 − t1 t1

and can be regarded as the average amount of time the system is functioning
during the interval in question, [11].
The long run average availability is the limit of the interval availability
when the lower time limit is 0 and the upper time limit goes to innity, that
is
1 t
Z
Aav = lim A(t)dt (2.4)
t→∞ t 0

In some cases the availability of a system will reach a steady state after
the system has been in operation for a time. In those cases the limiting
availability is of use. It is dened as

A = lim A(t) (2.5)


t→∞

if the limit exists. All these denitions can be found in [11]. The limit-
ing availability is of interest as for many systems, the availability quickly
converges toward the limiting availability.
Availability measures the up time of the system as a whole. If any ow
at all goes through the system, the system will be available. However, as
ow through the system may vary signicantly while the system is available,
other measures are needed to accurately estimate the produced amount.

4
2.1.2 Production availability
Production availability is a concept related to availability. It is dened in
[16] as
The ratio of production to planned production, or any other
reference level, over a specied period of time.
Note 1: This measure is used in connection with analysis of
delimited systems without compensating elements such as sub-
stitution from other producers and downstream buer storage.
Battery limits need to be dened in each case.
Note 2: The term injection availability may be used meaning
the ratio of injection volume to planned injection volume.
Expressed mathematically, this is
Pproduced
Aprod = (2.6)
Preference
Where Aprod denotes the production availability.
Production availability is found by looking at the availability of the
system together with the capacity of the stages in the system. The estimated
produced amount is then compared to the reference level, which often is the
wanted or ideal production.

2.1.3 Other measures


Possible measures for the performance of the system with regard to meeting
standards for production are operational availability and deliverability. The
concepts are dened below, but neither are further considered in this thesis.
Operational availability is dened in [11] as
The mean proportion of a mission period the item is able to
perform its intended function.
It is found by estimating the mean planned and unplanned downtime of a
system during its mission time.
Deliverability is dened in [16] as
The ratio of deliveries to planned deliveries over a specied pe-
riod of time, when the eect of compensating elements such as
substitution from other producers and downstream buer stor-
age is included.
Deliverability is closely related to production availability, but concerns the
amount which can be delivered and not the amount which can be produced,
and so does not necessarily depend only on one production system. Deliv-
erability is of concern when an amount is to be delivered to a customer, and
is important for good customer relations.

5
2.1.4 Modeling of repair
When availability is to be calculated, repair must be modeled. The most
common method is to assume that repair is perfect, that is, a component
is as good as new after it is repaired. This is however not an assumption
that ts well with the real world. In reality, a component can be close to as
good as new, as bad as it was immediately before the failure and resulting
repair, or anywhere in between. A component can even be in worse state
after a repair, if mistakes were done by the person
In some cases, both repair and replacement of components can be done,
and dierent assumptions can be assigned to the dierent maintenance ac-
tions.
Imperfect repair can be modeled in dierent ways. The most common
is the (p, q) rule of imperfect repair.

The (p, q)-rule for imperfect repair


This method of modeling imperfect repair is easy to implement. When a
repair is performed, it is assumed to be perfect with probability p and bad
as old with probability q . If is is bad as old, the next time to failure is
drawn from the tail of the probability distribution.
The method can be used for modeling repair with an arbitrary improve-
ment of the component. The expected improvement from a repair will be
somewhere between bad as old and good as new, depending on the value
chosen for p.
For the exponential distribution, imperfect repair cannot be modeled, as
this distribution is without memory, that is, for the exponential distribution,

P (X > x|X > x0 ) = P (X > x) (2.7)

where 0 ≤ x0 < x. Thus, drawing a number from the tail of the distribution
is the same as drawing from the distribution as a whole. This fact also shows
as the exponential distribution has constant failure rate, see [11]

2.2 Uncertainty in the results


2.2.1 Model selection
Perhaps the largest error source in modeling of systems is the assumptions
made when developing a model for the system under consideration. When
creating a mathematical model of a real life system, simplications must be
made. In many cases it is impossible to calculate results unless one makes
drastic simplications. All such simplications enlarges the divergence be-
tween the model and the real life system, and must be considered with the

6
utmost care. When interpreting results, the assumptions made must be
considered.
In [4], 18 factors which are often ignored in models of repairable systems
are listed. Some of these factors can be put into categories as given below.
Other factors are not mentioned here, as they primarily concern reliability
calculations. For example, there is little meaning in assuming negligeble
repair time when availability is to be calculated.

Independence of components
Perhaps the most common and most simplifying assumption made, is that
of independence of components. To a certain extent, this is a plausible as-
sumption, as each component in a system in many ways work on its own.
On the other hand, there are many circumstances which may cause simul-
taneous failure of many components in a system, when several components
depend on the same supporting system or are subject to the same envi-
ronmental stress. Examples can be a failure in the electrical system or a
lightning strike. These eects, known as common cause failures, are not
taken into account when components are assumed to be independent. Such
failures may also cause normally redundant components to fail simultane-
ously. Also, failure of one component may cause other components to fail.

Perfect repair
It is very common to assume that repair restores the system to a good as
new state, again an assumption which is seldom true in the real world. A
repair is usually performed by a person, who may make mistakes. A repair
on a real system may rend the system good as new, as bad as it was prior
to the repair or anywhere in between. It may even be worse than before the
repair, or even be broken, if mistakes were made.
A repair may be incomplete, may cause damage or improve the system
beyond what was expected. New parts may have been damaged before they
are put into use, and so the repair using such parts will not be perfect. Also,
parts may be replaced prior to failure if other repairs are done which make
additional repairs easy to do.
An important aspect is that a preventive maintenance action may inu-
ence the steady state results. If maintenance are a part of the model, using
steady state results to do calculations on the system may not be a good
approximation. It may also be that the steady state implies a reliability or
availability which is lower than the required level for the system, and the
system must then be overhauled before it reaches the steady state.

7
Human error not taken into account
Failures due to human error are very often ignored when system reliability
and availability are considered. This is at least partly due to the diculty
of assessing the probability of such errors. Human error may inuence the
day to day operation of the system, but is more likely to occur during repair
or maintenance, and especially in unusual operating conditions, such as a
when dangerous faults occur. This may inuence the extent of damage
when a failure occurs, and through this the time to repair.

Assumptions on the distributions of time to failure


Assumptions mentioned in [4] as being not well considered, are some which
are often made without much thought. For example, the components of a
system are mostly assumed to have the same failure and repair distributions
during the whole time the calculations are done. In reality, they may have
dierent distributions under dierent operating conditions.
Another problem, not mentioned in [4], is the very common assumption
of exponential lifetime for components. This assumption gives a constant
failure rate, when in practice one will most often observe that a system
deteriorates as it ages. The exponential distribution does however make
a good assumption if outside causes are assumed to be the predominating
cause of failure. The exponential distribution is very easy to work with.
This is probably the cause of its widespread use.
Simulation makes it possible to handle at least some of the above factors,
but as the amount of simplication is reduced, implementation may become
dicult and simulation time may increase. It is therefore limited how close
to the real world situation it is possible to come. However computations
are done, it is important to keep track of the assumptions made, and state
these plainly together with the results.

2.2.2 Propagation of uncertainty


Uncertainty in the input data is also a major concern in many cases, as
input data often is based on either prior observations or statements from
manufactureres of components. Such input data will be subject to uncer-
tainty.
When uncertain data is put into a model, the results will be aected
by these uncertainties. This is called propagation of uncertainty. There are
several methods for propagating uncertainty in input data to the results.
Some of these methods are presented in [10].
The uncertainty is expressed as the product of the stability and the
variation, as argued in [10]. If y is the output value, and a function of the
input value x, the sensitivity of the result can be measured as the derivative

8
of y with respect to x evaluated for a baseline value x0 . The variance of the
output can then be estimated as

 2
∂y
Var(y) ≈ Var(x) (2.8)
∂x x0

Both analytical methods and methods for simulated models exist. For
analytical methods, one of the most common methods is Taylor series ap-
proximation.
Let X = [xi ]ni=1 be a vector of input values, and let y = f (X). Let the
baseline values be equal to the expectation of the input values, so X 0 =
[x0i ]ni=1 = E(X) and y 0 = f (X 0 ). Then the Taylor series expansion of the
distance between the output and the expected output is

n   n n  2 
X ∂y 1 XX ∂ y
y−y 0 0 = (xi −x0i ) + (xi −x0i )(xj −x0j ) +. . .
∂xi X 0 2 ∂xi ∂xj X 0
i=1 i=1 j=1
(2.9)
Close to X 0 the higher powers of the series will be small. One can then use
only two terms of the expansion and get a good approximation.

n  
0
X ∂y
E(y − y ) ≈ E(xi − x0i )
∂xi X0
i=1
n Xn
∂2y
 
1 X
+ E[(xi − x0i )(xj − x0j )] (2.10)
2 ∂xi ∂xj X0
i=1 j=1
n X n
∂2y
 
1 X
= Covar(xi , xj )
2 ∂xi ∂xj X0
i=1 j=1

As x0i is theh expected


i value of xi , the expected value of xi − x0i is 0. The
∂y
expression ∂x i 0
denotes the value of the derivative when evaluated at
X
X 0.
It is also common to use rst order approximation, for which the result
for the expected value of the deviation y − y0 is approximately 0. One can

9
then compute an approximation to the variance of y − y0 :

Var(y) = E[(y − y 0 )2 ]
 
n   !2
X ∂y
≈ E (xi − x0i ) 
∂xi X 0
i=1
n n    
XX
0 0 ∂y ∂y
= E[(xi − xi )(xj − xj )]
∂xi X 0 ∂xj X 0
i=1 j=1
Xn X n 
∂y
 
∂y
 (2.11)
= Covar(xi , xj )
∂xi X 0 ∂xj X 0
i=1 j=1
n
∂y 2
X  
= Var(xi )
∂xi X 0
i=1
n X n    
X ∂y ∂y
+2 Covar(xi , xj )
∂xi X 0 ∂xj X 0
i=1 j=i+1

If the input values are assumed to be independent or uncorrelated, one is


left with only the variance terms.

Methods for Monte Carlo simulation


When using Monte Carlo simulation, the expectation and the variance of
the output can be estimated from the output values of the simulation runs,
but it is important to do enough simulation runs. To compute how many
simulation runs are needed to obtain a certain width of the condence in-
tervals computed, one can estimate the variance by doing a few simulation
iterations and then use the formula for the condence interval based on the
central limit theorem to compute the needed number of simulations.
The central limit can be found in any basic book on statistics, for ex-
ample [13]. The theorem states that the statistic

ȳ − µ
Z= (2.12)
√σ
n

where ȳ is the mean of a random sample from a population with mean


value µ and variance σ 2 is distributed according to the standard normal
distribution when n → ∞. This can be used to compute a condence
interval for the mean. A 1 − α% condence interval is then given by
 
σ σ
ȳ − z α2 √ , ȳ + z α2 √ (2.13)
n n

10
and the width of the condence interval is found by subtracting the lower
limit from the upper limit. This gives
 
σ σ
W = ȳ + z 2 √ − ȳ − z 2 √
α α
n n
(2.14)
σ
= 2z α2 √
n
For a given width W , the equation can be solved for n. The result is
 α 2  α 2
z2 z
n=4 2
σ = 4 2 Var(y) (2.15)
W W
There are also methods for computing uncertainty importance for Monte
Carlo models, but these need to be implemented in the simulation software,
as it is necessary to know all input values, and these are drawn as the
simulation progresses and usually not stored as this would require very
much storage capacity.
One possible measure is the sample correlation. Let m be the number
of runs, and let xk be a single input value in run number k and yk be the
corresponding output. The correlation between these can then be computed
as Pm
(xk − x̄)(yk − ȳ)
U (x, y) = pPm k1 Pm (2.16)
2 2
k=1 (xk − x̄) × k=1 (yk − ȳ)
This can be computed for each input value. This assumes that each simula-
tion run has the same number of input values, which is not the case for many
simulation methods. One could for example do the calculations for the rst
input drawn for each item being simulated. To do the calculations at all, it
will be necessary to implement the method in the simulation methodology.
If the method is not implemented when the simulation is done, it cannot
be employed.

2.2.3 Uncertainty due to the procedure


In simulation, the error will depend on the number of simulations made.
This is due to the fact that simulation is a way of creating a random sample.
The size of the sample will be determined by the number of simulation
iterations, and is important to the accuracy of the computations made using
the data.
The number of simulations will most often depend on cost of simulation
and available time. Assumptions for the model will of course also give rise
to uncertainty.
For analytical methods the errors arising will depend on whether the
assumptions for the methods and models are good enough and t the real
life situation.

11
The assumptions in an analytical model will be fashioned to come close
to a real life system, but will to a large degree depend on what can be
calculated. For large systems, analytical calculations can be prone to round
o error.

12
Chapter 3

Methods for availability


calculations
3.1 Calculation of availability
When it is assumed that the components of a system are independent,
availability of the system can be calculated from the structure function.
The structure function of a system is found by applying the simple rules for
parallel and series structures. For a series structure, the system will fail if
one of the components fail, so all components must be functioning for the
system to function. The structure function of a series structure consisting
of n components with
(
1 if the component is functioning
xi = (3.1)
0 otherwise

is
n
Y
X= xi (3.2)
i=1
For a parallel structure, the system will function as long as one or more
components are functioning. The structure function is given by
n
Y
X =1− (1 − xi ) (3.3)
i=1

These equations can be combined when a system consists of both parallel


and series structures. If the system is represented by a reliability block
diagram, it will in most cases be possible to use these equations directly. For
some structures, it will be easier to nd the structure function by employing
pivotal decomposition or other methods which enable division of the system
into more easily manageable parts, see [11].

13
The structure function can also be found from the cut sets of a system.
A cut set is a set of components so that if all components of the cut set are
failed, the system has failed. For the system shown in gure 3.1, the cut
sets are {1}, {2, 3, 4}, {5}, {6, 7}. The structure function can then be made
by considering the components in the cut sets to be in parallel, and the cut
sets themselves to be in series, as all cut sets must function, but only one
component in each cut set is needed for the system to function.
Availability for the components can be calculated in many ways, the
most common of which is to compute it as
MTTF
ai = (3.4)
MTTF + MTTR
which gives the mean availability. However, many other methods may be
used.
When analytical methods are used, it is assumed that the result is accu-
rate. Usually, no condence interval or similar measure is computed. The
accuracy of the result will depend on the degree to which the assumptions
of the method agree with the real world situation. This may lead to grave
errors if not handled with care, as many assumptions made in analytical
methods are completely implausible in the real world, as discussed previ-
ously. Another source of error in analytical calculations is inaccuracy in the
input data. This has also been discussed in previous sections.
Most methods presented in this thesis assume independent components.
If this assumption is to be avoided, possible methods are Markov chains
and simulation. Later sections will present both of these options.

3.2 Calculation of production availability


If it is assumed that the system can either produce at full capacity or
not produce at all, production availability can be found by multiplying the
mission availability with the production capacity of the system and dividing
it by planned production. In many cases, the problem will be far more
complex, as the production capacity can be reduced by an accident, and
the reduction in capacity can vary with which component or combination
of components that are not functioning. The problem is the calculation of
the ow through the network, and the calculation of amounts of time the
system is in dierent states.
As analytical methods have limitations when production availability is
to be calculated, simulation is often used. Next event Monte Carlo simula-
tion is especially useful in this context, as it can be adapted to almost any
system. For a complex system, the ow calculation will be a challenge, but
there are algorithms for such problems, see for example [6] where maximum
ow algorithms are presented. MIRIAM Regina is an example of a software

14
combining next event Monte Carlo simulation with a ow algorithm for
calculation of production availability. Simulation is more complex than an-
alytical calculation if the simulation algorithm must be implemented before
simulation can start. If a software with the simulation algorithm already
implemented is available, simulation may be less complex than analytical
calculations.

3.3 An analytical method for calculating produc-


tion availability
In [9] an analytical method for calculating production availability is de-
scribed. The method is based on nding the probability distribution for
the ow through the system by for example using Markov chains to nd
the probability distribution for subsystems and then applying some simple
rules for merging the results of the subsystems. The merging rules can be
applied regardless of how the probability distribution for the subsystems
was found.
If Markov chains are the preferred method, the calculation proceeds as
follows: The system under consideration is divided into subsystems. For
each subsystem, all possible states must be listed, and a Markov chain
established, se

2
6

1 3 5

Figure 3.1: The example system

According to the article [9], one of the advantages of the analytical meth-
ods is ease of calculation, but the advantage of simulation is the possibility
of incorporating complex maintenance and repair strategies in the model.
The reason for using Markov modeling in the method proposed in the arti-
cle is to make it possible to incorporate such strategies into the analytical

15
Subsytem 2
Subsytem 4
2
Subsytem 1 Subsytem 3 6

1 3 5

Figure 3.2: The example system with subsystems shown as red frames

model. Markov chains can be adapted to almost any degree of complexity,


though the number of possible states for a system will increase rapidly. To
an extent this can be countermanded by splitting the system up into more
subsystems. The limitations for the method consists in the amount of work
and computation time available.
The method consists of rst dividing the system into subsystems and
analyzing the subsystems to nd the dierent states it can be in. Each state
of the subsystem will be associated with a production volume decided by
the capacities of the components that are functioning. Markov modeling is
then used to establish a probability distribution for the production capacity.
When this is done for all subsystems, a set of simple rules is used to merge
the probability distribution for the dierent subsystems. The merging rules
are as follows:
For two subsystems connected in series, with discrete probability distri-
bution for the production volume: Let X be the volume produced by the
rst subsystem, and Y be the volume produced by the second subsystem,
and let Z denote the produced volume of the merged subsystem. Then

P (Z = z) = P (X = z)P (Y ≥ z) + P (Y = z)P (X > z) (3.5)

If the probability distributions are continuous, let X f (x), Y g(y) and


Z h(z). The rule is
Z ymax Z xmax
h(z) = f (z) g(y)dy + g(z) f (x)dx (3.6)
z z

For two subsystems connected in parallel, with discrete probability dis-

16
tributions, the rule is
X
P (Z = z) = P (X + Y = z) = P (X = x)P (Y = y) (3.7)
x,y;x+y=z

and with continuous probability distribution,


Z z
h(z) = f (x)g(z − x)dx (3.8)
xmin

3.3.1 The method used on an example


The example system is shown in gure 3.1 above. The subsystem division
is shown in gure 3.2, and data for the system is shown in table 3.1.

Component MTTF λ MTTR µ Availability Capacity


(hours) (hours−1 ) (hours) (hours−1 ) %
1 500 0.002 3 0.33 0.994 100
2 600 0.00167 7 0.143 0.988 40
3 400 0.0025 4 0.25 0.99 60
4 700 0.00143 2 0.5 0.997 50
5 600 0.00167 9 0.11 0.985 100
6 500 0.002 7 0.143 0.986 50
7 600 0.00167 4 0.25 0.993 60

Table 3.1: Data for the system in the example

All subsystems in this example have a nite number of possible states,


and the probability distributions for the ow are then discrete. For subsys-
tem 1, which only contains one component, there are only two states, failed
or in order. The probability distribution is thus easily determined. Either,
the component is working, and 100% ow goes through the subsystem, or
the component is failed and no ow can pass. The probability of 100% ow
is the same as the probability of the component being in a functioning state.
The states of the simple systems are given in table 3.2, and the transition
diagram is shown in gure 3.4.

State No. Description Capacity


1 Component working 100
2 Component failed 0

Table 3.2: The possible states of subsystems number 1 and 3

Subsystem 2 has three components and a number of possible states,


listed in table 3.3
Let λi and µi denote respectively the failure and repair frequencies of
component number i. To compute the probability distribution, Markov

17
State No. Description Capacity
1 All components working 100
2 Comp. 2 and 3 working, 4 failed 100
3 Comp. 3 and 4 working, 2 failed 100
4 Comp. 2 and 4 working, 3 failed 90
5 Comp. 2 working, 3 and 4 failed 40
6 Comp. 3 working, 2 and 4 failed 60
7 Comp. 4 working, 2 and 3 failed 50
8 All failed 0

Table 3.3: The possible states of subsystem number 2

chains are employed. Basic Markov chain theory can be found for example
in [11] and [12]. The transition diagram of the system is shown in gure
3.3. The transition matrix of the Markov chain is then
 
α1 λ3 λ1 λ2 0 0 0 0

 µ4 α2 0 0 λ3 λ2 0 0 


 µ2 0 α3 0 0 λ4 λ3 0 

µ3 0 0 α4 λ4 0 λ2 0
(3.9)
 
 

 0 µ3 0 µ4 α5 0 0 λ2 


 0 µ2 µ4 0 0 α6 0 λ3 

 0 0 µ3 µ2 0 0 α7 λ4 
0 0 0 0 µ2 µ3 µ4 α8
Where
−(λ2 + λ3 + λ4 )
   
α1

 α2  
  −(λ2 + λ3 + µ4 ) 


 α3  
  −(λ3 + λ4 + µ2 ) 

α4 −(λ2 + λ4 + µ3 )
(3.10)
   
 = 

 α5  
  −(µ3 + µ4 + λ2 ) 


 α6  
  −(µ2 + µ4 + λ3 ) 

 α7   −(µ2 + µ3 + λ4 ) 
α8 −(µ2 + µ3 + µ4 )
Subsystem number three again consists of only one component, and is
treated just like subsystem one. Subsystem 4 consists of two components
in parallel, and has four states, as described in table 3.4.
The transition diagram is shown in gure 3.5, and the transition matrix
for subsystem 4 is
 
1 − (λ6 + λ7 ) λ7 λ6 0

 µ7 1 − (λ6 + µ7 ) 0 λ6  (3.11)

 µ6 0 1 − (λ7 + µ6 ) λ7 
0 µ6 µ7 1 − (µ6 + µ7 )

18
2 5

1 3 6 8

4 7

Figure 3.3: The transition diagram for subsystem 2

19
λ

Figure 3.4: The transition diagram for subsystems 1 and 4

20
1 2

3 4

Figure 3.5: The transition diagram for subsystem 3

21
State No. Description Capacity
1 Both components working 100
2 Comp. 6 working, 7 failed 50
3 Comp. 7 working, 6 failed 60
4 Both components failed 0

Table 3.4: The possible states of subsystem number 4

To calculate the probability distribution for the dierent states, one


must nd the limiting probabilities for the states. As the chains are all
irreducible and positive recurrent, the limiting probabilities exists and can
be found from the equations
X
πj = πi Pij
i
X (3.12)
πj = 1
j

where Pij denotes the transition probability from state i to state j . These
equations give rise to a system of linear equations, which can easily be
solved by mathematical software equipped for matrix computations. The
limiting probabilities can be interpreted as the amount of time the system
spends in the respective states. For subsystem 4, the equation is
   

−(λ6 + λ7 ) µ7 µ6 0 π1 0

 λ7 −(λ6 + µ7 ) 0 µ 6   π2
    0 
 =   (3.13)
 λ6 0 −(λ7 + µ6 ) µ7   π3   0 
1 1 1 1 π4 1

When numbers are inserted and the equation solved (using the statistical
software R which includes facilities for matrix computations), the solution
is found to be
9.7966 · 10−1
   
π1
 π2   6.5441 · 10−3 
 π3  =  1.3702 · 10−2 
    (3.14)
π4 9.1527 · 10−5

For the other subsystems, the calculations are done in exactly the same
manner. Subsystem 1, consisting of only one component has the limiting
probabilities
   
π1 0.9940
= (3.15)
π2 0.0060

22
Subsystem 2 gives the limiting probabilities
9.758788 · 10−1
   
π1
 π2   2.791013 · 10−3 
1.139663 · 10−2
   
 π3   
9.758788 · 10−3
   
 π4  
(3.16)

 π5  =  2.791013 · 10−5
   

   
 π6  
   3.259435 · 10−5 

 π7   1.139663 · 10−4 
π8 3.259435 · 10−7
and for subsystem 3, the result is
   
π1 0.9850
= (3.17)
π2 0.0150
With the results above, the probability distributions for the ow through
each subsystem is determined. Now the subsystems must be merged, using
the rules stated in section 3.3 above. First, subsystems one and two are
merged. The possible amounts of ow for subsystem 1 are 0% and 100%.
Subsystem two has the possibilities 0%, 40%, 50%, 60%, 90% and 100%.
(Subsystem two in reality has larger capacity, but will never recieve ow
greater than 100%, so this is ignored). The subsystems are connected in
series and have a nite number of possible amounts of ow, so the proba-
bility distribution of the ow is determined by equation (3.5). X is the ow
through the rst subsystem, Y is the ow through the second.

P (Z = 0) = P (X = 0)P (Y ≥ 0) + P (Y = 0)P (X > 0)


= 0.0060 · 1 + 3.2594 · 10−7 · 0.9940 (3.18)
= 6.0003 · 10−3

P (Z = 40) = P (X = 40)P (Y ≥ 40) + P (Y = 40)P (X > 40)


= 0 + 2.7910 · 10−5 · 0.9940 (3.19)
−5
= 2.7743 · 10

P (Z = 50) = P (X = 50)P (Y ≥ 50) + P (Y = 50)P (X > 50)


= 0 + 1.1397 · 10−4 · 0.9940 (3.20)
= 1.1329 · 10−4

P (Z = 60) = P (X = 60)P (Y ≥ 60) + P (Y = 60)P (X > 60)


= 0 + 3.2594 · 10−5 · 0.9940 (3.21)
−5
= 3.2398 · 10

23
P (Z = 90) = P (X = 90)P (Y ≥ 90) + P (Y = 90)P (X > 90)
= 0 + 9.7588 · 10−3 · 0.9940 (3.22)
−3
= 9.7002 · 10

P (Z = 100) = P (X = 100)P (Y ≥ 100) + P (Y = 100)P (X > 100)


= 0.994 · (9.7588 · 10−1 + 2.7910 · 10−3 + 1.1397 · 10−2 ) + 0
= 0.98413
(3.23)
To sum up, the probability distribution for the dierent amounts of ow for
the merged subsystem is given in table 3.5p, the probability distribution for
the dierent amounts of ow for the merged subsystem is given in table 3.5
Next, subsystems 3 and 4 are merged. The process is as described above,

Amount of ow Probability


0 6.0003 · 10−3
40 2.7743 · 10−5
50 1.1329 · 10−4
60 3.2398 · 10−5
90 9.7002 · 10−3
100 0.98413

Table 3.5: The probatility distribution for the ow through the merging of
subsystem 1 and 2

and the result is given in table 3.6. At last, the two merged systems are

Amount of ow Probability


0 0.01509
50 0.006446
60 0.01350
100 0.9650

Table 3.6: The probability distribution for the ow through the merging of
subsystem 3 and 4

merged, and the result for the whole system is obtained. It is given in table
3.7.
The production availability of the system then becomes
Aproduction =40 · 2.7324 · 10−5 + 50 · 6.5180 · 10−3 + 60 · 1.3448 · 10−2 + 90 · 9.3607 · 10−3
+ 100 · 0.94969 = 96.94533
(3.24)

24
Amount of ow Probability
0 2.1000 · 10−2
40 2.7324 · 10−5
50 6.5180 · 10−3
60 1.3448 · 10−2
90 9.3607 · 10−3
100 0.94969

Table 3.7: The probability distribution for the ow through the whole sys-
tem.

If one compares this result to the result obtained in the section above,
the results are quite similar. The availability of the system can be computed
as the amount of time the ow is larger than 0, that is

1 − 2.1 · 10−2 = 0.979 (3.25)

3.4 %availability
An alternative and less accurate approach to estimating the produced amount
from a system is to compute the amount of time when a given percentage of
the demanded ow is available. This gives an idea of how much dierence
there is between the availability of a system and the production availability.
For systems with parallel structures, dierent congurations are possi-
ble. If a system has three parallel streams, one, two or all three may have
to work for the system to be functioning. Generally, such congurations are
denoted by the expression M ooN , interpreted as M out of N components
must function for the system to function. A system with tree components
in parallel can thus have the congurations 1oo3, 2oo3 or 3oo3. Availability
for the system can be computed for all such congurations.
If considering the system of the example above, shown in gure 3.1, with
data given in table 3.1, it is easily seen that to be sure of having 100 percent
ow passing through the system, all components must be operational. 90
percent ow is ensured if the rst parallel has conguration 2oo3 and the
second conguration 3oo3. If one has conguration 1oo3 and 1oo2, 40
percent ow is ensured, but 50 or 60 percent is more likely. These dierent
availabilities can be computed by applying the structure function of the
system. The structure function for the system with conguration 3oo3 and
2oo2 respectively for subsystem 2 and 4, is given by

A = a1 a2 a3 a4 a5 a6 a7 (3.26)

as this is the same as requiring all components to be working.

25
For conguration 2oo3 for subsystem 2 and 2oo2 for subsystem 4, the
availability is given by

A = a1 (1 − (1 − a2 a3 )(1 − a2 a4 )(1 − a3 a4 ))a5 a6 a7 (3.27)

and for congurations 2oo3 and 1oo2, the equation is

A = a1 (1 − (1 − a2 a3 )(1 − a2 a4 )(1 − a3 a4 ))a5 (1 − (1 − a6 )(1 − a7 )) (3.28)

When numbers are inserted, the probabilities are found to be as given in


table 3.8. It is clear that 100% ow is available most of the time, but as
seen, approximatly 3.5% of the time when the system is available, it does
not produce full ow.

Percent available Availability


40 0.97899
50 0.97898989
90 0.95862
100 0.93484

Table 3.8: The percent availabilities for the example system

3.5 Computing production availability when avail-


ability is known
In this thesis, it is of interest to compare results obtained from dierent
methods. One of these methods, the reliability software MIRIAM Regina,
calculates production availability. Another method, the software Relex re-
liability studio, computes availability. It will also be of interest to compare
the results with results obtained from analytical methods. It is thus of
interest to be able to compare production availability and availability.
For a simple series structure, the production availability can easily be
calculated from the availability by multiplying the computed availability
with the capacity of the structure. The capacity of a series structure is
decided by the component with the lowest capacity. For more complex
structures, the problem of calculating the production availability quickly
gets far more involved.
When computing availability, one assumes that the system is either
working or not working. In computing production availability, degraded
states of the system are considered if they are present. One can however
compare the amount of time with no production with the general unavail-
ability calculation. These should be comparable, as no ow through the
system is the same as the system being unavailable, provided that there

26
is no way there can be ow going through the system when availability
calculations consider the system to be failed.
If a more accurate calculation is needed, the system must be considered
in more detail. Diculties arise when there are several alternate routes
for the production ow with dierent capacities. It is then necessary to
determine the average amount of time when each of these routes are being
used.
When computing the availability of the system with conguration 1oo3,
the result is the amount of time when 0, 1 or 2 streams are failed, that
is, 1, 2 or 3 streams are working. Using conguration 2oo3 gives the time
when 2 or 3 stream are working, and conguration 3oo3 gives the time
when 0 streams have failed, and consequently all streams are working. By
calculating all these availabilities and looking at the dierences between
them, one can nd the amounts of time when exactly 1, 2 or 3 streams are
working.
Assuming that the ratio of time for these states is constant whether the
system is working or not, one can nd the probability of 1, 2 or 3 streams
working given that the system is working. This is a natural assumption as
long as the components all have time measured as either calendar time or
running time.
For three components in parallel, denoted 1, 2 and 3, with availability
a1 , a2 and a3 , the relevant expressions can be found by rst determining
the structure function for each of the components and then inserting the
availability of the components instead of the state variables x1 , x2 and x3 .
The structure functions are:
Conguration 1oo3:

X(t) = 1 − (1 − x1 )(1 − x2 )(1 − x3 )


= 1 − (1 − x1 − x2 + x1 x2 )(1 − x3 )
(3.29)
= 1 − (1 − x1 − x2 + x1 x2 − x3 + x1 x3 + x2 x3 − x1 x2 x3 )
= x1 + x2 + x3 − x1 x2 − x2 x3 − x1 x3 + x1 x2 x3

Conguration 2oo3:

X(t) = 1 − (1 − x1 x2 )(1 − x2 x3 )(1 − x1 x3 )


= 1 − (1 − x1 x2 − x2 x3 + x1 x22 x3 )(1 − x1 x3 )
= 1 − (1 − x1 x3 − x1 x2 + x21 x2 x3 − x2 x3 + x1 x2 x23 + x1 x22 x3 − x21 x22 x23 )
= x1 x2 + x2 x3 + x1 x3 − x21 x2 x3 − x1 x22 x3 − x1 x2 x23 + x21 x22 x23
(3.30)

Conguration 3oo3 equals a series structure:

X(t) = x1 x2 x3 (3.31)

27
3.5.1 Example
An easy example of the method described above. The system consists of
seven components, for which data is given in table 3.1. A diagram showing
the ow through the system is shown in gure 3.1. The capacities are
given in percent of required ow. It is assumed that all components are
independent. Further, it is assumed that the steady state availabilities can
be employed.
The steady state availability of the system is easily calculated from the
structure function together with the denition of the steady state availabil-
ity. If the second subsystem is assumed to have conguration 1oo3 and the
fourth subsystem has conguration 1oo2, the structure function is

X = x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − (1 − x6 )(1 − x7 )) (3.32)

When the steady state availabilities given in table 3.1 are put into this
equation, the result is A = 0.97899.
To compute the production availability of the system, the system is rst
decomposed into subsystems which are in series with each other. Thus,
there are 4 subsystems, the rst consisting of only component number 1,
the second of the three components 2, 3 and 4 which are in parallel with
each other, the third being only component number 5 and the last consisting
of components number 6 and 7, which again are in parallel. This is shown
in gure 3.2. When decomposing into such subsystems, it is important
that each component in the system only is part of one subsystem, as one
otherwise will have dependencies between the subsystems. In this case this
is no problem.
For the rst subsystem, calculating production availability is easy, as
the component will either let through 100% of the wanted production, or
none.
For the subsystem consisting of three components in parallel, the situa-
tion is that if all three components function, there is superuous capacity.
If only two streams are available, one may or may not have the demanded
capacity, depending on which streams are available. If only one stream is
available, how much capacity is available will also depend on which stream
is functioning.
To calculate the availability of the streams, rst calculate the availability
if the system has conguration 1oo3. This is given by

A1oo3 = 1 − (1 − a2 )(1 − a3 )(1 − a4 ) = 0.99999964 (3.33)

For conguration 2oo3, the equation is

A2oo3 = 1 − (1 − a2 a3 )(1 − a3 a4 )(1 − a2 a4 ) = 0.999995754 (3.34)

28
and for conguration 3oo3, which equals having a series system, as all com-
ponents must function for the system to function,

A3oo3 = a2 a3 a4 = 0.97518564 (3.35)

By looking at the dierences between these, one can nd the amount
of time when 1, 2 or 3 streams are available. Conguration 1oo3 gives the
amount of time when 1, 2 or 3 streams are available, conguration 2oo3
gives the availability of 2 or 3 streams and conguration 3oo3 gives the
availability of 3 streams. Thus 3 streams are available

A3oo3 = 0.975 (3.36)

that is, 97.5% of the time. Two streams have avilability

A2oo3 − A3oo3 = 0.024996 (3.37)

that is they are available 2.4996% of the time, and only one stream is
available
A1oo3 − A2oo3 = 3.883 · 10−6 (3.38)
To decide the amount of time each stream or combination of streams
is available when the dierent streams have dierent availabilities, the pro-
portion of the availability of the relevant stream or combination of streams
to the availability of all the streams is computed. When one stream is
available, these proportions are

0.988
s1 = = 0.332
0.988 + 0.99 + 0.997
0.99
s2 = = 0.333 (3.39)
0.988 + 0.99 + 0.997
0.997
s3 = = 0.335
0.988 + 0.99 + 0.997
When two streams are working, the proportions are
0.988 · 0.99
s1,2 = = 0.3315
0.988 · 0.99 + 0.99 · 0.997 + 0.988 · 0.997
0.99 · 0.997
s2,3 = = 0.3345 (3.40)
0.988 · 0.99 + 0.99 · 0.997 + 0.988 · 0.997
0.988 · 0.997
s1,3 = = 0.334
0.988 · 0.99 + 0.99 · 0.997 + 0.988 · 0.997
The availability of the third subsystem, which consists of a single com-
ponent is given the data table 3.1. By calculating these numbers together,
one gets the result given in table 3.9. For example, 40% ow will be avail-
able if the rst and third subsystem is functioning, and only component

29
number 2 out of the tree components of subsystem number 2. The equation
is

A40 = a1 · a5 · a1oo3−2oo3 · s1
= 0.994 · 0.985 · 3.883 · 10−6 · 0.332 (3.41)
−5
= 1.328 · 10

The same is done for all other possible ows.

Flow available Availability


40 1.328 · 10−5
50 1.34 · 10−5
60 1.332 · 10−5
90 0.00802
100 0.971

Table 3.9: Availability of dierent amounts of ow through subsystem 1 to


3 of the example

When this is to be combined with the next parallel subsystem, one


must look at each possible amount of ow from upstream and consider
what will be available downstream. First, availability must be computed
for the dierent streams of the subsystem. This is done as above for the
second subsystem, keeping in mind that there are only two streams in this
case. The results are

A2oo2 = 0.979
A1oo2−2oo2 = 0.0209
(3.42)
s4 = 0.498
s5 = 0.502

Combined with the dierent amounts of ow available from upstream, the
result for availability of dierent amount of ow becomes as described in
table 3.10. When calculating these availabilities, it is important to remem-
ber that if 90% ow comes from upstream, it may result in 50%, 60% or
90% ow downstream, depending on which components are available in the
subsystem.
The average production availability of the system is then

40 · 1.3279 · 10−5 + 50 · 0.0102 + 60 · 0.0103 + 90 · 0.00785 + 100 · 0.951 = 96.935


(3.43)
These computations could have been done by employing the merging
rules from the method above. When other methods are used to compute
the availability of the system and the streams, they will be used.

30
Flow available Availability
40 1.3279 · 10−5
50 0.0102
60 0.0103
90 0.00785
100 0.951

Table 3.10: Availability of dierent amounts of ow through the system of


the example

The usual computation of the availability if one assumes that the rst
parallel has conguration 1oo3 and the second 1oo2, is

A = a1 (1 − (1 − a2 a3 )(1 − a3 a4 )(1 − a2 a4 ))a5 (1 − (1 − a6 )(1 − a7 )) = 0.979


(3.44)
If this is used to compute the production availability by computing with the
full ow, the estimate will be optimistic as it does not account for the times
when not all streams are working, and consequently, the system cannot
produce full ow.
If in contrast one assumes that all components must be operational for
the system to function, one would ensure that full ow can pass through
whenever the system is working. To compute this availability one must
assume congurations 3oo3 and 2oo2 for respectively the rst and second
parallel. This gives an availability as for a series system:

Afull ow = a1 a2 a3 a4 a5 a6 a7 = 0.935 (3.45)

Multiplied with the full ow, one gets the result that the production avail-
ability is 93.5 percent. This is a conservative estimate as it only accounts
for the time when full ow can pass for the system. The actual production
availability will lie somewhere between these two bounds. For organisations
there may be much money saved by having more accurate estimates.

3.5.2 Computation of variability


To compute variability, the methods presented in section 2.2.2 are applied.
To compute the variability and estimate the variance, the derivative of the
system function with respect to the input parameters are needed. As the
system availability is a function of the component availabilities, and the
component availabilities are functions of the input parameters, the deriva-
tion can be written as follows
∂y ∂y ∂xi
= (3.46)
∂α ∂xi ∂α

31
Where α is an input parameter. The derivation of the structure function
with respect to the dierent component values are given by

∂y
= (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − (1 − x6 )(1 − x7 ))
∂x1
∂y
= x1 x5 (x6 + x7 − x6 x7 )(1 − x3 − x4 + x3 x4 )
∂x2
∂y
= x1 x5 (x6 + x7 − x6 x7 )(1 − x2 − x4 + x2 x4 )
∂x3
∂y
= x1 x5 (x6 + x7 − x6 x7 )(1 − x2 − x3 + x2 x3 ) (3.47)
∂x4
∂y
= x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))(1 − (1 − x6 )(1 − x7 ))
∂x5
∂y
= x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − x7 )
∂x6
∂y
= x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − x6 )
∂x7

For the availability calculations given above, the input parameters were
MTTF and MTTR. If we let αi be the MTTF, and βi the MTTR, the
component availabilities which were put into the structure function are given
by

αi
ai = (3.48)
αi + β i

The derivatives are given by

∂ai 1 αi
= −
∂αi αi + βi (αi + βi )2
(3.49)
∂ai αi
=−
∂βi (αi + βi )2

The derivatives are to be evaluated at the expected values of the pa-

32
rameters. When values are inserted, the results are

∂y
= 0.98490
∂x1
∂y
= 2.9370 · 10−5
∂x2
∂y
= 3.5244 · 10−5
∂x3
∂y
= 1.1748 · 10−4 (3.50)
∂x4
∂y
= 0.99390
∂x5
∂y
= 0.0068536
∂x6
∂y
= 0.013707
∂x7

and the derivatives of the component availabilities evaluated at the expected


values are

∂a1
= 1.1857 · 10−5
∂α1
∂a2
= 1.8999 · 10−5
∂α2
∂a3
= 2.4507 · 10−5
∂α3
∂a4
= 4.0584 · 10−6 (3.51)
∂α4
∂a5
= 2.4267 · 10−5
∂α5
∂a6
= 2.7232 · 10−5
∂α6
∂a7
= 1.0964 · 10−5
∂α7

33
and for the derivatives with respect to the MTTR, the results are
∂a1
= −0.0019762
∂β1
∂a2
= −0.0016284
∂β2
∂a3
= −0.0024507
∂β3
∂a4
= −0.0014204 (3.52)
∂β4
∂a5
= −0.0016178
∂β5
∂a6
= −0.0019452
∂β6
∂a7
= −0.0016447
∂β7
To nd the variance of y , equation (2.11) is employed. As the com-
ponents are assumed to be independent, only the variance terms of the
equation are dierent from zero, and the equation becomes
n n
∂y 2 ∂y 2
X   X  
Var(y) = Var(αi ) + Var(βi ) (3.53)
∂αi X 0 ∂βi X 0
i=1 i=1

The variances of the parameters are given in table 3.11. When the values
are put into the equation above, the result is

Var(y) = 1.3881 · 10−8 + 4.2216 · 10−6 = 4.235481 · 10−6 (3.54)

This is a very small variance, and that is to be expected, as it is estimated


from the variation in the input parameters. It shows that the uncertainty
in the input parameters do not inuence the result signicantly unless the
variation is large.
Computation of condence interval is not obvious as the distribution of
y is not a common distribution, but a scalar. If it is assumed that the errors
are normally distributed, the 95% condence interval becomes

[0.97899−1.96·4.2355·10−6 , 0.97899+1.96·4, 2355·10−6 ] = [0.97898, 0.97900]


(3.55)

3.6 Renewal processes and their application in re-


liability calculations
A renewal process is a counting process. The most common example of
a counting process is the Poisson process, which describes the number of

34
Component MTTF Var(MTTF) MTTR Var(MTTR)
1 500 25 3 0.5
2 600 22 7 0.7
3 400 16 4 0.3
4 700 30 2 0.2
5 600 18 9 0.9
6 500 15 7 0.5
7 600 20 4 0.8

Table 3.11: The variances of the MTTR and MTTF for the example data

occurances of a specic situation in an interval, for example the number of


customers arriving at a store during one hour. In the Poisson process, times
of arrival are recorded, and the time between arrivals are iid and exponential
A number of statistics are of interest when considering a renewal process.
Let the time between renewal number i − 1 and i be denoted Ti . Then the
time to the nth renewal is
Xn
Sn = Ti (3.56)
i=1

The distribution of Sn will of course depend on the distribution of the Ti s,


and is found by convolution, see [11], but as this is often time consum-
ing, approximations are often used. Further, one may be interested in the
number of renewals up to a specic time. This is denoted

N (t) = max{n : Sn ≤ t} (3.57)

and the renewal function W (t) and density w(t) = W 0 (t), where

W (t) = E(N (t)) (3.58)

N (t) and W (t) can be derived from Sn , see [11].


Renewal processes can be utilised in reliability theory when one consid-
ers a repairable component. The component will fail after a time, but as
perfect repair is often assumed it is reasonable to assume that the time to
failure will have the same distribution when repairs have been performed.
If an exponential distribution for the time to failure is assumed and time to
repair is considered to be neglectible, we have a Poisson process.
If time to repair cannot be neglected, one has what is called an alternat-
ing renewal processes, see [11]. Alternating renewal processes are systems
where two processes succeed each other so that time to the rst event is
decided by the rst process, time to the second event by the second pro-
cess, time to the third event by the rst process again, and so on. In this
manner, time to failure can be modeled by one renewal process and time

35
to repair by another. In this situation, a renewal is considered to happen
when a component is returned to operating condition, that is,

Ti = Ui + Di (3.59)

where Ui is the uptime of the component, or time to failure, and Di is the


downtime or time to repair.
In reliability, quasi-renewal processes are also relevant. These make
it possible to model systems which have increasing or decreasing MTTF,
that is increasing or decreasing failure rate. It is thus not possible to use
a quasi-renewal process if the time to failure is exponentially distributed.
The quasi-renewal process is dened by

T1 = Z1
T2 = αZ2
T3 = α2 Z3 (3.60)
....
..
Tn = αn−1 Zn

Where Ti are the successive times to failure and Zi , i = 1 . . . n are iid.


With α > 1, one gets a process with increasing MTTF, while α < 1 will
give decreasing MTTF.
Let fi (t) denote the distribution of Ti . It can then be shown that, see
[14],
fn (t) = α1−n f1 (t)
Fn (t) = F1 (α1−n t)
sn (t) = s1 (α1−n t)
(3.61)
rn (t) = α1−n r1 (α1−n t)
E(Tn ) = αn−1 E(T1 )
Var(Tn ) = α2n−2 Var(T1 )
Where sn is the survival function and rn is the failure rate.
In [15] and [14], results are derived for the quasi-renewal situation.
Among other results, it is shown that if the distribution of T1 is in one
of the classes IFR, DFR, IFRA, DFRA, NBU or NWU, then the distribu-
tion of Tn will belong to the same class of distributions. This is a logical
assumption for a repairable component. A failure and repair may make
the component deteriorate, but as the same factors must be assumed to be
present to cause failures, the shape of the distribution of time to failure
should be the same.
As for renewal processes, one can also have alternating quasi-renewal
processes. It is then assumed that the repair rate increases or decreases for

36
each repair. One may of course assume that the repair rate stays the same
by letting α = 1.

3.7 Renewal processes and availability


The availability of systems where the life of the components are modeled
as an alternating renewal processes can be calculated from the properties
already stated. The result for the limiting availability is the same as the
usual result, see [11]:
E(U ) MTTF
A = lim A(t) = = (3.62)
t→∞ E(U ) + E(D) MTTF + MTTR
provided it is assumed that MTTR equals MDT. Availability at a specic
point in time can be calculated using Laplace transforms, and the result is
that
1 − fU∗ (s)
A∗ (s) = (3.63)
s(1 − fU∗ · fD
∗ (s))

gives the Laplace transform of the availability. In principle, any failure and
repair distributions can be entered into this result to obtain an expression
for the availability. In practice, it will sometimes be very time consuming.
For the case of exponential time to failure and repair, the result can be
calculated relatively easily, as shown in [11]. Let the failure distribution be
fU (t) = λe−λt (3.64)
and let the repair distribution be
fD (t) = µe−µt (3.65)
The Laplace transforms of these are
λ
fU∗ (s) =
λ+s (3.66)
∗ µ
fD (s) =
µ+s
When putting this into (3.63) one gets
λ
1− λ+s
A∗ (s) = λ µ
s(1 − · µ+s )
λ+s (3.67)
µ 1 λ 1
= · + ·
λ + µ s λ + µ s + (λ + µ)
From tables of Laplace transforms one can nd that
 
−1 1
L =1 (3.68)
s

37
and  
−1 1
L = eαt (3.69)
s−α
so when inverting A∗ (s) one gets

µ λ −(λ+µ)t
A(t) = ·1+ e (3.70)
λ+µ λ+µ

By this method, one is able to compute the time dependent availability of


the system. To nd the system availability, the expression above is inserted
into the structure function of the system.

3.7.1 Example
The method above is used for the example system shown in gure 3.1 with
data as shown in table 3.1. Table 3.12 shows the availability functions for
the components in the example.

Component λ µ λ+µ a(t)


1 0.002 0.33 0.332 0.99398 + 0.006041e−0.332t
2 0.00167 0.143 0.14467 0.98846 + 0.011544e−0.14467t
3 0.0025 0.25 0.2525 0.99010 + 0.0099010e−0.2525t
4 0.00143 0.5 0.50143 0.99715 + 0.0028518e−0.50143t
5 0.00167 0.11 0.11167 0.98505 + 0.014955e−0.11167t
6 0.002 0.143 0.145 0.98621 + 0.013793e−0.145t
7 0.00167 0.25 0.25167 0.99336 + 0.0066357e−0.25167t

Table 3.12: The availability functions for the components in the example

Next, these functions must be inserted into the structure function to nd
the availability function for the whole system. However, it is probably easier
to insert the relevant time before the numbers are put into the structure
function, as one otherwise has a large amount of calculation to do. In this
case, it is wanted to calculate the availability at time t = 10000. When this
is inserted into the availability functions given in table 3.12, the results are

a1 (10000) = 0.99398
a2 (10000) = 0.98846
a3 (10000) = 0.99010
a4 (10000) = 0.99715 (3.71)
a5 (10000) = 0.98505
a6 (10000) = 0.98621
a7 (10000) = 0.99336

38
These numbers are of course approximate. It is likely that the availabilities
have converged or are close to converging to the limiting availability at time
t = 10000. When the numbers are inserted into the structure function of
the system, the resulting availability is

A(10000) = 0.97903 (3.72)

To compute the resulting production availability, the method described


in section 3.5 is applied. The resulting probability distribution for the
system is given in table 3.13

x P(X = x)
0 0.020969
40 1.17072 · 10−6
50 0.0098786
60 0.0099510
90 0.0077358
100 0.95147

Table 3.13: Probability distribution for the ow through the system calcu-
lated with a renewal process.

The production availability estimated from these numbers are:

40 · 1.17072 · 10−6 + 50 · 0.0098786 + 60 · 0.009951 + 90 · 0.0077358 + 100 · 0.95147


= 96.9343
(3.73)

To compute condence intervals for the availability, the variance is es-


timated by equation (2.11). The derivatives

∂A
(3.74)
∂ai

are as shown in equation (3.47). The derivatives

∂ai
(3.75)
∂αi
where αi is an input parameter, are not the same. The derivatives in this
case are
 
∂ai µi 1 − tλi λi
=− + − e−(λi +µi )t
∂λi (λi + µi )2 λi + µi (λi + µi )2
  (3.76)
∂ai 1 µi λi tλi −(λi +µi )t
= − − + e
∂µi λi + µi (λi + µi )2 (λi + µi )2 λi + µi

39
These derivatives are to be evaluated in the expected values of the pa-
rameters µi and λi . t is considered to be a constant, and is set to the value
t = 10000 as this is the value used in the calculations of the availability.
New values must also be calculated for the derivatives in equation (3.47).
The derivatives evaluated at the expected values are given in tables 3.14
and 3.15

Component
h i h i h i
∂ai ∂ai ∂A
∂λi X ∂µi X ∂ai X
0 0 0
1 −2, 9939033 0, 0181449 0, 98495948
2 −6, 8324911 0, 0797920 0, 0000276233
3 −3, 9211842 0, 0392118 0, 0000321993
4 −1, 9886089 0, 0056874 0, 0001118503
5 −8, 8210372 0, 1339194 0, 9938886620
6 −6, 8014269 0, 0951249 0, 0065013547
7 −3, 9470907 0, 0263666 0, 0135020604

Table 3.14: The derivatives evaluated at the expected value of the param-
eters.

Component
h i h i
∂A ∂A
∂λi ∂µi
1 −2.9489 0.017872
2 −1.8873 · 10−4 2.2041 · 10−6
3 −1.2626 · 10−4 1.2626 · 10−6
4 −2.2242 · 10−4 6.3614 · 10−7
5 −8.7671 0.13310
6 −0.044219 6.1845 · 10−4
7 −0.053294 3.5601 · 10−4

Table 3.15: The derivatives evaluated at the expected values

The variance is calculated to be


n n
∂y 2 ∂y 2
X   X  
Var(y) = Var(αi ) + Var(βi ) (3.77)
∂αi X 0 ∂βi X 0
i=1 i=1

when numerical values are inserted, the resulting variance is Var(y) =


1601.02 and the standard deviation is σ = 40.01. The condence inter-
val becomes very large due to the large derivatives, so large that it has no
meaning. This is due to the values of some of the derivatives being quite
large, as can be seen in table 3.15. In addition, the variances of the input
parameters are not very small. The results show that it is very important
to have as small variance as possible in the input parameters. It also shows
that this method may not be the best for this kind of availability calculation.

40
3.8 Quasi-renewal processes
For quasi-renewal processes, results for availability of several models are
shown in [15].
The model most similar to the case in question is the model described
in [15, Section 4.2.3]. The assumptions made for this model are:

• A unit starts working at time 0

• Upon failure it is imperfectly repaired, in the sense that the next time
to failure will be a fraction α of the previous time to failure

• Repair time increases with a factor β for each failure

• After the k th failure, the system is imperfectly maintained in the sense


of the (p, q)-rule at times T , 2T ,...

• Failures between maintenance are imperfectly repaired in the sense


that the next time to failure will be a fraction δ of the previous time
to failure

• Repair time for failures between maintenance is assumed to be negli-


gible

Of course, quasi-renewal processes are not meaningful with an exponen-


tial distribution as the exponential distribution is memoryless.
In [15] it is shown that for the above model, the asymptotic average
availability is given by
µ(1−αk−1 )
1−α + Tp
A(T, k; α, β, λ, p) = k−1 ) (3.78)
1−αk−1
1−α + η(1−β
1−β + Tp +w

Where µ is expected life time and η is expected repair time of a new unit.
If maintenance is not taken into account, the availability is given by
µ(1−αk )
1−α
A(k; α, β) = µ(1−αk ) η(1−β k )
(3.79)
1−α + 1−β

This is found by using the principles for renewal-reward which can be found
in [11].
According to [14], the assumptions for this formula are that the planning
horizon is innite, the failure rate is continuous, monotonously increasing
and dierentiable, the cumulative distribution of the rst time to failure
is absolutely continuous, F (0) = 0 and the unit begins to operate at time
0. These assumptions are true for the Weibull distribution with shape
parameter 1.5.

41
For a such a distribution, the probability distribution is
1.5
f (t) = 1.5λ1.5 t0.5 e−(λt) (3.80)

and the cumulative distribution is


1.5
F (t) = 1 − e−(λt) (3.81)

It is easy to see that F (0) = 0, and it is innitely dierentiable as et


is innitely dierentiable. The failure rate for the Weibull distribution is
given as
αλα tα−1 = 1.5λ1.5 t0.5 (3.82)
which is a monotonously increasing function.

3.9 Example
For a Weibull distribution with shape parameter 1.5, the probability distri-
bution is
1.5
f (t) = 1.5λ1.5 t0.5 e−(λt) (3.83)
and the cumulative distribution is
1.5
F (t) = 1 − e−(λt) (3.84)

It is easy to see that F (0) = 0, and F (t) is innitely dierentiable as


et is innitely dierentiable. The failure rate for the Weibull distribution is
given as
αλα tα−1 = 1.5λ1.5 t0.5 (3.85)
which is a monotonously increasing function. The Weibull distribution ac-
cordingly satises the assumptions for the model above. The assumptions
regarding the component, that the planning horizon is innite and the com-
ponents start working at time 0, can easily be assumed. An innite planning
horizon does not fully comply with the other examples, but as steady state
availability is used with the other calculation methods, except for the sim-
ulation methods, it should not be too dierent. The problem is that the
availability becomes a function of the number of failures, not of time. In the
other examples, the availability has been calculated for t = 10000. Instead,
µ+λ will be used.
k = 10000
The data for the example is given in table 3.1. In addition, the factors
α and β are needed. I will let α = 0.97 and β = 1.02. By putting the
numbers into equation (3.79). The results are
The structure function for the system is

X = x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − (1 − x6 )(1 − x7 )) (3.86)

42
Component k Availability
No 1 20 0.99050
No 2 17 0.98297
No 3 25 0.98229
No 4 15 0.99597
No 5 17 0.97820
No 6 20 0.97812
No 7 17 0.99019

Table 3.16: Availability for the component computed by a quasi-renewal


process

And when the numbers for the availability is put into this, the resulting
availability for the system is

Asystem = 0.968698 (3.87)

To calculate the production availability, the method for going from


known component availability to production availability is used. First the
availability for dierent congurations of the subsystems is used. The re-
sults are shown in table 3.17

Subsystem Conguration Availability


1 1oo1 0.99050
2 1oo3 0.999999
2 2oo3 0.999984
2 3oo3 0.96167
3 1oo1 0.97820
4 1oo2 0.99979
4 2oo2 0.96852

Table 3.17: The availability for the subsystems in dierent congurations

Next, the ratio between the availabilities for the dierent possibilities
for on line components when not all streams are working. For the second
subsystem, the possibilities are when one stream are working

a2
P (comp 2 working) = = 0.331947
a2 + a3 + a4
a3
P (comp 3 working) = = 0.331717 (3.88)
a2 + a3 + a4
a4
P (comp 4 working) = = 0.336337
a2 + a3 + a4

43
Subsystem Streams available Availability
1 1 0.99050
2 1 1.5 · 10−5
2 2 3.8314 · 10−2
2 3 0.96167
3 1 0.97820
4 1 3.127 · 10−2
4 2 0.96852

Table 3.18: The availability of the dierent number of streams for the sub-
systems.

When two streams are working, the pos


a2 a3
P (comp 2 and 3 working) = = 0.330344
a2 a3 + a3 a4 + a2 a4
a2 a4
P (comp 2 and 4 working) = = 0.334944 (3.89)
a2 a3 + a3 a4 + a2 a4
a3 a4
P (comp 3 and 4 working) = = 0.334712
a2 a3 + a3 a4 + a2 a4

For the forth subsystem, the ratios are as follows when one stream is
working:
a6
P (comp 6 working) = = 0.496934
a6 + a7
a7 (3.90)
P (comp 7 working) = = 0.503066
a6 + a7

From these numbers, the probability distributions for the dierent amounts
of ow through the subsystems can be set up. For subsystems 1 and 3, the
distributions are trivial, as they only have two possible states.
For subsystem number 2, the probability distribution is given in table
3.19, and for subsystem number 4, it is given in table 3.20. The merging
rules described in chapter 3.3 are then used to obtain the results for the
whole system. The resulting probability distribution for ow is given in
table 3.21.
The expected production availability is then

40·4.8234·10−6 +50·0.015060+60·0.015246+90·0.012043+100·0.92635 = 95.3868


(3.91)
The variance due to uncertainty in the input parameters are calculated
as in the other instances. Again, the derivatives of the component availabili-
ties with respect to the input parameters are what needs to be recalculated.
The input parameters of concern here are λi and µi . The derivatives of

44
x P (X = x)
0 1 · 10−6
40 4.9792 · 10−6
50 5.0451 · 10−6
60 4.9758 · 10−6
90 1.2833 · 10−2
100 1.2657 · 10−2
110 1.2824 · 10−2
150 0.96167

Table 3.19: The probability distribution of the ow through the second
subsystem.

x P (X = x)
0 2.1 · 10−4
50 1.5539 · 10−2
60 1.5731 · 10−2
110 0.96852

Table 3.20: The probability distribution of the ow through the fourth
subsystem.

x P (X = x)
0 0.031297
40 4.8234 · 10−6
50 0.015060
60 0.015246
90 0.012043
100 0.92635

Table 3.21: The probability distribution for the ow through the system

45
interest are
1−αk λi (1−αk )2
∂ai 1−α (1−α)2
= λi (1−αk ) k) − 2 (3.92)
∂λi
1−α + µi (1−β
1−β
λi (1−αk )
+ µi (1−β k )
1−α 1−β

and   
1−αk 1−β k
∂ai λi 1−α 1−β
= − 2 (3.93)
∂µi λi (1−αk ) µi (1−β k )
1−α + 1−β

The results when numbers are inserted are shown in tables 3.22 and 3.23

Component
h i h i h i
∂ai ∂ai ∂A
∂λi X ∂µi X ∂ai X
0 0 0
1 1, 8822521 −0, 0114076 0, 9779888477
2 4, 6346487 −0, 0541249 0, 0000691373
3 2, 1944249 −0, 0219442 0, 0000664827
4 1, 4081286 −0, 0040272 0, 0002921609
5 5, 9969422 −0, 0910445 0, 9902861927
6 4, 3010465 −0, 0601545 0, 0095049671
7 2, 6688100 −0, 0178277 0, 0211996616

Table 3.22: The derivatives evaluated at the expected values

Component
h i h i
∂A ∂A
∂λi ∂µi
1 1.8409 −0.011157
2 4.2042 · 10−4 −3.7420 · 10−6
3 1.4589 · 10−4 −1.4589 · 10−6
4 4.1140 · 10−4 −1.1766 · 10−6
5 5.9387 −0.090161
6 0.040881 −5.7177 · 10−4
7 0.056579 −3.7795 · 10−4

Table 3.23: The derivatives evaluated at the expected values

The variance is given by the equation


n  2 n  2
X ∂y X ∂y
Var(y) = Var(αi ) + Var(βi ) (3.94)
∂αi X0 ∂βi X0
i=1 i=1

The variances are given in table 3.11. When numerical values are in-
serted, the variance becomes Var(y) = 719.64 and the standard deviation is
σ = 26.83. Also in this case the variance is large, resulting in a condence

46
interval of small interest, as it will include all possible values for the avail-
ability. Again, one must conclude that this method may not be preferrable
if other options are available.

47
48
Chapter 4

Software programs for


availability calculations
4.1 The Relex software
The Relex program consists of a number of dierent modules, made for cal-
culations on reliability block diagrams (RBD), fault trees, doing FMECAs
and a range of other methods. The modules of interest in this thesis are the
reliability block module and the OpSim module. Both of these incorporates
methods for calculating reliability, availability and related features for a
system. In addition, the OpSim module contains methods for optimization
of maintenance.

4.1.1 The principles of Relex


Each of the modules in Relex contains a dierent method of implementing
the system. The components and data entered can then be accessed through
the system le. Dierent implementations of the same system can be linked
to avoid having to enter data several times. When a system is implemented,
the model can be exported in several le formats, including text and Excel.
When doing calculations on a reliability block diagram (RBD), Relex
uses a system to evaluate whether analytical solutions can be used or if
simulation is required. If simulation is to be used, it uses Monte Carlo sim-
ulation. The program uses analytical calculations whenever possible. This
is reasonable when one assumes that analytical calculations usually gives
more accurate answers than simulation. Relex applies analytical methods
unless they are unfeasible or the answer will be likely to be aected by
round-o errors due to the complexity of the model, [1]. It is however
possible to force the software to do Monte Carlo simulation.
According to the help section of the program, simulation must be used
for reliability calculations if components in the RBD are repairable. For

49
availability calculations, at least some measures may be calculated analyti-
cally also for repairable systems, as the example later in this chapter shows.
Due to the constraints on analytical methods, they will be used relatively
seldom.
When simulation is used, it is possible to calculate with repair resources,
costs, switch mechanisms, conditional repair policies and other factors com-
mon in a real world system.
For the purposes of this thesis, the systems will be modeled as relia-
bility block diagrams. This modeling methodology focuses on success of
the system, that is, there being an operational path through the system.
The method generally does not consider the ow through the system. In
Relex it is possible to assign capacities to the blocks of the RBD, and get
an estimate of the amount of ow through the system at dierent times.
Relex computes the dierent parameters of interest at a predetermined
number of points during the interval of the calculation. The number of
points is chosen by the user. This makes it possible to obtain results for the
development of the system during the simulated time period.

4.1.2 Implementation of models in Relex RBD or OpSim


Models can be implemented directly as RBD diagrams, or one may enter the
data into what is called the system sheet and later import the data to the
RBD. All kinds of RBD structures can be implemented. Each component in
the system must be modeled as a block in the diagram, and all data must be
given per block. Redundancy can be modeled very easily if the redundant
components are identical, by indicating the redundancy in the data for the
block. If dierent components are in parallel, they must be implemented as
separate blocks. The redundancy is then entered into the links between the
blocks.
For each block one can specify failure and repair distributions, supported
distributions are listed in table 4.1. If repair is to be considered in the cal-
culations, this must be specied both when entering data for the component
and when preparing calculations. A component can be repairable or replace-
able or, if OpSim is in use, both. Needed repair recourses can be assigned
and cost can be specied for the repair. Maintenance can also be specied,
with many alternatives for how it is performed, for example one can choose
whether a broken component is xed on site or transported to a repair shop,
and where spare parts are stored. Cost and time for transportation of parts
can be accounted for.
It is in addition possible simulate what happens to components that are
replaced or must be repaired; it is possible to specify repair on site or in
a separate repair shop, with transportation time and cost of repairs. One
can dene both repair and replacement for every component, and imperfect

50
repair can be modeled by the (p, q)-rule.
Generally, very complex maintenance strategies can be modeled, and
one may come quite close to the strategies for a real life system.
Relex OpSim also contains the opportunity of optimizing the number
of spare parts available and maintenance intervals with respect to cost or
downtime of the system. If optimization is done, achieved availability can
be calculated.

Distribution
Exponential
Lognormal
Normal
Rayleigh
Time independent
Weibull
Uniform
Constant time

Table 4.1: Probability distributions supported for modeling in Relex RBD


and OpSim

Capacities can be calculated when simulation is used. The capacity and


required capacity must be specied in each component. If the ow through
a component falls below the required level, the component is regarded as
failed. In addition, the type of network must be specied in the calculations
dialog. Two types of network are available, electrical network and ow net-
work. According to the help section of the program, the dierence consists
in how ow is distributed when there are components in parallel. In an
electrical network, 100% ow is assigned to each parallel, while in a ow
network, the ow is split so that if there are two parallel components, each
is assigned 50% of the ow. It is not stated whether the capacities of the
components in parallel are taken into consideration. If not, this can cause
problems if one component has less or more capacity than is assigned to the
parallel.

4.1.3 Calculations available in Relex RBD and OpSim


Calculations in Relex RBD are done either by analytical methods or by
Monte Carlo simulation. An algorithm is applied to determine whether
analytical methods are applicable, and if not simulation is used, see [2]. It
is also possible to force the program to do simulation by marking a box
in the calculation dialog. In the calculation dialog one determines which
calculations are to be done and assigns parameters for simulation. One
can choose how many system hours the calculation will consider and how

51
many points the calculation results will be shown for. For simulation one
decides how many replications are to be done and how many failures the
system must experience to reach the steady state. The random number seed
must also be specied. If two calculations are done with the same seed, the
simulation results will be identical.
It is not possible to specify sensitivity for the calculation, but the results
are shown with good sensitivity by default. Condence level for calculation
of condence intervals can be specied. Condence intervals can be calcu-
lated for reliability, availability, MTBF and MTTF.
If optimization is to be done, it is also specied in the calculation di-
alog. Optimization can be done for maintenance intervals and spare parts
with respect to cost or downtime. Costs can be specied extensively when
data is entered into the model, from cost of repairs, spare parts and other
resources to general downtime costs and costs for transporting and storing
components.

4.1.4 Availability calculations


According to the help section of the program, Relex applies the same de-
nition as is used in NORSOK Z-016 for availability. Mobilisation time for
resources is not accounted for in the availability measures.
Availability can be calculated in several ways in Relex OpSim. It can
be analytical or simulated, and one can choose availability and operational
availability, the dierence being that operational availability takes into ac-
count the logistic delay time. In addition to the time dependent availability,
Relex can calculate mean and steady state availability measures.
For a single unit with exponential failure time, the availability is calcu-
lated as
µ λ −(λ+µ)t
A(t) = + e (4.1)
λ+µ λ+µ
according to the help section of the program.
Mean availability is calculated according to the denition of interval
or mission availability given in equation (2.3). The integrals involved are
calculated by numerical integration. Steady state availability is calculated
as the limit of the availability as time goes to innity.
The analytical computation method is described in the help section
of the program, and is as follows: For a series system the availability is
computed as Y
A(t) = Ai (t) (4.2)
i

and for a parallel system as


Y
A(t) = 1 − (1 − Ai (t)) (4.3)
i

52
These are basically the same equations as the ones used when applying
the structure function to calculate availability, as described in chapter 3.1.
If the availability is calculated by simulation, the availability is calcu-
lated from the simulated failure and repair histories of the components.

4.1.5 Algorithm for Monte Carlo simulation


The simulation algorithm is presented in the help section of the program
on two levels. The general method is:

1. The system is simulated the given number of times

2. The total number of system successes is counted

3. Calculate the expected reliability as system successes divided by sys-


tem runs

4. Calculate other results

The individual simulation runs are done according to the following


method

1. Find initial state of system

2. Generate random numbers from the specied probability distributions


for each event that can occur in the system

3. Find next event

4. After every state change, repeat 1 and 2, with ages of components


updated

5. Continue until the mission time or required number of failures has


been reached

6. Find the vector of system state changes, use it to calculate the required
measures for each simulation

It is also stated that the OpSim package performs simulation in a some-


what dierent manner, as it simulates the system only one time for each
simulation iteration, until the required number of system failures to reach
steady state has occur
Inverse transformation is used for drawing random numbers from proba-
bility distributions. For the normal and lognormal distributions that is not
possible, but they can easily be drawn from in other manners, especially the
Box-Cox algorithm. It is not stated whether this is the method employed
by Relex.

53
4.1.6 Method for calculating steady state availability, fail-
ure frequency and MTBF
This method is described in [7], and in [8] it is stated that the method has
been implemented in Relex. According to [7], the method applies to general
failure distributions, and can be used with implicit decomposition methods.
The algorithm is as follows, when φi denotes the failure rate and ψi denotes
the repair rate of component i:

• Decompose the system into modules

• Find conditional failure and repair intensities for all components.


They are always given by λci = φ1i and µci = ψ1i

• Find the steady state availability of the components. It is given by


Ai = φiφ+ψ
i
i

• Find the steady state failure frequencies given by νi = 1


φi +ψi

• Combine the found availabilities and failure frequencies to nd the


availability

• Continue until the top level is reached. Then, the MTBF is given by
MTBF = ν1 .

4.1.7 Capacity calculations


The capacity calculation is not the main focus of the Relex RBD and OpSim
packages. The ow is calculated by recursively using the rules that for a se-
ries system, the capacity is the minimum of the capacity of the components,
and for a parallel system, it is the sum of the capacity of the components.
However, the algorithm might not be able to take into account ramp-up,
variation in amount of ow, and so on. Relex does not include any method
for maximizing the ow through the system.

4.1.8 Computation of condence intervals


Condence intervals are only computed for simulated results. For avail-
ability, the method of computing condence intervals is somewhat unclear.
According to the help section of the program, the computations are done
as follows:
Let n be the number of simulation runs, and let ns be the number of
successful runs, that is, runs in which the system did not fail. The relevant
probability, p, is then approximated by p̂ = nns , and the upper and lower
condence limits are found by using the binomial equation, or by using the
normal approximation to the normal equation. This method can be used

54
for reliability and availability if repair is not taken into account. Otherwise,
the condence interval for availability assumingly is computed in some other
manner.
It is important to remember that Relex considers the availability of the
system as a whole, not of the components of the system. Downtime comes
only when a cut set of the system has failed, and ends as soon as the system
is up again.
For MTTF and MTBF, the condence intervals are computed by esti-
mating the mean and variance of the time to failure of the system from the
times computed at each simulation.

4.1.9 Comparison of Relex and other reliability softwares


The article [5] compares the performance of the Relex RBD module with two
other software packages designed for reliability evaluation, namely ARINC
Raptor and Reliasoft BlockSim. These softwares are not a part of this
project, but it was interesting to note that the results obtained from Relex,
especially for availability, diered from the results obtained with the other
software packages. In the article, four dierent models are implemented and
simulated in each of the softwares. It seems the dierences in the results
grows larger as the model complexity increases.
The article does not aim to range the softwares, only to point out dif-
ferences and caution the users of reliability softwares to the fact that it
is very important to know the assumptions and methodologies applied by
the software in use. It is possible that the dierences in results between
Relex and the two other softwares stem from the fact that Relex employs
analytical methods as far as possible.

4.1.10 The example in Relex


The same example as was used before was implemented in the OpSim pack-
age. Figure 4.1 shows the diagram as it is rendered in Relex. Exponential
failure and repair time was assumed, with data as shown in table 3.1. No
costs, required resources or maintenance was implemented.
When left to decide for itself, the software used analytical methods for
computing the availability. Then no condence intervals were computed.
The calculations were done for 10000 system hours, and shown at 11 points.
When simulation was forced, 10000 iterations were done, and 50 failures
were required for the system to reach steady state.
When analytical methods were used, the results were as shown in table
4.2
And when the same model was simulated, the results became as shown
in table 4.3

55
Figure 4.1: The model as shown when implemented in Relex

Measure Value
MTTF 266.772885
Steady state availability 0.979256
Availability (at time 10000) 0.979256
Mean availability (at time 10000) 0.979271
Failure frequency (at time 10000) 3626.067796
Capacity (at time 10000) 95.94
Mean capacity 96.140747

Table 4.2: The results obtained with analytical calculations performed with
Relex OpSim

Measure Value
MTTF 275.451966
Steady state availability 0.979206
Availability (at time 10000) 0.9808
Availability LCL 0.97811
Availability UCL 0.98349
Mean availability (at time 10000) 0.979207
Failure frequency (at time 10000) 3636.1
Capacity (at time 10000) 96.328333
Mean capacity 96.140972

Table 4.3: The results obtained with Monte Carlo simulation performed
with Relex OpSim

56
As can be seen, the steady state availability is very similar, and the
analytically calculated availability at time 10000 lies inside the condence
interval of the simulated availability. The mean availability is also very
similar, as is the mean capacity. The MTTF and failure frequencies are
however somewhat dierent.
To be able to estimate the production availability of the system, calcu-
lations were done with dierent congurations of the parallel subsystems.
For the above calculations, the congurations were respectively 1oo3 and
1oo2. In table 4.4, all the dierent results for the steady state and mean
availability are summed up. These numbers can be used to compute the
dierence in availability for dierent congurations, and this can again be
used to calculate an approximation of the production availability.

1st parallel 2nd parallel Method Mean Steady state


1oo3 1oo2 Analytical 0.979271 0.979256
1oo3 1oo2 Simulation 0.979207 0.979206
1oo3 2oo2 Analytical 0.959454 0.959428
1oo3 2oo2 Simulation 0.959487 0.959405
2oo3 1oo2 Analytical 0.9791 0.979085
2oo3 1oo2 Simulation 0.979128 0.979079
3oo3 1oo2 Analytical 0.955676 0.955649
3oo3 1oo2 Simulation 0.955714 0.95569

Table 4.4: Mean and steady state availability for dierent congurations of
the parallel subsystems

To approximate the production availability, one must know how many


streams are available. As described in chapter 3.5, the dierent possible
congurations of the parallels in the systems give numbers for dierent
number of streams. With conguration 3oo3, all streams in the parallel
must be online for the system to be available. This availability gives the
availability of 3 streams. Conguration 2oo3 gives the availability of 2 or
3 streams. By subtracting the availability of 3 streams, the availability of
exactly 2 streams is found. In the same manner, the availability of exactly
1 stream is found by subtracting the availability with conguration 2oo3
from the availability with conguration 1oo3. The same can be done for
the fourth subsystem. The dierences in steady state availability for the
rst subsystem are, when the simulated results are used:

A1oo3 − A2oo3 = 1.27 · 10−4


(4.4)
A2oo3 − A3oo3 = 0.023389

Second subsystem:

A1oo2 − A2oo2 = 0.019801 (4.5)

57
In the current situation, there are no calculations for the subsystems
alone, only for the system as a whole. For all calculations for the second
subsystem, the conguration for the fourth subsystem is set to be 1oo2.
Thus all the time the system is available and the current conguration of
the second subsystem is available is accounted for. As the subsystems are
connected in series, the time when not all subsystems are online is not in-
teresting. To nd the time when only one stream is online in the fourth
subsystems for each of the conguration of the second subsystem, it is as-
sumed that the percentage of time with only one stream available is the
same independent of the conguration of the second subsystem. As the
components are assumed to be independent, this is a logical assumption.
The percentage of time with only one stream available in the second sub-
system is

A1oo2 − A2oo2 0.019801


× 100% = × 100% = 2.02215% (4.6)
A1oo2 0.979206
Thus we have availabilities listed in table 4.5

2nd subsystem 4th subsystem Availability Capacity


3 2 0.936365 100
3 1 0.019325 55
2 2 0.022916 100
2 1 4.7296 · 10−4 55
1 2 1.2443 · 10−4 50
1 1 2.5681 · 10−6 50

Table 4.5: The availability of dierent number of streams for the subsys-
tems, with the average capacity for the relevant number of streams.

There are no calculations available for the availability of the individual


components. If it is assumed that the average amount is let through when
less than all the components are online, for the rst subsystem we get that
the average when only one stream is working is
50 + 60 + 40
= 50 (4.7)
3
and when two streams are working the average is
110 + 100 + 90
= 100 (4.8)
3
For the second parallel, the average amount of ow if only one stream is
working is
50 + 60
= 55 (4.9)
2

58
The capacities for the dierent combinations of available streams are shown
in table 4.5. The production availability can from these numbers be esti-
mated to
100·(0.936365 + 0.022916) + 55 · (0.019325 + 4.7296 · 10−4 ) + 50 · (1.2443 · 10−4 + 2.5681 · 10−6 )
= 97.0233
(4.10)

4.2 Description of the reliability software MIRIAM


Regina
4.2.1 General principles
MIRIAM Regina is primarily developed for the calculation of production
availability. The software combines a ow algorithm with Monte Carlo
simulation, and makes it possible to simulate highly complex ow networks.
The description of the software is based on [3].
A system to be simulated in MIRIAM is entered as a ow network with
system stages and items. Each system stage can consist of several items.
Flow is dened in the system stages, and reliability parameters in the items.
Maintenance stragegies can be simulated. Spares and repair crews can
be dened and limited. Mobilization time for resources is also a parameter.
It is however not possible to model a system where sometimes a repair is
performed and sometimes the broken component is replaced with a new
component. One can model imperfect repair with the (p, q)-rule.

4.2.2 Implementation of a system in MIRIAM Regina


A system is entered as a ow network consisting of system stages and items,
and data on maintenance and failure are entered for these. In addition, it
is possible to enter system wide objectives. Quality constraints for the
ow can also be modeled. There are many possibilities for modeling a ow
which varies during the lifetime of the system. Both calendar proles, which
give the ow during the simulated time, and seasonal proles, which give a
variation in ow over a given period of time to be repeated throughout the
simulation, can be implemented. In addition, for each system stage, ramp
up can be modeled, that is, it can be modeled that after down time, the
ow through the system stage must increase gradualy.
It is possible to have several input and discharge points in the network.
Calendar and seasonal proles can be dened for the ow. It is important
to note that the reference level for the production availability is dened
by the required ow throuth the discharge points, and if the capacities
throughout the network are unbalanced so that the discharge points do
not recieve enough ow even when all components are working perfectly,

59
the production availability will be less than 100% even if the system was
working perfectly during the whole simulation.
When implementing a network in MIRIAM Regina, one rst denes all
system stages. Each system stage can contain several items, but the ow
capacity is dened in the system stage. Failure and repair data is dened
in the items. A component in a system to be modeled will typically be
represented by an item, while a system stage will be a collection of related
components with the same ow capacity.
It is also possible to dene storage tanks in the system. These can
be dened to compensate for lack of upstream ow during downtime. If
a system has several discharge points, they can be dened to compensate
for each other at times when some discharge points achieve less than the
demanded ow.
As the ow through the system is the central object of modeling, there
are many opportunities for specifying properties of the ow. Several ows
can go through the same system. Flows can be split in system stages so that
two ows go out where only one went in (for example, in a seperator gas
and oil will be seperated into two dierent ows, and this can be modeled).
Quality constraints can also be dened. Proportional ow rules makes it
possible to dene relative amounts of ow through parallel stages.
Maintenance can be dened for the system. Several maintenance pro-
grams can be dened, and system stages choosen to be members of the
program.
For repair and maintenance, spare parts and other required resources can
be dened, with mobilization time for the resources. The number of avail-
able resources can be limited. Maintenance and repair can be implemented
to take place immediately or be postponed until a zero ow opportunity. A
maximal deferal time must then be stated. However, MIRIAM has no way
of determining the length of the zero ow opportunities in advance, so this
may also cause downtime for the streams.
It is also possible to dene objectives for the model. These are not ab-
solute requirements, but goals to be achieved if possible. For example, it
can be an objective that one specic discharge point should always recieve
as much ow as required. The objectives are dened in a prioritized list. If
objectives conict, the objective highest on the list is prioritized. Another
modeling feature is simulation rules, which can be used to model dependen-
cies. For example, it can be a rule that faliure of one item causes failure of
other items. If rules conict during simulation, this may cause unwanted
results, see [3].
When modeling a system in MIRIAM Regina, it is important to keep
the system as simple as possible to keep the simulation time down. Every
time a change happens in the system, the ow algorithm is called, and if
there are many limitations and objectives in the system, the ow algorithm

60
will be time consuming.
A useful feature in MIRIAM is the possibility of modeling auxilary sys-
tems, that is, systems which are not a part of the ow network, but on which
items in the ow network depend. Examples can be electric system or water
distribution. Items can be modeled to fail or have degraded operation when
the required amount is not produced by the auxilary system.

4.2.3 The simulation algorithm


MIRIAM Regina uses next event Monte Carlo simulation. The principle
is that rst, time to failure is drawn for all components in the system.
These are then ordered. When an event occures, the time to repair for the
component is drawn, and so on. Upon each event, the ow algorithm is
called to calculate the ow through the system. An event can in addition
to a failure or a repair be maintenance, alteration of ow due to step up or
variation of the prole ow entered into the system.

4.2.4 Calculation of production availability


The production availability of a model simulated in MIRIAM is calculated
by recording the ow through the system during the simulation and com-
paring it to the ow gotten if the system had been fully operational at all
times. It is important to note that this value is calculated from the ow
proscribed to be wanted throuth each discharge point. If the value does not
correspond to the ow it is possible to get through the network, the produc-
tion availability will be smaller than 1 even if the system operates perfectly
at all times. This is a point to be noted during implementation of a system.
In all cases the accuracy of the result will depent on the implementation of
the system being according to the system caracteristics.

4.2.5 Available calculations


The main calculation done by MIRIAM is the calculation of production
availability. In addition to the availatility for the entire system, the avail-
ability for each boundary point and each ow type is given. The results for
each simulation are available, and the averages. Condence intervals are
not computed automatically, but can easily be estimated from the results
for the simulation runs.
It is important to note that if buer storage units are incorporated into
the model, the production availability computed by MIRIAM Regina is not
strictly calculated according to the denition given in NORSOK Z-016, as
this denition does not include buer storage, see [3].
MIRIAM also computes on stream availability, deliverability and de-
mand availability, where the reference level is the demanded amount of ow

61
through the discharge points. The amount of time when more than 0%,
20%, 40%, 60%, 80% and 100% of system ow is available are also com-
puted automatically for each boundary point. From these, % availability
can easily be computed. Deliverability is computed in accordance with the
denition in NORSOK Z-016, given that all possibilities for substitution
from others and buer storage are modeled. Otherwise, these aspects of de-
liverability must be considered separately for the measure to comply with
the denition, see [3].
On stream availability is computed as the percentage of time the ow
through the system is larger than 0. This measure can be compared to the
availability calculations done with other methods.
In addition to the availability measures, it is possible to obtain numbers
for most of the features modeled in the system, such as amount of spare parts
used and to which extent rules have been fullled. A blaming algorithm is
also available. It assigns blame for downtime to the dierent components.
However, it can not fully be relied on as in some cases, the wrong component
may be blamed for the downtime, see [3].

4.2.6 Uncertainty measures


MIRIAM Regina computes the standard deviation for the deliverability. As
mentioned above, the deliverability will in many cases be identical to the
production availability. To compute condence intervals, this measure can
be used. It is also possible to chose how many decimals shall be used in
calculations, the default being 3. The results for each boundary point and
each stream for every simulation iteration are available in MIRIAM, and
condence intervals for some other measures may be computed from these.
The main limitations of the software is in the time required for running
simulations. This puts limits on many features of modeling, which would
otherwise make it possible to simulate a model quite close to the real world
system. For large systems, the assumptions must be extencively simplied
and the number of simulations limited, and this can limit the accuracy of
the results.
It is in MIRIAM possible to simulate some dependence between com-
ponents, for example by letting failure in one component cause failure of
others. Thus it is not always necessary to assume that components are
independent, as it is in many other models. However, this posibility is lim-
ited as many such rules will make the simulation time excessive. When
implementing a model, one must choose carefully which dependencies are
important for the operation of the system.

62
4.2.7 Implementation of the example in MIRIAM Regina
The example used in previous sections was implemented in MIRIAM Regina.
Figure 4.2 shows the ow diagram as it is shown in MIRIAM.

comp2

comp6

inn comp1 comp3 comp5 ut

comp7

comp4

Figure 4.2: The model as shown when implemented in MIRIAM Regina

Data was as shown in table 3.1, and exponential distribution was as-
sumed both for the failure and repair time.
Simulation was done over 10000 system hours. 1000 iterations were
performed. The results are shown in tables 4.6 and 4.7.

Measure Value
Production availability 96.95%
On stream availability 97.90%
Deliverability 96.95%
Demand availability 94.99%

Table 4.6: The results from the simulation in MIRIAM Regina

Percent ow Percent time


0 2.10
0 − 20 0
20 − 40 0
40 − 60 0.65
60 − 80 1.32
80 − 100 0.93
100 94.99
> 100 0

Table 4.7: Percentage availability calculated by MIRIAM Regina

For the deliverability, the standard deviation is calculated automatically.


In this case, it was σ = 0.59%. For this model, deliverability is identical

63
to production availability, as neither storage units nor substitution is mod-
eled. For the production availability, the variance can also be calculated
from the results from each simulation, by applying the standard formula for
estimating variance,
n
1 X
Var(y) = (y − ȳ)2 (4.11)
n−1
i=1

The resulting estimated variance is

Var(y) = 0.000034314 (4.12)

which gives the standard deviation σ̂ = 0.0058578. Rounded to three sig-


nicant ciphers, this is the same as was calculated automatically for deliv-
erability.
A condence interval for the production availability can be calculated
based on the central limit theorem, as described in section 2.2.2. For the
production availability, a 95% condence interval is
σ σ
[ȳ − 1.96 · √ , ȳ + 1.96 · √ ]
n n
0.0058578 0.0058578 (4.13)
= [0.96947 − 1.96 · √ , 0.96947 + 1.96 · √ ]
1000 1000
= [0.96912, 0.96982]

64
Chapter 5

Comparison of methods and


results
5.1 Comparison of Relex and MIRIAM Regina
MIRIAM Regina and Relex are very dierent softwares. While Relex fo-
cuses on system failure or success, MIRIAM focuses on the ow through
the system. While MIRIAM focuses almost exclusively on the production
availability of the system, Relex provides a number of dierent results, in-
cluding availability and an estimate of capacity. Which software is best
depends entirely on the needs of the user. If production availability and
the ow through the system is what is in focus, MIRIAM would be the
best choice. Are other results wanted, Relex provides more opportunities
for modeling and calculating dierent measures of the system.
It is not straightforward to compare the results of the softwares, as they
focus on dierent aspects of a system. It will however be possible to compare
the amount of time with no ow through the system in MIRIAM with the
availability calculated in Relex.
Both softwares contain possibilities for including maintenance and spare
parts in the models. An advantage in Relex is the possibility for optimizing
maintenance intervals and spare part stocks.
Relex calculates a series of dierent statistics for the model: The MTBF,
MTTF, availability and reliability and gives condence intervals for all
these. MIRIAM only gives the standard deviation for the production avail-
ability.

5.1.1 Implementation of models


Implementation of models is a simple task in both softwares, provided that
the system is well understood in advance of implementation.
The principles of implementation are somewhat dierent. In MIRIAM,

65
where the focus is on system ow, the system is modeled in two stages.
First the ow network is implemented. The ow network consists of system
stages in which the types of ow and capacity is decided. Each system stage
can then have a number of items, which are the components of the system.
Each item has separate failure and repair data. Maintenance is however
implemented for system stages.
In Relex, each component must be modeled as a separate item in the sys-
tem. All data must be entered for each component, including maintenance
data.

5.1.2 Advantages and disadvantages of the two softwares


MIRIAM Regina provides a good software for modeling a ow network and
computation of production availability and related measures. It is a large
advantage that the results from each simulation can be obtained. The sim-
ulation methodology is straight forward and quite easy to understand. The
ow algorithm is the basis for accurate calculation of the dierent measures,
and a large part of the advantage of MIRIAM Regina. It does however make
simulation time consuming, and limits the number of iterations feasible for
large systems. The advantage of MIRIAM is that the computations made
are intuitive and easily understood, and of course the complex ow algo-
rithm which makes it possible to achieve accurate results for highly complex
models.
Relex Architect makes it possible to compute a large number of dierent
measures for the system. It is possible to compute the capacity of the
system, but the algorithm is not as advanced as that of MIRIAM. The
advantage of Relex is the large number of possible computations, and the
possibility of adding data about cost to the model. Simulation is somewhat
faster in Relex than in MIRIAM.
In both MIRIAM and Relex, it is possible to implement advanced re-
pair and maintenance strategies for the model. Relex does contain some
possibilities in this eld which are not present in MIRIAM. For example,
in Relex it is possible to dene data for both repair and replacement of a
component, and a replaced component can be repaired and become a spare
part.
In Relex, it is also possible to do optimization of maintenance intervals
or based on cost or system downtime. This is not possible in MIRIAM.
Both MIRIAM Regina and Relex Architect are good and reliable soft-
wares for their uses. MIRIAM has a somewhat more limited eld of ap-
plicability, but is better than Relex at those tasks it is made for. Relex
gives a wider range of options for computation, and gives the possibility of
applying analytical computations when feasible.

66
5.1.3 Uncertainty in the computations
When simulation is used, the uncertainty in both MIRIAM and Relex will
primarily depend on the number of simulations done, in addition to the
simplications made when implementing the model. One simulation run
does however not necessarily mean the same in MIRIAM and in Relex. In
MIRIAM, a simulation run consists of simulating the entire period given in
the simulation parameters, regardless of the performance of the system. In
Relex, it depends on which module is applied. In OpSim, the simulation
methodology resembles that of MIRIAM, in the RBD module, a simulation
iteration will end when the system as a whole has failed, according to the
help section of the program [2]. The computation of condence intervals in
Relex is not well documented except in a few special cases. It must be as-
sumed that valid statistical methods are employed also when documentation
is lacking.
In Relex, the output consists only of the end results with condence
intervals, but it is possible to have results computed at many points of the
simulated time. In MIRIAM, one can obtain some results from each simula-
tion iteration, but only mean measures over the entire simulated time. Stan-
dard deviation is computed for the deliverability, but condence intervals
must be calculated from this. One can also take advantage of availability
of the results from each simulation iteration to compute variance.
Relex also provides the possibility of applying analytical computations
when the system makes this possible. This is an advantage, as analytical
computations do not depend on the number of simulations, which is often,
at least for complex systems, decided at least in part by the time available.
However, for complex systems analytical calculations are often unfeasible.
Whether analytical calculation is possible also depends on which calcula-
tions are to be done. Simulation can be forced even in the cases where the
software would choose analytical computation.
When it comes to the model assumptions, the programs are quite sim-
ilar. By default, it is in both programs assumed that the components are
independent. In MIRIAM there are several ways of modeling dependencies
between components, but these will rapidly increase the simulation time.
In Relex, there are no such features. This is an advantage for MIRIAM. In
many ways, MIRIAM is more exible when it comes to assumptions on the
ow system modeled, while Relex is more exible when modeling repair and
maintenance. Again, the choice of software must depend on which features
are considered most important. Also for repair and maintenance modeling,
Relex and MIRIAM oers many similar features, but all in all there are
more pos
All in all, which software is preferable depends on the computations
to be done and which system features are considered most important. If

67
system ow and production availability is the important issue, MIRIAM
Regina would be the best software. If measures like cost and general system
availability are the important points, Relex oers more opportunities.

5.2 Comparison of the analytical methods


Four analytical methods for computation of availability and production
availability are presented in this thesis. Three of them are made for com-
putation of availability, only one for production availability.
Which method should be chosen depends of course on what is to be cal-
culated. The most exible method is the method presented in section 3.3
for computation of production availability. The exibility of this method is
due to the application of Markov Chains for computing the probability dis-
tribution for the ow through the system. Markov chains are very exible,
and can account for almost any assumption about the system, as long as it
is possible to nd the transition matrix for the system.
Markov chains can also be emplyed to compute availability, see [11].
This is not considered in this thesis, but would have the same advantages
as the computation of production availability by Markov chain methods.
The method for computing production availability from known avail-
ability is based on many assumptions which make the method less accurate
than one might wish. This method should not be used if other methods for
computing production availability are feasible.
For computing availability, three methods are presented in this thesis.
These are the ordinary computation of availability by employing the deni-
tions of availability together with the system function, presented in chapter
3.1, renewal methods presented in section 3.6 and quasi renewal methods,
presented in section 3.8.
The dierence in employing renewal and quasi renewal processes in stead
of the denitions of availability mainly lies in the fact that when applying
the denitions of availability, it is most often only possible to compute mean
availability or limiting availability. For renewal processes, the result for the
limiting availability is the same as the result obtained by employing the
denitions, as shown in [11]. It is however possible to compute or at least
estimate the renewal function in many cases. This will give the function of
availability in time, as is computed for the example used in this thesis in
section 3.7.1.
Quasi renewal processes are dierent in that they are made to compute
results for processes which are improving or deteriorating. This is not pos-
sible for ordinary availability computations as done here, and neither for
ordinary renewal processes. Deteriorating processes are a natural assump-
tion in the real world, as it would be very strange if components in a system
stayed as good as new throughout their lifetime when they are subject to

68
stresses from their environment.
When it comes to eort of calculation, the easiest alternatives in most
cases will be the ordinary calculation of availability, or the method from
[9] for computation of production availability. Computation of produc-
tion availability by employing the availability calculations done with other
methods is less accurate and more complicated. If the model assumptions
indicate that other methods should be used for availability calculations, the
amount of computation is not infeasible. For large systems, any analytical
computation would be time consuming.
The advantage of using analytical calculations over simulation is that the
accuracy of the results does not depend on the available time for simulation.
However, as the assumptions that have to be made about the system when
analytical calculations are to be done can be very limiting, the divergence
between the model and the real world system may become large.

5.3 Comparision of analytical and simulated re-


sults
Method Availability Condence interval
Relex simulated 0.979206 [0.97811, 0.98349]
Relex analytical 0.979256 -
MIRIAM 0.9790 -
Ordinary 0.97899 [0.97898, 0.97900]
Prod. av 0.979 -
Renewal 0.97903 [0, 1]
Quasi-renewal 0.968698 [0, 1]

Table 5.1: Comparision of the results for availability from the dierent
methods

The dierent computed values for the system availability are given in ta-
ble 5.1. All the values for computed availability are the same when rounded
to three decimal places, A = 0.979. The only exception is the quasi-renewal.
This was expected, as this measure assumes that the system is deteriorating
with each failure, while all the others assume a repair restores the system
to good as new status.
The condence intervals computed are based on dierent concepts. The
condence interval for the simulated availability from Relex is computed
from the dierent results obtained in the simulation iterations. The con-
dence intervals are based on the uncertainty in the input parameters. What
is shown by these condence intervals is that for renewal and quasi renewal
calculations, the uncertainty in the input parameters has much larger in-
uence than for the other methods. For the ordinary method of analytical

69
computation of availability, the condence interwall becomes very narrow,
which implies that uncertainty in the input parameters does not aect the
result very much.
The values for production availability are given in table 5.2. These mea-
sures vary somewhat more than the values for the availability. As MIRIAM
Regina and the method presented in section 3.3 are the only methods specif-
ically meant for computing production availability, these measures are con-
sidered to be the most accurate. They are also very similar. When rounded
to two decimal places, the results are the same, Aprod = 96.95. The methods
based on the availability for the single components underestimate this mea-
sure somewhat, and the method based on the simulated availability results
overestimate it slightly. Again, quasi-renewal gives a lower measure than
the other methods, which is to be expected. It is worth notice that the ca-
pacity calculation done by Relex underestimates the production availability,
the only method giving a smaller production availability is the calculations
based on the quasi renewal process. This may be due to the way in which
the ow is delegated through ow networks in the computations, see section
4.1.2.
The only method for which condence intervals can be computed for
production availability is the simulation done by MIRIAM. This is due to
the manner in which the computations are done in the other cases. In
Relex, one would assume that it would be possible to estimate a condence
interval for the capacity calculation if methods for such were implemented.
As they are not, and the results for the dierent simulation iterations are
not available in the software program, this cannot be done.
Propagation of uncertainty if also dicult for the analytical method
used for computing production availability. For the availability measures
employed to compute the probability distributions they might have been
calculated, but to propagate these through the capacity calculations is not
trivial.
It is worth noting that for all the analytical methods except quasi re-
newal, the estimated production availability lies within the condence in-
terval for the production availability computed by MIRIAM.
The system considered in this thesis was a very simple one. Due to
this even crude methods must be expected to give relatively good results.
For a more complicated system, the amount of work involved in computing
the production availability from methods meant to compute availability will
quickly become unmanagable. The results will probably deteriorate as the
system compexity grows, as there will be more inaccuracies.
The dierence between the methods for computation of production avail-
ability is the manner of computing the probability distribution for the ow.
Markov chains give, as stated in [9], opportunity for implementing many
features, by making the Markov chains more complex. The limitations of

70
Method Production availability Condence interval
Relex simulated 97.0233 -
Relex capacity 96.14075 -
MIRIAM 96.95 [0.96912, 0.96982]
Prod. av from av 96.935 -
Prod. av 96.94533 -
Renewal 96.9343 -
Quasi-renewal 95.3868 -

Table 5.2: Comparision of the results for production availability from the
dierent methods

this method are primarily due to the available time and resources for the
calculation. For the other methods, many assumptions must be done about
the model. This limits the accuracy of the results, as in many cases the
assumptions will not fully comply with the conditions in the real world
system.

71
72
Chapter 6

Conclusion
The concern of this thesis has been to present and compare dierent meth-
ods for computation of availability and production availability for a pro-
duction system assumed to handle a ow of some kind of uid. The same
methods might be applicable if the ow instead consists of some manner of
items being handled, but this must be considered in each individual case.
The thesis presents two software programs for reliability computations,
MIRIAM Regina, which is focused on computation of production availabil-
ity, and the RBD and OpSim modules of Relex Reliability Studio, which
among other things can be used to compute the availability of a system. In
addition, several dierent analytical methods for computation of availability
and production availability have been presented.
It has been shown that the results computed for a simple example are
quite similar for many of the methods presented. As the example is simple,
it is to be expected that even crude methods can obtain good results. For
the example used throughout this thesis, it is worth noting that many of
the results computed are within the condence intervals for the relevant
measure computed by simulation done by the softwares used in the thesis.
This supports the quality of the computed measures.
For availability computation, many methods can be employed, even
methods primarily made for computing production availability if they give
the amount of time with no ow through the system. This measure will be
comparable to the availability except in cases where availability computa-
tions are done in a manner so that ow can pass through the system when
the system is considered to be failed.
Availability is computed for the example system using several dierent
methods. As the example system is very simple, the assumptions implied
by the methods t the system well for all the methods in this thesis. For
the quasi renewal process, dierent assumptions must be made regarding
distribution of time to failure and repair, but otherwise the assumptions
are the same. The quasi renewal process is made to do computations on

73
systems with an increasing or decreasing failure or repair rate. The results
obtained from the quasi renewal process is thus quite dierent from the
results obtained with other methods.
The main source of error when modeling a system, is the assumptions
made during modeling. It is important when computations are made, and
especially when results are interpreted, that the assumptions made are kept
clear in mind. Calculations will only be accurate to the extent that the
assumptions match the real world situation for the system. This is espe-
cially important to remember when results are computed with the help of
software programs, where assumptions may not be clearly available unless
the documentation for the software is considered in detail. It is thus impor-
tant when using software programs, and especially when a person employs
a software program for the rst time, that time and eort is put into nding
the assumptions made by the software.
The same caution must be employed when using analytical methods.
The assumptions made are often more limiting for analytical computations
than for simulation.
Other than the assumptions regarding the model, the main sources of
uncertainty is uncertainty in the input data, and for simulation, the num-
ber of simulation iterations done. For some systems, the uncertainty due to
propagation of error through the method can be estimated and a condence
interval made based on these computations. In a simulation setting, con-
dence intervals will be made based on the results obtained in the dierent
simulation iterations. When doing simulation, propagation of uncertainty in
the input data comes in addition to the uncertainty due to simulation, and
methods exist for estimating the amount of uncertainty. However, these
methods must be implemented in the software used prior to simulation.
Otherwise, propagation of uncertainty cannot be estimated. As such meth-
ods are not implemented in MIRIAM or Relex, it is not possible to estimate
the propagation of uncertainty in the simulations done with these softwares.
To conclude, when doing availability or production availability calcula-
tions, it is important to choose a method which enables the model to be as
close to the real life system as possible. In most cases, this concern must be
weighed against the available time and resources for the computations. The
best choice of method will be the method which gives the most accuracy
within these bounds. In many cases, especially for complex systems, this
will be simulation. In many cases, analytical methods for availability calcu-
lation can be used to nd a quick but crude estimate of the availability of a
system. More complex analytical computations will primarily be of interest
in situations where the alternative is to implement the software for simu-
lation. If a software for sophisticated simulation of availability measures,
analytical calculations will often be more time consuming than simulation,
as both simulation and analytical calculations require a thorough under-

74
standing of the system on which calculations are to be done.

75
76
Bibliography
[1] Analytical methods versus simulation. Can be found at
http://www.relex.com/resources/art/art_rbd2.asp, last visited febru-
ary 22 2008.

[2] Help section of Relex Reliability Studio 2007. Help section of Relex
Reliability Studio 2007, included in the software.

[3] Miriam Regina User Manual , september 2006. Issue 2.0. Miriam Regina
version 1.03-build 209.

[4] Harold Ascher and Harry Feingold. Repairable systems reliability. Mod-

eling, inference, misconceptions and their causes. , volume 7 of Lecture

notes in statistics . Dekker, 1984.

[5] Aron Brall, William Hagen, and Hung Tran. Reliability block diagram
modeling - comparisons of three software packages. Reliability and
Maintainability Symposium, 2007. RAMS '07. Annual, pages 119124,

2007.

[6] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clif-


ford Stein. Introduction to Algorithms. The MIT Press, second edition,
2001.

[7] S. Dharmaraja and S.V. Amari. A method for exact MTBF evaluation
of repairable systems. Proceedings of the tenth international conference
on reliability and quality in design, 2004.

[8] Daniel Jacob and Suprasad V. Amari. Analysis of complex repairable


systems. Reliability and Maintainability Symposium, 2005. Proceedings.
Annual, pages 183189, 2005.

[9] Yoshio Kawauchi and Marvin Rausand. A new approach to production


regularity assessment in the oil and chemical industries. Reliability
engineering and system safety, (75):379388, 2002.

77
[10] M. Granger Morgan, Max Henrion, and Mitchell Small. Uncertainty. A

guide to dealing with uncertainty in qualitative risk and policy analysis.

Cambridge university press, 1990.

[11] Marvin Rausand and Arnljot Høyland. System Reliability Theory. Mod-
els, Statistical Methods, and Applications. Wiley Series in Probability

and Statistics. Wiley, second edition, 2004.

[12] Sheldon M. Ross. Introduction to probability models . Academic Press,


eighth edition, 2003.

[13] Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, and Keying


Ye. Probability and statistics for engineers and scientists. Prentice
Hall, seventh edition, 2002.

[14] Hongzhou Wang and Hoang Pham. A quasi renewal process and its
applications in imperfect maintenance. International journal of systems
science, 27(10):10551062, 1996.

[15] Hongzhou Wang and Hoang Pham. Reliability and Optimal Mainte-
nance. Springer Series in Reliability Engineering. Springer, 2006.

[16] NORSOK Standard Z-016. Regularity Management


and Reliability Technology. Rev. 1, desember 1998.
http://www.standard.no/pronorm-3/data/f/0/01/50/5_10704_0/Z-
016.pdf, last vistited june 22 2008.

78

You might also like