MonteCarlo Simulation
MonteCarlo Simulation
MonteCarlo Simulation
All availability predictions involve uncertainty, regardless of method for assessment. Estimation of
the uncertainty related to the results, will give the results a more useful value for the decision
makers.
There exists several computer programs designed for availability or production availability
assesment. Two of these are MIRIAM Regina, distributed by CognIt, and Relex Architect,
distributed by Relex Scandinavia AB. There are also a number of statistical methods available,
including renewal and quasi-renewal processes.
iii
4 Software programs for availability calculations 49
4.1 The Relex software . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.1 The principles of Relex . . . . . . . . . . . . . . . . . 49
4.1.2 Implementation of models in Relex RBD or OpSim . 50
4.1.3 Calculations available in Relex RBD and OpSim . . 51
4.1.4 Availability calculations . . . . . . . . . . . . . . . . 52
4.1.5 Algorithm for Monte Carlo simulation . . . . . . . . 53
4.1.6 Method for calculating steady state availability, fail-
ure frequency and MTBF . . . . . . . . . . . . . . . 54
4.1.7 Capacity calculations . . . . . . . . . . . . . . . . . . 54
4.1.8 Computation of condence intervals . . . . . . . . . 54
4.1.9 Comparison of Relex and other reliability softwares . 55
4.1.10 The example in Relex . . . . . . . . . . . . . . . . . 55
4.2 Description of the reliability software MIRIAM Regina . . . 59
4.2.1 General principles . . . . . . . . . . . . . . . . . . . 59
4.2.2 Implementation of a system in MIRIAM Regina . . . 59
4.2.3 The simulation algorithm . . . . . . . . . . . . . . . 61
4.2.4 Calculation of production availability . . . . . . . . . 61
4.2.5 Available calculations . . . . . . . . . . . . . . . . . . 61
4.2.6 Uncertainty measures . . . . . . . . . . . . . . . . . 62
4.2.7 Implementation of the example in MIRIAM Regina . 63
6 Conclusion 73
References 78
iv
Chapter 1
Introduction
This thesis concerns dierent methods for calculating availability and pro-
duction availability for production systems. Both analytical methods and
softwares for simulation will be considered.
The system type considered in this thesis is a production system where a
ow of some kind enters the system at one point, goes through some process
and leaves the system at another point. The starting point for this thesis
was considerations for production systems in the petroleum industries, with
liquid ows of oil, gas or water. In many cases, similar considerations may be
used for other kinds of ow, for example parts being produced in a factory.
What kind of ow goes through the system is not considered in this thesis,
but the underlying assumption is that of a liquid ow. In the whole thesis,
the calculated production availabilities are calculated as percent of full ow.
The focus of the thesis is on the presentation and comparison of the dif-
ferent methods for computation of availability and production availability.
The background for the problem formulation is the fact that several dier-
ent softwares for availability and production availability calculations exist
and are employed in the petroleum industries. It is of interest to be able to
compare results obtained from dierent softwares, and in some cases to be
able to compute measurements which are not automatically computed by
the software.
The softwares considered in this thesis are MIRIAM Regina, a software
distributed by CognIT and widely used in Norway, and the software package
Relex Reliability Studio 2007, distributed by Relex Scandinavia AB. In the
thesis, the softwares will be referd to as respectively MIRIAM and Relex.
The experience of the author of this thesis with both these softwares is
limited, and getting to know especially Relex, has been a part of the task
in the preparation of this thesis. The presentation of the softwares is as far
as possible based on documentation from the distributors of the software.
In addition to the softwares, several analytical methods for computing
availability and production availability are presented, as it may be of interest
1
to compare the results of analytical computations with the results from the
softwares, which are primarily based on simulation. A part of the thesis is
also a method for estimating production availability from computations of
availability. This is of interest primarily in the case when one does not have
access to a software or method meant for computing production availability.
In some cases, it might be easier to apply this method instead of employing
time and eort in implementing the system in another software or using
another method. In such cases it must be taken into consideration whether
the estimates are suciently accurate for their uses.
1.1 Abbreviations
DFR Decreasing Failure Rate
DFRA Decreasing Failure Rate in Average
FMECA Failure Modes, Eects and Criticality Analysis
IFR Increasing Failure Rate
IFRA Increasing Failure Rate in Average
iid independent, identically distributed
LCL Lower condence limit
MDT Mean Downtime
MooN M out of N (components)
MTBF Mean time between failures
MTTF Mean Time To Failure
MTTR Mean Time To Repair
NBU New Better than Used
NBUE New Better than Used in Expectation
NWU New Worse than Used
NWUE New Worse than Used in Expectation
RBD Reliability block diagram
UCL Upper condence limit
1.2 Thanks
The problem formulation for this thesis was formulated in cooperation with
Safetec Nordic A/S, and all software used during the process was made
available by them.
I wish to thank all the sta at the Trondheim oce of Safetec, and
especially Marit Saastad and Ragnar Aarø, who have helped and supported
me in many ways during the writing of the thesis. I also wish to thank
Arvid Næss at NTNU.
2
Chapter 2
3
where X(t) is the structure function of the system in question. The struc-
ture function is such that
(
1 if the system is functioning
X(t) = (2.2)
0 otherwise
This expression for the availability gives the availability at a specic time.
However, in many cases it will not be possible to determine this function.
Three other ways of calculating availability are of interest. These are
• limiting availability
and can be regarded as the average amount of time the system is functioning
during the interval in question, [11].
The long run average availability is the limit of the interval availability
when the lower time limit is 0 and the upper time limit goes to innity, that
is
1 t
Z
Aav = lim A(t)dt (2.4)
t→∞ t 0
In some cases the availability of a system will reach a steady state after
the system has been in operation for a time. In those cases the limiting
availability is of use. It is dened as
if the limit exists. All these denitions can be found in [11]. The limit-
ing availability is of interest as for many systems, the availability quickly
converges toward the limiting availability.
Availability measures the up time of the system as a whole. If any ow
at all goes through the system, the system will be available. However, as
ow through the system may vary signicantly while the system is available,
other measures are needed to accurately estimate the produced amount.
4
2.1.2 Production availability
Production availability is a concept related to availability. It is dened in
[16] as
The ratio of production to planned production, or any other
reference level, over a specied period of time.
Note 1: This measure is used in connection with analysis of
delimited systems without compensating elements such as sub-
stitution from other producers and downstream buer storage.
Battery limits need to be dened in each case.
Note 2: The term injection availability may be used meaning
the ratio of injection volume to planned injection volume.
Expressed mathematically, this is
Pproduced
Aprod = (2.6)
Preference
Where Aprod denotes the production availability.
Production availability is found by looking at the availability of the
system together with the capacity of the stages in the system. The estimated
produced amount is then compared to the reference level, which often is the
wanted or ideal production.
5
2.1.4 Modeling of repair
When availability is to be calculated, repair must be modeled. The most
common method is to assume that repair is perfect, that is, a component
is as good as new after it is repaired. This is however not an assumption
that ts well with the real world. In reality, a component can be close to as
good as new, as bad as it was immediately before the failure and resulting
repair, or anywhere in between. A component can even be in worse state
after a repair, if mistakes were done by the person
In some cases, both repair and replacement of components can be done,
and dierent assumptions can be assigned to the dierent maintenance ac-
tions.
Imperfect repair can be modeled in dierent ways. The most common
is the (p, q) rule of imperfect repair.
where 0 ≤ x0 < x. Thus, drawing a number from the tail of the distribution
is the same as drawing from the distribution as a whole. This fact also shows
as the exponential distribution has constant failure rate, see [11]
6
utmost care. When interpreting results, the assumptions made must be
considered.
In [4], 18 factors which are often ignored in models of repairable systems
are listed. Some of these factors can be put into categories as given below.
Other factors are not mentioned here, as they primarily concern reliability
calculations. For example, there is little meaning in assuming negligeble
repair time when availability is to be calculated.
Independence of components
Perhaps the most common and most simplifying assumption made, is that
of independence of components. To a certain extent, this is a plausible as-
sumption, as each component in a system in many ways work on its own.
On the other hand, there are many circumstances which may cause simul-
taneous failure of many components in a system, when several components
depend on the same supporting system or are subject to the same envi-
ronmental stress. Examples can be a failure in the electrical system or a
lightning strike. These eects, known as common cause failures, are not
taken into account when components are assumed to be independent. Such
failures may also cause normally redundant components to fail simultane-
ously. Also, failure of one component may cause other components to fail.
Perfect repair
It is very common to assume that repair restores the system to a good as
new state, again an assumption which is seldom true in the real world. A
repair is usually performed by a person, who may make mistakes. A repair
on a real system may rend the system good as new, as bad as it was prior
to the repair or anywhere in between. It may even be worse than before the
repair, or even be broken, if mistakes were made.
A repair may be incomplete, may cause damage or improve the system
beyond what was expected. New parts may have been damaged before they
are put into use, and so the repair using such parts will not be perfect. Also,
parts may be replaced prior to failure if other repairs are done which make
additional repairs easy to do.
An important aspect is that a preventive maintenance action may inu-
ence the steady state results. If maintenance are a part of the model, using
steady state results to do calculations on the system may not be a good
approximation. It may also be that the steady state implies a reliability or
availability which is lower than the required level for the system, and the
system must then be overhauled before it reaches the steady state.
7
Human error not taken into account
Failures due to human error are very often ignored when system reliability
and availability are considered. This is at least partly due to the diculty
of assessing the probability of such errors. Human error may inuence the
day to day operation of the system, but is more likely to occur during repair
or maintenance, and especially in unusual operating conditions, such as a
when dangerous faults occur. This may inuence the extent of damage
when a failure occurs, and through this the time to repair.
8
of y with respect to x evaluated for a baseline value x0 . The variance of the
output can then be estimated as
2
∂y
Var(y) ≈ Var(x) (2.8)
∂x x0
Both analytical methods and methods for simulated models exist. For
analytical methods, one of the most common methods is Taylor series ap-
proximation.
Let X = [xi ]ni=1 be a vector of input values, and let y = f (X). Let the
baseline values be equal to the expectation of the input values, so X 0 =
[x0i ]ni=1 = E(X) and y 0 = f (X 0 ). Then the Taylor series expansion of the
distance between the output and the expected output is
n n n 2
X ∂y 1 XX ∂ y
y−y 0 0 = (xi −x0i ) + (xi −x0i )(xj −x0j ) +. . .
∂xi X 0 2 ∂xi ∂xj X 0
i=1 i=1 j=1
(2.9)
Close to X 0 the higher powers of the series will be small. One can then use
only two terms of the expansion and get a good approximation.
n
0
X ∂y
E(y − y ) ≈ E(xi − x0i )
∂xi X0
i=1
n Xn
∂2y
1 X
+ E[(xi − x0i )(xj − x0j )] (2.10)
2 ∂xi ∂xj X0
i=1 j=1
n X n
∂2y
1 X
= Covar(xi , xj )
2 ∂xi ∂xj X0
i=1 j=1
9
then compute an approximation to the variance of y − y0 :
Var(y) = E[(y − y 0 )2 ]
n !2
X ∂y
≈ E (xi − x0i )
∂xi X 0
i=1
n n
XX
0 0 ∂y ∂y
= E[(xi − xi )(xj − xj )]
∂xi X 0 ∂xj X 0
i=1 j=1
Xn X n
∂y
∂y
(2.11)
= Covar(xi , xj )
∂xi X 0 ∂xj X 0
i=1 j=1
n
∂y 2
X
= Var(xi )
∂xi X 0
i=1
n X n
X ∂y ∂y
+2 Covar(xi , xj )
∂xi X 0 ∂xj X 0
i=1 j=i+1
ȳ − µ
Z= (2.12)
√σ
n
10
and the width of the condence interval is found by subtracting the lower
limit from the upper limit. This gives
σ σ
W = ȳ + z 2 √ − ȳ − z 2 √
α α
n n
(2.14)
σ
= 2z α2 √
n
For a given width W , the equation can be solved for n. The result is
α 2 α 2
z2 z
n=4 2
σ = 4 2 Var(y) (2.15)
W W
There are also methods for computing uncertainty importance for Monte
Carlo models, but these need to be implemented in the simulation software,
as it is necessary to know all input values, and these are drawn as the
simulation progresses and usually not stored as this would require very
much storage capacity.
One possible measure is the sample correlation. Let m be the number
of runs, and let xk be a single input value in run number k and yk be the
corresponding output. The correlation between these can then be computed
as Pm
(xk − x̄)(yk − ȳ)
U (x, y) = pPm k1 Pm (2.16)
2 2
k=1 (xk − x̄) × k=1 (yk − ȳ)
This can be computed for each input value. This assumes that each simula-
tion run has the same number of input values, which is not the case for many
simulation methods. One could for example do the calculations for the rst
input drawn for each item being simulated. To do the calculations at all, it
will be necessary to implement the method in the simulation methodology.
If the method is not implemented when the simulation is done, it cannot
be employed.
11
The assumptions in an analytical model will be fashioned to come close
to a real life system, but will to a large degree depend on what can be
calculated. For large systems, analytical calculations can be prone to round
o error.
12
Chapter 3
is
n
Y
X= xi (3.2)
i=1
For a parallel structure, the system will function as long as one or more
components are functioning. The structure function is given by
n
Y
X =1− (1 − xi ) (3.3)
i=1
13
The structure function can also be found from the cut sets of a system.
A cut set is a set of components so that if all components of the cut set are
failed, the system has failed. For the system shown in gure 3.1, the cut
sets are {1}, {2, 3, 4}, {5}, {6, 7}. The structure function can then be made
by considering the components in the cut sets to be in parallel, and the cut
sets themselves to be in series, as all cut sets must function, but only one
component in each cut set is needed for the system to function.
Availability for the components can be calculated in many ways, the
most common of which is to compute it as
MTTF
ai = (3.4)
MTTF + MTTR
which gives the mean availability. However, many other methods may be
used.
When analytical methods are used, it is assumed that the result is accu-
rate. Usually, no condence interval or similar measure is computed. The
accuracy of the result will depend on the degree to which the assumptions
of the method agree with the real world situation. This may lead to grave
errors if not handled with care, as many assumptions made in analytical
methods are completely implausible in the real world, as discussed previ-
ously. Another source of error in analytical calculations is inaccuracy in the
input data. This has also been discussed in previous sections.
Most methods presented in this thesis assume independent components.
If this assumption is to be avoided, possible methods are Markov chains
and simulation. Later sections will present both of these options.
14
combining next event Monte Carlo simulation with a ow algorithm for
calculation of production availability. Simulation is more complex than an-
alytical calculation if the simulation algorithm must be implemented before
simulation can start. If a software with the simulation algorithm already
implemented is available, simulation may be less complex than analytical
calculations.
2
6
1 3 5
According to the article [9], one of the advantages of the analytical meth-
ods is ease of calculation, but the advantage of simulation is the possibility
of incorporating complex maintenance and repair strategies in the model.
The reason for using Markov modeling in the method proposed in the arti-
cle is to make it possible to incorporate such strategies into the analytical
15
Subsytem 2
Subsytem 4
2
Subsytem 1 Subsytem 3 6
1 3 5
Figure 3.2: The example system with subsystems shown as red frames
16
tributions, the rule is
X
P (Z = z) = P (X + Y = z) = P (X = x)P (Y = y) (3.7)
x,y;x+y=z
17
State No. Description Capacity
1 All components working 100
2 Comp. 2 and 3 working, 4 failed 100
3 Comp. 3 and 4 working, 2 failed 100
4 Comp. 2 and 4 working, 3 failed 90
5 Comp. 2 working, 3 and 4 failed 40
6 Comp. 3 working, 2 and 4 failed 60
7 Comp. 4 working, 2 and 3 failed 50
8 All failed 0
chains are employed. Basic Markov chain theory can be found for example
in [11] and [12]. The transition diagram of the system is shown in gure
3.3. The transition matrix of the Markov chain is then
α1 λ3 λ1 λ2 0 0 0 0
µ4 α2 0 0 λ3 λ2 0 0
µ2 0 α3 0 0 λ4 λ3 0
µ3 0 0 α4 λ4 0 λ2 0
(3.9)
0 µ3 0 µ4 α5 0 0 λ2
0 µ2 µ4 0 0 α6 0 λ3
0 0 µ3 µ2 0 0 α7 λ4
0 0 0 0 µ2 µ3 µ4 α8
Where
−(λ2 + λ3 + λ4 )
α1
α2
−(λ2 + λ3 + µ4 )
α3
−(λ3 + λ4 + µ2 )
α4 −(λ2 + λ4 + µ3 )
(3.10)
=
α5
−(µ3 + µ4 + λ2 )
α6
−(µ2 + µ4 + λ3 )
α7 −(µ2 + µ3 + λ4 )
α8 −(µ2 + µ3 + µ4 )
Subsystem number three again consists of only one component, and is
treated just like subsystem one. Subsystem 4 consists of two components
in parallel, and has four states, as described in table 3.4.
The transition diagram is shown in gure 3.5, and the transition matrix
for subsystem 4 is
1 − (λ6 + λ7 ) λ7 λ6 0
µ7 1 − (λ6 + µ7 ) 0 λ6 (3.11)
µ6 0 1 − (λ7 + µ6 ) λ7
0 µ6 µ7 1 − (µ6 + µ7 )
18
2 5
1 3 6 8
4 7
19
λ
20
1 2
3 4
21
State No. Description Capacity
1 Both components working 100
2 Comp. 6 working, 7 failed 50
3 Comp. 7 working, 6 failed 60
4 Both components failed 0
where Pij denotes the transition probability from state i to state j . These
equations give rise to a system of linear equations, which can easily be
solved by mathematical software equipped for matrix computations. The
limiting probabilities can be interpreted as the amount of time the system
spends in the respective states. For subsystem 4, the equation is
−(λ6 + λ7 ) µ7 µ6 0 π1 0
λ7 −(λ6 + µ7 ) 0 µ 6 π2
0
= (3.13)
λ6 0 −(λ7 + µ6 ) µ7 π3 0
1 1 1 1 π4 1
When numbers are inserted and the equation solved (using the statistical
software R which includes facilities for matrix computations), the solution
is found to be
9.7966 · 10−1
π1
π2 6.5441 · 10−3
π3 = 1.3702 · 10−2
(3.14)
π4 9.1527 · 10−5
For the other subsystems, the calculations are done in exactly the same
manner. Subsystem 1, consisting of only one component has the limiting
probabilities
π1 0.9940
= (3.15)
π2 0.0060
22
Subsystem 2 gives the limiting probabilities
9.758788 · 10−1
π1
π2 2.791013 · 10−3
1.139663 · 10−2
π3
9.758788 · 10−3
π4
(3.16)
π5 = 2.791013 · 10−5
π6
3.259435 · 10−5
π7 1.139663 · 10−4
π8 3.259435 · 10−7
and for subsystem 3, the result is
π1 0.9850
= (3.17)
π2 0.0150
With the results above, the probability distributions for the ow through
each subsystem is determined. Now the subsystems must be merged, using
the rules stated in section 3.3 above. First, subsystems one and two are
merged. The possible amounts of ow for subsystem 1 are 0% and 100%.
Subsystem two has the possibilities 0%, 40%, 50%, 60%, 90% and 100%.
(Subsystem two in reality has larger capacity, but will never recieve ow
greater than 100%, so this is ignored). The subsystems are connected in
series and have a nite number of possible amounts of ow, so the proba-
bility distribution of the ow is determined by equation (3.5). X is the ow
through the rst subsystem, Y is the ow through the second.
23
P (Z = 90) = P (X = 90)P (Y ≥ 90) + P (Y = 90)P (X > 90)
= 0 + 9.7588 · 10−3 · 0.9940 (3.22)
−3
= 9.7002 · 10
Table 3.5: The probatility distribution for the ow through the merging of
subsystem 1 and 2
and the result is given in table 3.6. At last, the two merged systems are
Table 3.6: The probability distribution for the ow through the merging of
subsystem 3 and 4
merged, and the result for the whole system is obtained. It is given in table
3.7.
The production availability of the system then becomes
Aproduction =40 · 2.7324 · 10−5 + 50 · 6.5180 · 10−3 + 60 · 1.3448 · 10−2 + 90 · 9.3607 · 10−3
+ 100 · 0.94969 = 96.94533
(3.24)
24
Amount of ow Probability
0 2.1000 · 10−2
40 2.7324 · 10−5
50 6.5180 · 10−3
60 1.3448 · 10−2
90 9.3607 · 10−3
100 0.94969
Table 3.7: The probability distribution for the ow through the whole sys-
tem.
If one compares this result to the result obtained in the section above,
the results are quite similar. The availability of the system can be computed
as the amount of time the ow is larger than 0, that is
3.4 %availability
An alternative and less accurate approach to estimating the produced amount
from a system is to compute the amount of time when a given percentage of
the demanded ow is available. This gives an idea of how much dierence
there is between the availability of a system and the production availability.
For systems with parallel structures, dierent congurations are possi-
ble. If a system has three parallel streams, one, two or all three may have
to work for the system to be functioning. Generally, such congurations are
denoted by the expression M ooN , interpreted as M out of N components
must function for the system to function. A system with tree components
in parallel can thus have the congurations 1oo3, 2oo3 or 3oo3. Availability
for the system can be computed for all such congurations.
If considering the system of the example above, shown in gure 3.1, with
data given in table 3.1, it is easily seen that to be sure of having 100 percent
ow passing through the system, all components must be operational. 90
percent ow is ensured if the rst parallel has conguration 2oo3 and the
second conguration 3oo3. If one has conguration 1oo3 and 1oo2, 40
percent ow is ensured, but 50 or 60 percent is more likely. These dierent
availabilities can be computed by applying the structure function of the
system. The structure function for the system with conguration 3oo3 and
2oo2 respectively for subsystem 2 and 4, is given by
A = a1 a2 a3 a4 a5 a6 a7 (3.26)
25
For conguration 2oo3 for subsystem 2 and 2oo2 for subsystem 4, the
availability is given by
26
is no way there can be ow going through the system when availability
calculations consider the system to be failed.
If a more accurate calculation is needed, the system must be considered
in more detail. Diculties arise when there are several alternate routes
for the production ow with dierent capacities. It is then necessary to
determine the average amount of time when each of these routes are being
used.
When computing the availability of the system with conguration 1oo3,
the result is the amount of time when 0, 1 or 2 streams are failed, that
is, 1, 2 or 3 streams are working. Using conguration 2oo3 gives the time
when 2 or 3 stream are working, and conguration 3oo3 gives the time
when 0 streams have failed, and consequently all streams are working. By
calculating all these availabilities and looking at the dierences between
them, one can nd the amounts of time when exactly 1, 2 or 3 streams are
working.
Assuming that the ratio of time for these states is constant whether the
system is working or not, one can nd the probability of 1, 2 or 3 streams
working given that the system is working. This is a natural assumption as
long as the components all have time measured as either calendar time or
running time.
For three components in parallel, denoted 1, 2 and 3, with availability
a1 , a2 and a3 , the relevant expressions can be found by rst determining
the structure function for each of the components and then inserting the
availability of the components instead of the state variables x1 , x2 and x3 .
The structure functions are:
Conguration 1oo3:
Conguration 2oo3:
X(t) = x1 x2 x3 (3.31)
27
3.5.1 Example
An easy example of the method described above. The system consists of
seven components, for which data is given in table 3.1. A diagram showing
the ow through the system is shown in gure 3.1. The capacities are
given in percent of required ow. It is assumed that all components are
independent. Further, it is assumed that the steady state availabilities can
be employed.
The steady state availability of the system is easily calculated from the
structure function together with the denition of the steady state availabil-
ity. If the second subsystem is assumed to have conguration 1oo3 and the
fourth subsystem has conguration 1oo2, the structure function is
When the steady state availabilities given in table 3.1 are put into this
equation, the result is A = 0.97899.
To compute the production availability of the system, the system is rst
decomposed into subsystems which are in series with each other. Thus,
there are 4 subsystems, the rst consisting of only component number 1,
the second of the three components 2, 3 and 4 which are in parallel with
each other, the third being only component number 5 and the last consisting
of components number 6 and 7, which again are in parallel. This is shown
in gure 3.2. When decomposing into such subsystems, it is important
that each component in the system only is part of one subsystem, as one
otherwise will have dependencies between the subsystems. In this case this
is no problem.
For the rst subsystem, calculating production availability is easy, as
the component will either let through 100% of the wanted production, or
none.
For the subsystem consisting of three components in parallel, the situa-
tion is that if all three components function, there is superuous capacity.
If only two streams are available, one may or may not have the demanded
capacity, depending on which streams are available. If only one stream is
available, how much capacity is available will also depend on which stream
is functioning.
To calculate the availability of the streams, rst calculate the availability
if the system has conguration 1oo3. This is given by
28
and for conguration 3oo3, which equals having a series system, as all com-
ponents must function for the system to function,
By looking at the dierences between these, one can nd the amount
of time when 1, 2 or 3 streams are available. Conguration 1oo3 gives the
amount of time when 1, 2 or 3 streams are available, conguration 2oo3
gives the availability of 2 or 3 streams and conguration 3oo3 gives the
availability of 3 streams. Thus 3 streams are available
that is they are available 2.4996% of the time, and only one stream is
available
A1oo3 − A2oo3 = 3.883 · 10−6 (3.38)
To decide the amount of time each stream or combination of streams
is available when the dierent streams have dierent availabilities, the pro-
portion of the availability of the relevant stream or combination of streams
to the availability of all the streams is computed. When one stream is
available, these proportions are
0.988
s1 = = 0.332
0.988 + 0.99 + 0.997
0.99
s2 = = 0.333 (3.39)
0.988 + 0.99 + 0.997
0.997
s3 = = 0.335
0.988 + 0.99 + 0.997
When two streams are working, the proportions are
0.988 · 0.99
s1,2 = = 0.3315
0.988 · 0.99 + 0.99 · 0.997 + 0.988 · 0.997
0.99 · 0.997
s2,3 = = 0.3345 (3.40)
0.988 · 0.99 + 0.99 · 0.997 + 0.988 · 0.997
0.988 · 0.997
s1,3 = = 0.334
0.988 · 0.99 + 0.99 · 0.997 + 0.988 · 0.997
The availability of the third subsystem, which consists of a single com-
ponent is given the data table 3.1. By calculating these numbers together,
one gets the result given in table 3.9. For example, 40% ow will be avail-
able if the rst and third subsystem is functioning, and only component
29
number 2 out of the tree components of subsystem number 2. The equation
is
A40 = a1 · a5 · a1oo3−2oo3 · s1
= 0.994 · 0.985 · 3.883 · 10−6 · 0.332 (3.41)
−5
= 1.328 · 10
A2oo2 = 0.979
A1oo2−2oo2 = 0.0209
(3.42)
s4 = 0.498
s5 = 0.502
Combined with the dierent amounts of ow available from upstream, the
result for availability of dierent amount of ow becomes as described in
table 3.10. When calculating these availabilities, it is important to remem-
ber that if 90% ow comes from upstream, it may result in 50%, 60% or
90% ow downstream, depending on which components are available in the
subsystem.
The average production availability of the system is then
30
Flow available Availability
40 1.3279 · 10−5
50 0.0102
60 0.0103
90 0.00785
100 0.951
The usual computation of the availability if one assumes that the rst
parallel has conguration 1oo3 and the second 1oo2, is
Multiplied with the full ow, one gets the result that the production avail-
ability is 93.5 percent. This is a conservative estimate as it only accounts
for the time when full ow can pass for the system. The actual production
availability will lie somewhere between these two bounds. For organisations
there may be much money saved by having more accurate estimates.
31
Where α is an input parameter. The derivation of the structure function
with respect to the dierent component values are given by
∂y
= (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − (1 − x6 )(1 − x7 ))
∂x1
∂y
= x1 x5 (x6 + x7 − x6 x7 )(1 − x3 − x4 + x3 x4 )
∂x2
∂y
= x1 x5 (x6 + x7 − x6 x7 )(1 − x2 − x4 + x2 x4 )
∂x3
∂y
= x1 x5 (x6 + x7 − x6 x7 )(1 − x2 − x3 + x2 x3 ) (3.47)
∂x4
∂y
= x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))(1 − (1 − x6 )(1 − x7 ))
∂x5
∂y
= x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − x7 )
∂x6
∂y
= x1 (1 − (1 − x2 )(1 − x3 )(1 − x4 ))x5 (1 − x6 )
∂x7
For the availability calculations given above, the input parameters were
MTTF and MTTR. If we let αi be the MTTF, and βi the MTTR, the
component availabilities which were put into the structure function are given
by
αi
ai = (3.48)
αi + β i
∂ai 1 αi
= −
∂αi αi + βi (αi + βi )2
(3.49)
∂ai αi
=−
∂βi (αi + βi )2
32
rameters. When values are inserted, the results are
∂y
= 0.98490
∂x1
∂y
= 2.9370 · 10−5
∂x2
∂y
= 3.5244 · 10−5
∂x3
∂y
= 1.1748 · 10−4 (3.50)
∂x4
∂y
= 0.99390
∂x5
∂y
= 0.0068536
∂x6
∂y
= 0.013707
∂x7
∂a1
= 1.1857 · 10−5
∂α1
∂a2
= 1.8999 · 10−5
∂α2
∂a3
= 2.4507 · 10−5
∂α3
∂a4
= 4.0584 · 10−6 (3.51)
∂α4
∂a5
= 2.4267 · 10−5
∂α5
∂a6
= 2.7232 · 10−5
∂α6
∂a7
= 1.0964 · 10−5
∂α7
33
and for the derivatives with respect to the MTTR, the results are
∂a1
= −0.0019762
∂β1
∂a2
= −0.0016284
∂β2
∂a3
= −0.0024507
∂β3
∂a4
= −0.0014204 (3.52)
∂β4
∂a5
= −0.0016178
∂β5
∂a6
= −0.0019452
∂β6
∂a7
= −0.0016447
∂β7
To nd the variance of y , equation (2.11) is employed. As the com-
ponents are assumed to be independent, only the variance terms of the
equation are dierent from zero, and the equation becomes
n n
∂y 2 ∂y 2
X X
Var(y) = Var(αi ) + Var(βi ) (3.53)
∂αi X 0 ∂βi X 0
i=1 i=1
The variances of the parameters are given in table 3.11. When the values
are put into the equation above, the result is
34
Component MTTF Var(MTTF) MTTR Var(MTTR)
1 500 25 3 0.5
2 600 22 7 0.7
3 400 16 4 0.3
4 700 30 2 0.2
5 600 18 9 0.9
6 500 15 7 0.5
7 600 20 4 0.8
Table 3.11: The variances of the MTTR and MTTF for the example data
and the renewal function W (t) and density w(t) = W 0 (t), where
35
to repair by another. In this situation, a renewal is considered to happen
when a component is returned to operating condition, that is,
Ti = Ui + Di (3.59)
T1 = Z1
T2 = αZ2
T3 = α2 Z3 (3.60)
....
..
Tn = αn−1 Zn
36
each repair. One may of course assume that the repair rate stays the same
by letting α = 1.
gives the Laplace transform of the availability. In principle, any failure and
repair distributions can be entered into this result to obtain an expression
for the availability. In practice, it will sometimes be very time consuming.
For the case of exponential time to failure and repair, the result can be
calculated relatively easily, as shown in [11]. Let the failure distribution be
fU (t) = λe−λt (3.64)
and let the repair distribution be
fD (t) = µe−µt (3.65)
The Laplace transforms of these are
λ
fU∗ (s) =
λ+s (3.66)
∗ µ
fD (s) =
µ+s
When putting this into (3.63) one gets
λ
1− λ+s
A∗ (s) = λ µ
s(1 − · µ+s )
λ+s (3.67)
µ 1 λ 1
= · + ·
λ + µ s λ + µ s + (λ + µ)
From tables of Laplace transforms one can nd that
−1 1
L =1 (3.68)
s
37
and
−1 1
L = eαt (3.69)
s−α
so when inverting A∗ (s) one gets
µ λ −(λ+µ)t
A(t) = ·1+ e (3.70)
λ+µ λ+µ
3.7.1 Example
The method above is used for the example system shown in gure 3.1 with
data as shown in table 3.1. Table 3.12 shows the availability functions for
the components in the example.
Table 3.12: The availability functions for the components in the example
Next, these functions must be inserted into the structure function to nd
the availability function for the whole system. However, it is probably easier
to insert the relevant time before the numbers are put into the structure
function, as one otherwise has a large amount of calculation to do. In this
case, it is wanted to calculate the availability at time t = 10000. When this
is inserted into the availability functions given in table 3.12, the results are
a1 (10000) = 0.99398
a2 (10000) = 0.98846
a3 (10000) = 0.99010
a4 (10000) = 0.99715 (3.71)
a5 (10000) = 0.98505
a6 (10000) = 0.98621
a7 (10000) = 0.99336
38
These numbers are of course approximate. It is likely that the availabilities
have converged or are close to converging to the limiting availability at time
t = 10000. When the numbers are inserted into the structure function of
the system, the resulting availability is
x P(X = x)
0 0.020969
40 1.17072 · 10−6
50 0.0098786
60 0.0099510
90 0.0077358
100 0.95147
Table 3.13: Probability distribution for the ow through the system calcu-
lated with a renewal process.
∂A
(3.74)
∂ai
∂ai
(3.75)
∂αi
where αi is an input parameter, are not the same. The derivatives in this
case are
∂ai µi 1 − tλi λi
=− + − e−(λi +µi )t
∂λi (λi + µi )2 λi + µi (λi + µi )2
(3.76)
∂ai 1 µi λi tλi −(λi +µi )t
= − − + e
∂µi λi + µi (λi + µi )2 (λi + µi )2 λi + µi
39
These derivatives are to be evaluated in the expected values of the pa-
rameters µi and λi . t is considered to be a constant, and is set to the value
t = 10000 as this is the value used in the calculations of the availability.
New values must also be calculated for the derivatives in equation (3.47).
The derivatives evaluated at the expected values are given in tables 3.14
and 3.15
Component
h i h i h i
∂ai ∂ai ∂A
∂λi X ∂µi X ∂ai X
0 0 0
1 −2, 9939033 0, 0181449 0, 98495948
2 −6, 8324911 0, 0797920 0, 0000276233
3 −3, 9211842 0, 0392118 0, 0000321993
4 −1, 9886089 0, 0056874 0, 0001118503
5 −8, 8210372 0, 1339194 0, 9938886620
6 −6, 8014269 0, 0951249 0, 0065013547
7 −3, 9470907 0, 0263666 0, 0135020604
Table 3.14: The derivatives evaluated at the expected value of the param-
eters.
Component
h i h i
∂A ∂A
∂λi ∂µi
1 −2.9489 0.017872
2 −1.8873 · 10−4 2.2041 · 10−6
3 −1.2626 · 10−4 1.2626 · 10−6
4 −2.2242 · 10−4 6.3614 · 10−7
5 −8.7671 0.13310
6 −0.044219 6.1845 · 10−4
7 −0.053294 3.5601 · 10−4
40
3.8 Quasi-renewal processes
For quasi-renewal processes, results for availability of several models are
shown in [15].
The model most similar to the case in question is the model described
in [15, Section 4.2.3]. The assumptions made for this model are:
• Upon failure it is imperfectly repaired, in the sense that the next time
to failure will be a fraction α of the previous time to failure
Where µ is expected life time and η is expected repair time of a new unit.
If maintenance is not taken into account, the availability is given by
µ(1−αk )
1−α
A(k; α, β) = µ(1−αk ) η(1−β k )
(3.79)
1−α + 1−β
This is found by using the principles for renewal-reward which can be found
in [11].
According to [14], the assumptions for this formula are that the planning
horizon is innite, the failure rate is continuous, monotonously increasing
and dierentiable, the cumulative distribution of the rst time to failure
is absolutely continuous, F (0) = 0 and the unit begins to operate at time
0. These assumptions are true for the Weibull distribution with shape
parameter 1.5.
41
For a such a distribution, the probability distribution is
1.5
f (t) = 1.5λ1.5 t0.5 e−(λt) (3.80)
3.9 Example
For a Weibull distribution with shape parameter 1.5, the probability distri-
bution is
1.5
f (t) = 1.5λ1.5 t0.5 e−(λt) (3.83)
and the cumulative distribution is
1.5
F (t) = 1 − e−(λt) (3.84)
42
Component k Availability
No 1 20 0.99050
No 2 17 0.98297
No 3 25 0.98229
No 4 15 0.99597
No 5 17 0.97820
No 6 20 0.97812
No 7 17 0.99019
And when the numbers for the availability is put into this, the resulting
availability for the system is
Next, the ratio between the availabilities for the dierent possibilities
for on line components when not all streams are working. For the second
subsystem, the possibilities are when one stream are working
a2
P (comp 2 working) = = 0.331947
a2 + a3 + a4
a3
P (comp 3 working) = = 0.331717 (3.88)
a2 + a3 + a4
a4
P (comp 4 working) = = 0.336337
a2 + a3 + a4
43
Subsystem Streams available Availability
1 1 0.99050
2 1 1.5 · 10−5
2 2 3.8314 · 10−2
2 3 0.96167
3 1 0.97820
4 1 3.127 · 10−2
4 2 0.96852
Table 3.18: The availability of the dierent number of streams for the sub-
systems.
For the forth subsystem, the ratios are as follows when one stream is
working:
a6
P (comp 6 working) = = 0.496934
a6 + a7
a7 (3.90)
P (comp 7 working) = = 0.503066
a6 + a7
From these numbers, the probability distributions for the dierent amounts
of ow through the subsystems can be set up. For subsystems 1 and 3, the
distributions are trivial, as they only have two possible states.
For subsystem number 2, the probability distribution is given in table
3.19, and for subsystem number 4, it is given in table 3.20. The merging
rules described in chapter 3.3 are then used to obtain the results for the
whole system. The resulting probability distribution for ow is given in
table 3.21.
The expected production availability is then
44
x P (X = x)
0 1 · 10−6
40 4.9792 · 10−6
50 5.0451 · 10−6
60 4.9758 · 10−6
90 1.2833 · 10−2
100 1.2657 · 10−2
110 1.2824 · 10−2
150 0.96167
Table 3.19: The probability distribution of the ow through the second
subsystem.
x P (X = x)
0 2.1 · 10−4
50 1.5539 · 10−2
60 1.5731 · 10−2
110 0.96852
Table 3.20: The probability distribution of the ow through the fourth
subsystem.
x P (X = x)
0 0.031297
40 4.8234 · 10−6
50 0.015060
60 0.015246
90 0.012043
100 0.92635
Table 3.21: The probability distribution for the ow through the system
45
interest are
1−αk λi (1−αk )2
∂ai 1−α (1−α)2
= λi (1−αk ) k) − 2 (3.92)
∂λi
1−α + µi (1−β
1−β
λi (1−αk )
+ µi (1−β k )
1−α 1−β
and
1−αk 1−β k
∂ai λi 1−α 1−β
= − 2 (3.93)
∂µi λi (1−αk ) µi (1−β k )
1−α + 1−β
The results when numbers are inserted are shown in tables 3.22 and 3.23
Component
h i h i h i
∂ai ∂ai ∂A
∂λi X ∂µi X ∂ai X
0 0 0
1 1, 8822521 −0, 0114076 0, 9779888477
2 4, 6346487 −0, 0541249 0, 0000691373
3 2, 1944249 −0, 0219442 0, 0000664827
4 1, 4081286 −0, 0040272 0, 0002921609
5 5, 9969422 −0, 0910445 0, 9902861927
6 4, 3010465 −0, 0601545 0, 0095049671
7 2, 6688100 −0, 0178277 0, 0211996616
Component
h i h i
∂A ∂A
∂λi ∂µi
1 1.8409 −0.011157
2 4.2042 · 10−4 −3.7420 · 10−6
3 1.4589 · 10−4 −1.4589 · 10−6
4 4.1140 · 10−4 −1.1766 · 10−6
5 5.9387 −0.090161
6 0.040881 −5.7177 · 10−4
7 0.056579 −3.7795 · 10−4
The variances are given in table 3.11. When numerical values are in-
serted, the variance becomes Var(y) = 719.64 and the standard deviation is
σ = 26.83. Also in this case the variance is large, resulting in a condence
46
interval of small interest, as it will include all possible values for the avail-
ability. Again, one must conclude that this method may not be preferrable
if other options are available.
47
48
Chapter 4
49
availability calculations, at least some measures may be calculated analyti-
cally also for repairable systems, as the example later in this chapter shows.
Due to the constraints on analytical methods, they will be used relatively
seldom.
When simulation is used, it is possible to calculate with repair resources,
costs, switch mechanisms, conditional repair policies and other factors com-
mon in a real world system.
For the purposes of this thesis, the systems will be modeled as relia-
bility block diagrams. This modeling methodology focuses on success of
the system, that is, there being an operational path through the system.
The method generally does not consider the ow through the system. In
Relex it is possible to assign capacities to the blocks of the RBD, and get
an estimate of the amount of ow through the system at dierent times.
Relex computes the dierent parameters of interest at a predetermined
number of points during the interval of the calculation. The number of
points is chosen by the user. This makes it possible to obtain results for the
development of the system during the simulated time period.
50
repair can be modeled by the (p, q)-rule.
Generally, very complex maintenance strategies can be modeled, and
one may come quite close to the strategies for a real life system.
Relex OpSim also contains the opportunity of optimizing the number
of spare parts available and maintenance intervals with respect to cost or
downtime of the system. If optimization is done, achieved availability can
be calculated.
Distribution
Exponential
Lognormal
Normal
Rayleigh
Time independent
Weibull
Uniform
Constant time
51
many points the calculation results will be shown for. For simulation one
decides how many replications are to be done and how many failures the
system must experience to reach the steady state. The random number seed
must also be specied. If two calculations are done with the same seed, the
simulation results will be identical.
It is not possible to specify sensitivity for the calculation, but the results
are shown with good sensitivity by default. Condence level for calculation
of condence intervals can be specied. Condence intervals can be calcu-
lated for reliability, availability, MTBF and MTTF.
If optimization is to be done, it is also specied in the calculation di-
alog. Optimization can be done for maintenance intervals and spare parts
with respect to cost or downtime. Costs can be specied extensively when
data is entered into the model, from cost of repairs, spare parts and other
resources to general downtime costs and costs for transporting and storing
components.
52
These are basically the same equations as the ones used when applying
the structure function to calculate availability, as described in chapter 3.1.
If the availability is calculated by simulation, the availability is calcu-
lated from the simulated failure and repair histories of the components.
6. Find the vector of system state changes, use it to calculate the required
measures for each simulation
53
4.1.6 Method for calculating steady state availability, fail-
ure frequency and MTBF
This method is described in [7], and in [8] it is stated that the method has
been implemented in Relex. According to [7], the method applies to general
failure distributions, and can be used with implicit decomposition methods.
The algorithm is as follows, when φi denotes the failure rate and ψi denotes
the repair rate of component i:
• Continue until the top level is reached. Then, the MTBF is given by
MTBF = ν1 .
54
for reliability and availability if repair is not taken into account. Otherwise,
the condence interval for availability assumingly is computed in some other
manner.
It is important to remember that Relex considers the availability of the
system as a whole, not of the components of the system. Downtime comes
only when a cut set of the system has failed, and ends as soon as the system
is up again.
For MTTF and MTBF, the condence intervals are computed by esti-
mating the mean and variance of the time to failure of the system from the
times computed at each simulation.
55
Figure 4.1: The model as shown when implemented in Relex
Measure Value
MTTF 266.772885
Steady state availability 0.979256
Availability (at time 10000) 0.979256
Mean availability (at time 10000) 0.979271
Failure frequency (at time 10000) 3626.067796
Capacity (at time 10000) 95.94
Mean capacity 96.140747
Table 4.2: The results obtained with analytical calculations performed with
Relex OpSim
Measure Value
MTTF 275.451966
Steady state availability 0.979206
Availability (at time 10000) 0.9808
Availability LCL 0.97811
Availability UCL 0.98349
Mean availability (at time 10000) 0.979207
Failure frequency (at time 10000) 3636.1
Capacity (at time 10000) 96.328333
Mean capacity 96.140972
Table 4.3: The results obtained with Monte Carlo simulation performed
with Relex OpSim
56
As can be seen, the steady state availability is very similar, and the
analytically calculated availability at time 10000 lies inside the condence
interval of the simulated availability. The mean availability is also very
similar, as is the mean capacity. The MTTF and failure frequencies are
however somewhat dierent.
To be able to estimate the production availability of the system, calcu-
lations were done with dierent congurations of the parallel subsystems.
For the above calculations, the congurations were respectively 1oo3 and
1oo2. In table 4.4, all the dierent results for the steady state and mean
availability are summed up. These numbers can be used to compute the
dierence in availability for dierent congurations, and this can again be
used to calculate an approximation of the production availability.
Table 4.4: Mean and steady state availability for dierent congurations of
the parallel subsystems
Second subsystem:
57
In the current situation, there are no calculations for the subsystems
alone, only for the system as a whole. For all calculations for the second
subsystem, the conguration for the fourth subsystem is set to be 1oo2.
Thus all the time the system is available and the current conguration of
the second subsystem is available is accounted for. As the subsystems are
connected in series, the time when not all subsystems are online is not in-
teresting. To nd the time when only one stream is online in the fourth
subsystems for each of the conguration of the second subsystem, it is as-
sumed that the percentage of time with only one stream available is the
same independent of the conguration of the second subsystem. As the
components are assumed to be independent, this is a logical assumption.
The percentage of time with only one stream available in the second sub-
system is
Table 4.5: The availability of dierent number of streams for the subsys-
tems, with the average capacity for the relevant number of streams.
58
The capacities for the dierent combinations of available streams are shown
in table 4.5. The production availability can from these numbers be esti-
mated to
100·(0.936365 + 0.022916) + 55 · (0.019325 + 4.7296 · 10−4 ) + 50 · (1.2443 · 10−4 + 2.5681 · 10−6 )
= 97.0233
(4.10)
59
the production availability will be less than 100% even if the system was
working perfectly during the whole simulation.
When implementing a network in MIRIAM Regina, one rst denes all
system stages. Each system stage can contain several items, but the ow
capacity is dened in the system stage. Failure and repair data is dened
in the items. A component in a system to be modeled will typically be
represented by an item, while a system stage will be a collection of related
components with the same ow capacity.
It is also possible to dene storage tanks in the system. These can
be dened to compensate for lack of upstream ow during downtime. If
a system has several discharge points, they can be dened to compensate
for each other at times when some discharge points achieve less than the
demanded ow.
As the ow through the system is the central object of modeling, there
are many opportunities for specifying properties of the ow. Several ows
can go through the same system. Flows can be split in system stages so that
two ows go out where only one went in (for example, in a seperator gas
and oil will be seperated into two dierent ows, and this can be modeled).
Quality constraints can also be dened. Proportional ow rules makes it
possible to dene relative amounts of ow through parallel stages.
Maintenance can be dened for the system. Several maintenance pro-
grams can be dened, and system stages choosen to be members of the
program.
For repair and maintenance, spare parts and other required resources can
be dened, with mobilization time for the resources. The number of avail-
able resources can be limited. Maintenance and repair can be implemented
to take place immediately or be postponed until a zero ow opportunity. A
maximal deferal time must then be stated. However, MIRIAM has no way
of determining the length of the zero ow opportunities in advance, so this
may also cause downtime for the streams.
It is also possible to dene objectives for the model. These are not ab-
solute requirements, but goals to be achieved if possible. For example, it
can be an objective that one specic discharge point should always recieve
as much ow as required. The objectives are dened in a prioritized list. If
objectives conict, the objective highest on the list is prioritized. Another
modeling feature is simulation rules, which can be used to model dependen-
cies. For example, it can be a rule that faliure of one item causes failure of
other items. If rules conict during simulation, this may cause unwanted
results, see [3].
When modeling a system in MIRIAM Regina, it is important to keep
the system as simple as possible to keep the simulation time down. Every
time a change happens in the system, the ow algorithm is called, and if
there are many limitations and objectives in the system, the ow algorithm
60
will be time consuming.
A useful feature in MIRIAM is the possibility of modeling auxilary sys-
tems, that is, systems which are not a part of the ow network, but on which
items in the ow network depend. Examples can be electric system or water
distribution. Items can be modeled to fail or have degraded operation when
the required amount is not produced by the auxilary system.
61
through the discharge points. The amount of time when more than 0%,
20%, 40%, 60%, 80% and 100% of system ow is available are also com-
puted automatically for each boundary point. From these, % availability
can easily be computed. Deliverability is computed in accordance with the
denition in NORSOK Z-016, given that all possibilities for substitution
from others and buer storage are modeled. Otherwise, these aspects of de-
liverability must be considered separately for the measure to comply with
the denition, see [3].
On stream availability is computed as the percentage of time the ow
through the system is larger than 0. This measure can be compared to the
availability calculations done with other methods.
In addition to the availability measures, it is possible to obtain numbers
for most of the features modeled in the system, such as amount of spare parts
used and to which extent rules have been fullled. A blaming algorithm is
also available. It assigns blame for downtime to the dierent components.
However, it can not fully be relied on as in some cases, the wrong component
may be blamed for the downtime, see [3].
62
4.2.7 Implementation of the example in MIRIAM Regina
The example used in previous sections was implemented in MIRIAM Regina.
Figure 4.2 shows the ow diagram as it is shown in MIRIAM.
comp2
comp6
comp7
comp4
Data was as shown in table 3.1, and exponential distribution was as-
sumed both for the failure and repair time.
Simulation was done over 10000 system hours. 1000 iterations were
performed. The results are shown in tables 4.6 and 4.7.
Measure Value
Production availability 96.95%
On stream availability 97.90%
Deliverability 96.95%
Demand availability 94.99%
63
to production availability, as neither storage units nor substitution is mod-
eled. For the production availability, the variance can also be calculated
from the results from each simulation, by applying the standard formula for
estimating variance,
n
1 X
Var(y) = (y − ȳ)2 (4.11)
n−1
i=1
64
Chapter 5
65
where the focus is on system ow, the system is modeled in two stages.
First the ow network is implemented. The ow network consists of system
stages in which the types of ow and capacity is decided. Each system stage
can then have a number of items, which are the components of the system.
Each item has separate failure and repair data. Maintenance is however
implemented for system stages.
In Relex, each component must be modeled as a separate item in the sys-
tem. All data must be entered for each component, including maintenance
data.
66
5.1.3 Uncertainty in the computations
When simulation is used, the uncertainty in both MIRIAM and Relex will
primarily depend on the number of simulations done, in addition to the
simplications made when implementing the model. One simulation run
does however not necessarily mean the same in MIRIAM and in Relex. In
MIRIAM, a simulation run consists of simulating the entire period given in
the simulation parameters, regardless of the performance of the system. In
Relex, it depends on which module is applied. In OpSim, the simulation
methodology resembles that of MIRIAM, in the RBD module, a simulation
iteration will end when the system as a whole has failed, according to the
help section of the program [2]. The computation of condence intervals in
Relex is not well documented except in a few special cases. It must be as-
sumed that valid statistical methods are employed also when documentation
is lacking.
In Relex, the output consists only of the end results with condence
intervals, but it is possible to have results computed at many points of the
simulated time. In MIRIAM, one can obtain some results from each simula-
tion iteration, but only mean measures over the entire simulated time. Stan-
dard deviation is computed for the deliverability, but condence intervals
must be calculated from this. One can also take advantage of availability
of the results from each simulation iteration to compute variance.
Relex also provides the possibility of applying analytical computations
when the system makes this possible. This is an advantage, as analytical
computations do not depend on the number of simulations, which is often,
at least for complex systems, decided at least in part by the time available.
However, for complex systems analytical calculations are often unfeasible.
Whether analytical calculation is possible also depends on which calcula-
tions are to be done. Simulation can be forced even in the cases where the
software would choose analytical computation.
When it comes to the model assumptions, the programs are quite sim-
ilar. By default, it is in both programs assumed that the components are
independent. In MIRIAM there are several ways of modeling dependencies
between components, but these will rapidly increase the simulation time.
In Relex, there are no such features. This is an advantage for MIRIAM. In
many ways, MIRIAM is more exible when it comes to assumptions on the
ow system modeled, while Relex is more exible when modeling repair and
maintenance. Again, the choice of software must depend on which features
are considered most important. Also for repair and maintenance modeling,
Relex and MIRIAM oers many similar features, but all in all there are
more pos
All in all, which software is preferable depends on the computations
to be done and which system features are considered most important. If
67
system ow and production availability is the important issue, MIRIAM
Regina would be the best software. If measures like cost and general system
availability are the important points, Relex oers more opportunities.
68
stresses from their environment.
When it comes to eort of calculation, the easiest alternatives in most
cases will be the ordinary calculation of availability, or the method from
[9] for computation of production availability. Computation of produc-
tion availability by employing the availability calculations done with other
methods is less accurate and more complicated. If the model assumptions
indicate that other methods should be used for availability calculations, the
amount of computation is not infeasible. For large systems, any analytical
computation would be time consuming.
The advantage of using analytical calculations over simulation is that the
accuracy of the results does not depend on the available time for simulation.
However, as the assumptions that have to be made about the system when
analytical calculations are to be done can be very limiting, the divergence
between the model and the real world system may become large.
Table 5.1: Comparision of the results for availability from the dierent
methods
The dierent computed values for the system availability are given in ta-
ble 5.1. All the values for computed availability are the same when rounded
to three decimal places, A = 0.979. The only exception is the quasi-renewal.
This was expected, as this measure assumes that the system is deteriorating
with each failure, while all the others assume a repair restores the system
to good as new status.
The condence intervals computed are based on dierent concepts. The
condence interval for the simulated availability from Relex is computed
from the dierent results obtained in the simulation iterations. The con-
dence intervals are based on the uncertainty in the input parameters. What
is shown by these condence intervals is that for renewal and quasi renewal
calculations, the uncertainty in the input parameters has much larger in-
uence than for the other methods. For the ordinary method of analytical
69
computation of availability, the condence interwall becomes very narrow,
which implies that uncertainty in the input parameters does not aect the
result very much.
The values for production availability are given in table 5.2. These mea-
sures vary somewhat more than the values for the availability. As MIRIAM
Regina and the method presented in section 3.3 are the only methods specif-
ically meant for computing production availability, these measures are con-
sidered to be the most accurate. They are also very similar. When rounded
to two decimal places, the results are the same, Aprod = 96.95. The methods
based on the availability for the single components underestimate this mea-
sure somewhat, and the method based on the simulated availability results
overestimate it slightly. Again, quasi-renewal gives a lower measure than
the other methods, which is to be expected. It is worth notice that the ca-
pacity calculation done by Relex underestimates the production availability,
the only method giving a smaller production availability is the calculations
based on the quasi renewal process. This may be due to the way in which
the ow is delegated through ow networks in the computations, see section
4.1.2.
The only method for which condence intervals can be computed for
production availability is the simulation done by MIRIAM. This is due to
the manner in which the computations are done in the other cases. In
Relex, one would assume that it would be possible to estimate a condence
interval for the capacity calculation if methods for such were implemented.
As they are not, and the results for the dierent simulation iterations are
not available in the software program, this cannot be done.
Propagation of uncertainty if also dicult for the analytical method
used for computing production availability. For the availability measures
employed to compute the probability distributions they might have been
calculated, but to propagate these through the capacity calculations is not
trivial.
It is worth noting that for all the analytical methods except quasi re-
newal, the estimated production availability lies within the condence in-
terval for the production availability computed by MIRIAM.
The system considered in this thesis was a very simple one. Due to
this even crude methods must be expected to give relatively good results.
For a more complicated system, the amount of work involved in computing
the production availability from methods meant to compute availability will
quickly become unmanagable. The results will probably deteriorate as the
system compexity grows, as there will be more inaccuracies.
The dierence between the methods for computation of production avail-
ability is the manner of computing the probability distribution for the ow.
Markov chains give, as stated in [9], opportunity for implementing many
features, by making the Markov chains more complex. The limitations of
70
Method Production availability Condence interval
Relex simulated 97.0233 -
Relex capacity 96.14075 -
MIRIAM 96.95 [0.96912, 0.96982]
Prod. av from av 96.935 -
Prod. av 96.94533 -
Renewal 96.9343 -
Quasi-renewal 95.3868 -
Table 5.2: Comparision of the results for production availability from the
dierent methods
this method are primarily due to the available time and resources for the
calculation. For the other methods, many assumptions must be done about
the model. This limits the accuracy of the results, as in many cases the
assumptions will not fully comply with the conditions in the real world
system.
71
72
Chapter 6
Conclusion
The concern of this thesis has been to present and compare dierent meth-
ods for computation of availability and production availability for a pro-
duction system assumed to handle a ow of some kind of uid. The same
methods might be applicable if the ow instead consists of some manner of
items being handled, but this must be considered in each individual case.
The thesis presents two software programs for reliability computations,
MIRIAM Regina, which is focused on computation of production availabil-
ity, and the RBD and OpSim modules of Relex Reliability Studio, which
among other things can be used to compute the availability of a system. In
addition, several dierent analytical methods for computation of availability
and production availability have been presented.
It has been shown that the results computed for a simple example are
quite similar for many of the methods presented. As the example is simple,
it is to be expected that even crude methods can obtain good results. For
the example used throughout this thesis, it is worth noting that many of
the results computed are within the condence intervals for the relevant
measure computed by simulation done by the softwares used in the thesis.
This supports the quality of the computed measures.
For availability computation, many methods can be employed, even
methods primarily made for computing production availability if they give
the amount of time with no ow through the system. This measure will be
comparable to the availability except in cases where availability computa-
tions are done in a manner so that ow can pass through the system when
the system is considered to be failed.
Availability is computed for the example system using several dierent
methods. As the example system is very simple, the assumptions implied
by the methods t the system well for all the methods in this thesis. For
the quasi renewal process, dierent assumptions must be made regarding
distribution of time to failure and repair, but otherwise the assumptions
are the same. The quasi renewal process is made to do computations on
73
systems with an increasing or decreasing failure or repair rate. The results
obtained from the quasi renewal process is thus quite dierent from the
results obtained with other methods.
The main source of error when modeling a system, is the assumptions
made during modeling. It is important when computations are made, and
especially when results are interpreted, that the assumptions made are kept
clear in mind. Calculations will only be accurate to the extent that the
assumptions match the real world situation for the system. This is espe-
cially important to remember when results are computed with the help of
software programs, where assumptions may not be clearly available unless
the documentation for the software is considered in detail. It is thus impor-
tant when using software programs, and especially when a person employs
a software program for the rst time, that time and eort is put into nding
the assumptions made by the software.
The same caution must be employed when using analytical methods.
The assumptions made are often more limiting for analytical computations
than for simulation.
Other than the assumptions regarding the model, the main sources of
uncertainty is uncertainty in the input data, and for simulation, the num-
ber of simulation iterations done. For some systems, the uncertainty due to
propagation of error through the method can be estimated and a condence
interval made based on these computations. In a simulation setting, con-
dence intervals will be made based on the results obtained in the dierent
simulation iterations. When doing simulation, propagation of uncertainty in
the input data comes in addition to the uncertainty due to simulation, and
methods exist for estimating the amount of uncertainty. However, these
methods must be implemented in the software used prior to simulation.
Otherwise, propagation of uncertainty cannot be estimated. As such meth-
ods are not implemented in MIRIAM or Relex, it is not possible to estimate
the propagation of uncertainty in the simulations done with these softwares.
To conclude, when doing availability or production availability calcula-
tions, it is important to choose a method which enables the model to be as
close to the real life system as possible. In most cases, this concern must be
weighed against the available time and resources for the computations. The
best choice of method will be the method which gives the most accuracy
within these bounds. In many cases, especially for complex systems, this
will be simulation. In many cases, analytical methods for availability calcu-
lation can be used to nd a quick but crude estimate of the availability of a
system. More complex analytical computations will primarily be of interest
in situations where the alternative is to implement the software for simu-
lation. If a software for sophisticated simulation of availability measures,
analytical calculations will often be more time consuming than simulation,
as both simulation and analytical calculations require a thorough under-
74
standing of the system on which calculations are to be done.
75
76
Bibliography
[1] Analytical methods versus simulation. Can be found at
http://www.relex.com/resources/art/art_rbd2.asp, last visited febru-
ary 22 2008.
[2] Help section of Relex Reliability Studio 2007. Help section of Relex
Reliability Studio 2007, included in the software.
[3] Miriam Regina User Manual , september 2006. Issue 2.0. Miriam Regina
version 1.03-build 209.
[4] Harold Ascher and Harry Feingold. Repairable systems reliability. Mod-
[5] Aron Brall, William Hagen, and Hung Tran. Reliability block diagram
modeling - comparisons of three software packages. Reliability and
Maintainability Symposium, 2007. RAMS '07. Annual, pages 119124,
2007.
[7] S. Dharmaraja and S.V. Amari. A method for exact MTBF evaluation
of repairable systems. Proceedings of the tenth international conference
on reliability and quality in design, 2004.
77
[10] M. Granger Morgan, Max Henrion, and Mitchell Small. Uncertainty. A
[11] Marvin Rausand and Arnljot Høyland. System Reliability Theory. Mod-
els, Statistical Methods, and Applications. Wiley Series in Probability
[14] Hongzhou Wang and Hoang Pham. A quasi renewal process and its
applications in imperfect maintenance. International journal of systems
science, 27(10):10551062, 1996.
[15] Hongzhou Wang and Hoang Pham. Reliability and Optimal Mainte-
nance. Springer Series in Reliability Engineering. Springer, 2006.
78