Life Data Analysis Reference
ReliaSoft Corporation
Worldwide Headquarters
1450 South Eastside Loop
Tucson, Arizona 85710-6703, USA
http://www.ReliaSoft.com
Notice of Rights: The content is the Property and Copyright of ReliaSoft Corporation, Tucson,
Arizona, USA. It is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0
International License. See the next pages for a complete legal description of the license or go to
http://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.
Generation Date: This document was generated on May 22, 2015 based on the current state of the
online reference book posted on ReliaWiki.org. Information in this document is subject to change
without notice and does not represent a commitment on the part of ReliaSoft Corporation. The
content in the online reference book posted on ReliaWiki.org may be more up-to-date.
Disclaimer: Companies, names and data used herein are fictitious unless otherwise noted. This
documentation and ReliaSoft’s software tools were developed at private expense; no portion was
developed with U.S. government funds.
Trademarks: ReliaSoft, Synthesis Platform, Weibull++, ALTA, DOE++, RGA, BlockSim, RENO, Lambda
Predict, Xfmea, RCM++ and XFRACAS are trademarks of ReliaSoft Corporation.
Other product names and services identified in this document are trademarks of their respective
trademark holders, and are used for illustration purposes. Their use in no way conveys endorsement
or other affiliation with ReliaSoft Corporation.
Attribution-NonCommercial-ShareAlike 4.0 International
License Agreement
By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms
and conditions of this Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
Public License ("Public License"). To the extent this Public License may be interpreted as a contract,
You are granted the Licensed Rights in consideration of Your acceptance of these terms and
conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives
from making the Licensed Material available under these terms and conditions.
Section 1 – Definitions.
a. Adapted Material means material subject to Copyright and Similar Rights that is derived
from or based upon the Licensed Material and in which the Licensed Material is translated,
altered, arranged, transformed, or otherwise modified in a manner requiring permission
under the Copyright and Similar Rights held by the Licensor. For purposes of this Public
License, where the Licensed Material is a musical work, performance, or sound recording,
Adapted Material is always produced where the Licensed Material is synched in timed
relation with a moving image.
b. Adapter's License means the license You apply to Your Copyright and Similar Rights in Your
contributions to Adapted Material in accordance with the terms and conditions of this Public
License.
c. BY-NC-SA Compatible License means a license listed at
creativecommons.org/compatiblelicenses, approved by Creative Commons as essentially the
equivalent of this Public License.
d. Copyright and Similar Rights means copyright and/or similar rights closely related to
copyright including, without limitation, performance, broadcast, sound recording, and Sui
Generis Database Rights, without regard to how the rights are labeled or categorized. For
purposes of this Public License, the rights specified in Section 2(b)(1)-(2) are not Copyright
and Similar Rights.
e. Effective Technological Measures means those measures that, in the absence of proper
authority, may not be circumvented under laws fulfilling obligations under Article 11 of the
WIPO Copyright Treaty adopted on December 20, 1996, and/or similar international
agreements.
f. Exceptions and Limitations means fair use, fair dealing, and/or any other exception or
limitation to Copyright and Similar Rights that applies to Your use of the Licensed Material.
g. License Elements means the license attributes listed in the name of a Creative Commons
Public License. The License Elements of this Public License are Attribution, NonCommercial,
and ShareAlike.
h. Licensed Material means the artistic or literary work, database, or other material to which
the Licensor applied this Public License.
i. Licensed Rights means the rights granted to You subject to the terms and conditions of this
Public License, which are limited to all Copyright and Similar Rights that apply to Your use of
the Licensed Material and that the Licensor has authority to license.
j. Licensor means ReliaSoft Corporation, 1450 Eastside Loop, Tucson, AZ 85710.
k. NonCommercial means not primarily intended for or directed towards commercial
advantage or monetary compensation. For purposes of this Public License, the exchange of
the Licensed Material for other material subject to Copyright and Similar Rights by digital
file-sharing or similar means is NonCommercial provided there is no payment of monetary
compensation in connection with the exchange.
l. Share means to provide material to the public by any means or process that requires
permission under the Licensed Rights, such as reproduction, public display, public
performance, distribution, dissemination, communication, or importation, and to make
material available to the public including in ways that members of the public may access the
material from a place and at a time individually chosen by them.
m. Sui Generis Database Rights means rights other than copyright resulting from Directive
96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, as amended and/or succeeded, as well as other essentially
equivalent rights anywhere in the world.
n. You means the individual or entity exercising the Licensed Rights under this Public License.
Your has a corresponding meaning.
Section 2 – Scope.
a. License grant.
1. Subject to the terms and conditions of this Public License, the Licensor hereby grants
You a worldwide, royalty-free, non-sublicensable, non-exclusive, irrevocable license
to exercise the Licensed Rights in the Licensed Material to:
A. reproduce and Share the Licensed Material, in whole or in part, for
NonCommercial purposes only; and
B. produce, reproduce, and Share Adapted Material for NonCommercial
purposes only.
2. Exceptions and Limitations. For the avoidance of doubt, where Exceptions and
Limitations apply to Your use, this Public License does not apply, and You do not
need to comply with its terms and conditions.
3. Term. The term of this Public License is specified in Section 6(a).
4. Media and formats; technical modifications allowed. The Licensor authorizes You to
exercise the Licensed Rights in all media and formats whether now known or
hereafter created, and to make technical modifications necessary to do so. The
Licensor waives and/or agrees not to assert any right or authority to forbid You from
making technical modifications necessary to exercise the Licensed Rights, including
technical modifications necessary to circumvent Effective Technological Measures.
For purposes of this Public License, simply making modifications authorized by this
Section 2(a)(4) never produces Adapted Material.
5. Downstream recipients.
A. Offer from the Licensor – Licensed Material. Every recipient of the Licensed
Material automatically receives an offer from the Licensor to exercise the
Licensed Rights under the terms and conditions of this Public License.
B. Additional offer from the Licensor – Adapted Material. Every recipient of
Adapted Material from You automatically receives an offer from the
Licensor to exercise the Licensed Rights in the Adapted Material under the
conditions of the Adapter’s License You apply.
C. No downstream restrictions. You may not offer or impose any additional or
different terms or conditions on, or apply any Effective Technological
Measures to, the Licensed Material if doing so restricts exercise of the
Licensed Rights by any recipient of the Licensed Material.
6. No endorsement. Nothing in this Public License constitutes or may be construed as
permission to assert or imply that You are, or that Your use of the Licensed Material
is, connected with, or sponsored, endorsed, or granted official status by, the
Licensor or others designated to receive attribution as provided in Section
3(a)(1)(A)(i).
b. Other rights.
1. Moral rights, such as the right of integrity, are not licensed under this Public License,
nor are publicity, privacy, and/or other similar personality rights; however, to the
extent possible, the Licensor waives and/or agrees not to assert any such rights held
by the Licensor to the limited extent necessary to allow You to exercise the Licensed
Rights, but not otherwise.
2. Patent and trademark rights are not licensed under this Public License.
3. To the extent possible, the Licensor waives any right to collect royalties from You for
the exercise of the Licensed Rights, whether directly or through a collecting society
under any voluntary or waivable statutory or compulsory licensing scheme. In all
other cases the Licensor expressly reserves any right to collect such royalties,
including when the Licensed Material is used other than for NonCommercial
purposes.
Section 3 – License Conditions.
Your exercise of the Licensed Rights is expressly made subject to the following conditions.
a. Attribution.
1. If You Share the Licensed Material (including in modified form), You must:
A. retain the following if it is supplied by the Licensor with the Licensed
Material:
i. identification of the creator(s) of the Licensed Material and any
others designated to receive attribution, in any reasonable manner
requested by the Licensor (including by pseudonym if designated);
ii. a copyright notice;
iii. a notice that refers to this Public License;
iv. a notice that refers to the disclaimer of warranties;
v. a URI or hyperlink to the Licensed Material to the extent reasonably
practicable;
B. indicate if You modified the Licensed Material and retain an indication of
any previous modifications; and
C. indicate the Licensed Material is licensed under this Public License, and
include the text of, or the URI or hyperlink to, this Public License.
2. You may satisfy the conditions in Section 3(a)(1) in any reasonable manner based on
the medium, means, and context in which You Share the Licensed Material. For
example, it may be reasonable to satisfy the conditions by providing a URI or
hyperlink to a resource that includes the required information.
3. If requested by the Licensor, You must remove any of the information required by
Section 3(a)(1)(A) to the extent reasonably practicable.
b. ShareAlike.
In addition to the conditions in Section 3(a), if You Share Adapted Material You produce, the
following conditions also apply.
1. The Adapter’s License You apply must be a Creative Commons license with the same
License Elements, this version or later, or a BY-NC-SA Compatible License.
2. You must include the text of, or the URI or hyperlink to, the Adapter's License You
apply. You may satisfy this condition in any reasonable manner based on the
medium, means, and context in which You Share Adapted Material.
3. You may not offer or impose any additional or different terms or conditions on, or
apply any Effective Technological Measures to, Adapted Material that restrict
exercise of the rights granted under the Adapter's License You apply.
Section 4 – Sui Generis Database Rights.
Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of the
Licensed Material:
a. for the avoidance of doubt, Section 2(a)(1) grants You the right to extract, reuse, reproduce,
and Share all or a substantial portion of the contents of the database for NonCommercial
purposes only;
b. if You include all or a substantial portion of the database contents in a database in which You
have Sui Generis Database Rights, then the database in which You have Sui Generis Database
Rights (but not its individual contents) is Adapted Material, including for purposes of Section
3(b); and
c. You must comply with the conditions in Section 3(a) if You Share all or a substantial portion
of the contents of the database.
For the avoidance of doubt, this Section 4 supplements and does not replace Your obligations under
this Public License where the Licensed Rights include other Copyright and Similar Rights.
Section 5 – Disclaimer of Warranties and Limitation of Liability.
a. Unless otherwise separately undertaken by the Licensor, to the extent possible, the
Licensor offers the Licensed Material as-is and as-available, and makes no representations
or warranties of any kind concerning the Licensed Material, whether express, implied,
statutory, or other. This includes, without limitation, warranties of title, merchantability,
fitness for a particular purpose, non-infringement, absence of latent or other defects,
accuracy, or the presence or absence of errors, whether or not known or discoverable.
Where disclaimers of warranties are not allowed in full or in part, this disclaimer may not
apply to You.
b. To the extent possible, in no event will the Licensor be liable to You on any legal theory
(including, without limitation, negligence) or otherwise for any direct, special, indirect,
incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or
damages arising out of this Public License or use of the Licensed Material, even if the
Licensor has been advised of the possibility of such losses, costs, expenses, or damages.
Where a limitation of liability is not allowed in full or in part, this limitation may not apply
to You.
c. The disclaimer of warranties and limitation of liability provided above shall be interpreted in
a manner that, to the extent possible, most closely approximates an absolute disclaimer and
waiver of all liability.
Section 6 – Term and Termination.
a. This Public License applies for the term of the Copyright and Similar Rights licensed here.
However, if You fail to comply with this Public License, then Your rights under this Public
License terminate automatically.
b. Where Your right to use the Licensed Material has terminated under Section 6(a), it
reinstates:
1. automatically as of the date the violation is cured, provided it is cured within 30 days
of Your discovery of the violation; or
2. upon express reinstatement by the Licensor.
For the avoidance of doubt, this Section 6(b) does not affect any right the Licensor may have
to seek remedies for Your violations of this Public License.
c. For the avoidance of doubt, the Licensor may also offer the Licensed Material under
separate terms or conditions or stop distributing the Licensed Material at any time;
however, doing so will not terminate this Public License.
d. Sections 1, 5, 6, 7, and 8 survive termination of this Public License.
Section 7 – Other Terms and Conditions.
a. The Licensor shall not be bound by any additional or different terms or conditions
communicated by You unless expressly agreed.
b. Any arrangements, understandings, or agreements regarding the Licensed Material not
stated herein are separate from and independent of the terms and conditions of this Public
License.
Section 8 – Interpretation.
a. For the avoidance of doubt, this Public License does not, and shall not be interpreted to,
reduce, limit, restrict, or impose conditions on any use of the Licensed Material that could
lawfully be made without permission under this Public License.
b. To the extent possible, if any provision of this Public License is deemed unenforceable, it
shall be automatically reformed to the minimum extent necessary to make it enforceable. If
the provision cannot be reformed, it shall be severed from this Public License without
affecting the enforceability of the remaining terms and conditions.
c. No term or condition of this Public License will be waived and no failure to comply
consented to unless expressly agreed to by the Licensor.
d. Nothing in this Public License constitutes or may be interpreted as a limitation upon,
or waiver of, any privileges and immunities that apply to the Licensor or You,
including from the legal processes of any jurisdiction or authority.
Contents
Chapter 1: Introduction to Life Data Analysis
Chapter 2: Basic Statistical Background
Chapter 3: Life Distributions
Chapter 4: Parameter Estimation
Chapter 5: Life Data Classification
Chapter 6: Confidence Bounds
Chapter 7: The Exponential Distribution
Chapter 8: The Weibull Distribution
Chapter 9: The Normal Distribution
Chapter 10: The Lognormal Distribution
Chapter 11: The Mixed Weibull Distribution
Chapter 12: The Generalized Gamma Distribution
Chapter 13: The Gamma Distribution
Chapter 14: The Logistic Distribution
Chapter 15: The Loglogistic Distribution
Chapter 16: The Gumbel/SEV Distribution
Chapter 17: Non-Parametric Life Data Analysis
Chapter 18: Competing Failure Modes Analysis
Chapter 19: Warranty Data Analysis
Chapter 20: Recurrent Event Data Analysis
Chapter 21: Degradation Data Analysis
Chapter 22: Reliability Test Design
Chapter 23: Stress-Strength Analysis
Comparing Life Data Sets
Risk Analysis and Probabilistic Design with Monte Carlo Simulation
Weibull++ SimuMatic
Target Reliability Tool
Event Log Data Analysis
Appendices
Least Squares/Rank Regression Equations
Appendix: Maximum Likelihood Estimation Example
Appendix: Special Analysis Methods
Appendix: Log-Likelihood Equations
Appendix: Life Data Analysis References
Chapter 1
Introduction to Life Data Analysis

The pdf of the 3-parameter Weibull distribution, for example, is given by:

$$f(t) = \frac{\beta}{\eta}\left(\frac{t - \gamma}{\eta}\right)^{\beta - 1} e^{-\left(\frac{t - \gamma}{\eta}\right)^{\beta}}$$

where:

$$f(t) \geq 0,\quad t \geq \gamma,\quad \beta > 0,\quad \eta > 0,\quad -\infty < \gamma < +\infty$$

and:

$\eta$ = scale parameter, or characteristic life
$\beta$ = shape parameter (or slope)
$\gamma$ = location parameter (or failure free life)
Some distributions, such as the Weibull and lognormal, tend to better represent life data and are commonly called
"lifetime distributions" or "life distributions." In fact, life data analysis is sometimes called "Weibull analysis"
because the Weibull distribution, formulated by Professor Waloddi Weibull, is a popular distribution for analyzing
life data. The Weibull model can be applied in a variety of forms (including 1-parameter, 2-parameter, 3-parameter
or mixed Weibull). Other commonly used life distributions include the exponential, lognormal and normal
distributions. The analyst chooses the life distribution that is most appropriate to model each particular data set based
on past experience and goodness-of-fit tests.
Parameter Estimation
In order to fit a statistical model to a life data set, the analyst estimates the parameters of the life distribution that will
make the function most closely fit the data. The parameters control the scale, shape and location of the pdf function.
For example, in the 3-parameter Weibull model (shown above), the scale parameter, $\eta$, defines where the bulk of the
distribution lies. The shape parameter, $\beta$, defines the shape of the distribution and the location parameter, $\gamma$,
defines the location of the distribution in time.
Several methods have been devised to estimate the parameters that will fit a lifetime distribution to a particular data
set. Some available parameter estimation methods include probability plotting, rank regression on x (RRX), rank
regression on y (RRY) and maximum likelihood estimation (MLE). The appropriate analysis method will vary
depending on the data set and, in some cases, on the life distribution selected.
Calculated Results and Plots
Once the parameters of a life distribution have been estimated, a variety of results can be calculated and plotted, including the following (a short code sketch computing several of these follows the list):
• Failure Rate: The number of failures per unit time that can be expected to occur for the product.
• Warranty Time: The estimated time when the reliability will be equal to a specified goal. For example, the
estimated time of operation is 4 years for a reliability of 90%.
• B(X) Life: The estimated time when the probability of failure will reach a specified point (X%). For example, if
10% of the products are expected to fail by 4 years of operation, then the B(10) life is 4 years. (Note that this is
equivalent to a warranty time of 4 years for a 90% reliability.)
• Probability Plot: A plot of the probability of failure over time. (Note that probability plots are based on the
linearization of a specific distribution. Consequently, the form of a probability plot for one distribution will be
different than the form for another. For example, an exponential distribution probability plot has different axes
than those of a normal distribution probability plot.)
• Reliability vs. Time Plot: A plot of the reliability over time.
• pdf Plot: A plot of the probability density function (pdf).
• Failure Rate vs. Time Plot: A plot of the failure rate over time.
• Contour Plot: A graphical representation of the possible solutions to the likelihood ratio equation. This is
employed to make comparisons between two different data sets.
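As a concrete illustration, here is a minimal Python sketch (not Weibull++ code) that computes the reliability, failure rate and B(X) life from a fitted 2-parameter Weibull model. The parameter values beta = 1.5 and eta = 1,000 hours, and the 500-hour mission time, are hypothetical placeholders rather than values from any example in this reference.

import math

beta, eta = 1.5, 1000.0  # hypothetical shape and scale estimates

def reliability(t):
    # R(t) = exp(-(t/eta)^beta)
    return math.exp(-((t / eta) ** beta))

def failure_rate(t):
    # lambda(t) = f(t)/R(t) = (beta/eta) * (t/eta)^(beta - 1)
    return (beta / eta) * (t / eta) ** (beta - 1)

def bx_life(x):
    # B(X) life: solve Q(t) = x/100 for t, giving t = eta * (-ln(1 - x/100))^(1/beta)
    return eta * (-math.log(1.0 - x / 100.0)) ** (1.0 / beta)

print(reliability(500))   # probability of completing a 500-hour mission
print(failure_rate(500))  # failures per hour expected at t = 500
print(bx_life(10))        # B(10) life: time by which 10% of units are expected to fail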
Confidence Bounds
Because life data analysis results are estimates based on the observed lifetimes of a sampling of units, there is
uncertainty in the results due to the limited sample sizes. "Confidence bounds" (also called "confidence intervals")
are used to quantify this uncertainty due to sampling error by expressing the confidence that a specific interval
contains the quantity of interest. Whether or not a specific interval contains the quantity of interest is unknown.
Confidence bounds can be expressed as two-sided or one-sided. Two-sided bounds are used to indicate that the
quantity of interest is contained within the bounds with a specific confidence. One-sided bounds are used to indicate
that the quantity of interest is above the lower bound or below the upper bound with a specific confidence. The
appropriate type of bounds depends on the application. For example, the analyst would use a one-sided lower bound
on reliability, a one-sided upper bound for percent failing under warranty and two-sided bounds on the parameters of
the distribution. (Note that one-sided and two-sided bounds are related. For example, the 90% lower two-sided
bound is the 95% lower one-sided bound and the 90% upper two-sided bound is the 95% upper one-sided bound.)
Reliability Engineering
Since the beginning of history, humanity has attempted to predict the future. Watching the flight of birds, the
movement of leaves on the trees and other such practices were among the methods used. Fortunately, today's
engineers do not have to depend on Pythia or a crystal ball in order to predict the future of their products. Through
the use of life data analysis, reliability engineers use product life data to determine the probability and capability of
parts, components, and systems to perform their required functions for desired periods of time without failure, in
specified environments.
Life data can be lifetimes of products in the marketplace, such as the time the product operated successfully or the
time the product operated before it failed. These lifetimes can be measured in hours, miles, cycles-to-failure, stress
cycles or any other metric with which the life or exposure of a product can be measured. All such data of product
lifetimes can be encompassed in the term life data or, more specifically, product life data. The subsequent analysis
and prediction are described as life data analysis. For the purpose of this reference, we will limit our examples and
discussions to lifetimes of inanimate objects, such as equipment, components and systems as they apply to reliability
engineering. Before performing life data analysis, the failure mode and the life units (hours, cycles, miles, etc.) must
be specified and clearly defined. Further, it is essential to define exactly what constitutes a failure. In other
words, before performing the analysis it must be clear when the product is considered to have actually failed. This
may seem rather obvious, but it is not uncommon for problems with failure definitions or time unit discrepancies to
completely invalidate the results of expensive and time consuming life testing and analysis.
Estimation
In life data analysis and reliability engineering, the output of the analysis is always an estimate. The true value of the
probability of failure, the probability of success (or reliability), the mean life, the parameters of a distribution or any
other applicable parameter is never known, and will almost certainly remain unknown to us for all practical
purposes. Granted, once a product is no longer manufactured and all units that were ever produced have failed and
all of that data has been collected and analyzed, one could claim to have learned the true value of the reliability of
the product. Obviously, this is not a common occurrence. The objective of reliability engineering and life data
analysis is to accurately estimate these true values. For example, let's assume that our job is to estimate the number
of black marbles in a giant swimming pool filled with black and white marbles. One method is to pick out a small
sample of marbles and count the black ones. Suppose we picked out ten marbles and counted four black marbles.
Based on this sampling, the estimate would be that 40% of the marbles are black. If we put the ten marbles back in
the pool and repeated this step again, we might get five black marbles, changing the estimate to 50% black marbles.
The range of our estimate for the percentage of black marbles in the pool is 40% to 50%. If we now repeat the
experiment and pick out 1,000 marbles, we might get results for the number of black marbles such as 445 and 495
black marbles for each trial. In this case, we note that our estimate for the percentage of black marbles has a
narrower range, or 44.5% to 49.5%. Using this, we can see that the larger the sample size, the narrower the estimate
range and, presumably, the closer the estimate range is to the true value.
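The following Python sketch reproduces this thought experiment. The true proportion of black marbles (47%) is a hypothetical value chosen for illustration; the point is only that estimates from larger samples cluster more tightly around the true value.

import random

random.seed(1)
TRUE_FRACTION = 0.47  # hypothetical true proportion of black marbles in the pool

def estimate(sample_size):
    # Draw one sample and return the observed proportion of black marbles
    draws = [random.random() < TRUE_FRACTION for _ in range(sample_size)]
    return sum(draws) / sample_size

for n in (10, 1000, 100000):
    # Repeat the sampling experiment five times at each sample size
    estimates = [round(estimate(n), 3) for _ in range(5)]
    print(n, estimates)  # the spread of estimates narrows as n grows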
A Formal Definition
Reliability engineering provides the theoretical and practical tools whereby the probability and capability of parts,
components, equipment, products and systems to perform their required functions for desired periods of time without
failure, in specified environments and with a desired confidence, can be specified, designed in, predicted, tested and
demonstrated, as discussed in Kececioglu [19].
Through proper testing and analysis in the in-house testing labs, as well as collection of adequate and meaningful
data on a product's performance in the field, the reliability of any product can be measured, tracked and improved,
leading to a balanced organization with a financially healthy outlook for the future.
Reliability engineering covers all aspects of a product's life, from its conception, subsequent design and production
processes, through its practical use lifetime, with maintenance support and availability. Reliability engineering
covers:
1. Reliability
2. Maintainability
3. Availability
All three of these areas can be numerically quantified with the use of reliability engineering principles and life data
analysis. And the combination of these three areas introduces a new term, as defined in ISO-9000-4,
"Dependability."
Many products exhibit the characteristic "bathtub curve" failure rate behavior. The bathtub curve is plotted with the product life on the x-axis and with the failure rate on the y-axis. The life can be in
minutes, hours, years, cycles, actuations or any other quantifiable unit of time or use. The failure rate is given as
failures among surviving units per time unit. As can be seen from this plot, many products will begin their lives with
a higher failure rate (which can be due to manufacturing defects, poor workmanship, poor quality control of
incoming parts, etc.) and exhibit a decreasing failure rate. The failure rate then usually stabilizes to an approximately
constant rate in the useful life region, where the failures observed are chance failures. As the products experience
more use and wear, the failure rate begins to rise as the population begins to experience failures related to wear-out.
In the case of human mortality, the mortality rate (failure rate) is higher during the first year or so of life, then drops
to a low constant level during our teens and early adult life and then rises as we progress in years.
Burn-In
Looking at this particular bathtub curve, it should be fairly obvious that it would be best to ship a product at the
beginning of the useful life region, rather than right off the production line; thus preventing the customer from
experiencing early failures. This practice is what is commonly referred to as "burn-in", and is frequently performed
for electronic components. The determination of the correct burn-in time requires the use of reliability
methodologies, as well as optimization of costs involved (i.e., costs of early failures vs. the cost of burn-in), to
determine the optimum failure rate at shipment.
If the producer increases the reliability of his product, he will increase the cost of the design and/or production of the
product. However, a low production and design cost does not imply a low overall product cost. The overall product
cost should not be calculated as merely the cost of the product when it leaves the shipping dock, but as the total cost
of the product through its lifetime. This includes warranty and replacement costs for defective products, costs
incurred by loss of customers due to defective products, loss of subsequent sales, etc. By increasing product
reliability, one may increase the initial product costs, but decrease the support costs. An optimum minimal total
product cost can be determined and implemented by calculating the optimum reliability for such a product. The
figure depicts such a scenario. The total product cost is the sum of the production and design costs as well as the
other post-shipment costs. It can be seen that at an optimum reliability level, the total product cost is at a minimum.
The "optimum reliability level" is the one that coincides with the minimum total cost over the entire lifetime of the
product.
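A minimal numeric sketch of this trade-off is shown below. Both cost functions are hypothetical stand-ins: the production/design cost rises steeply as reliability approaches 100%, while the post-shipment (warranty and support) cost falls with reliability; the optimum reliability is simply wherever their sum is smallest.

import numpy as np

R = np.linspace(0.50, 0.999, 500)      # candidate reliability levels
production_cost = 100.0 / (1.0 - R)    # hypothetical: grows without bound as R -> 1
support_cost = 20000.0 * (1.0 - R)     # hypothetical: warranty/replacement cost falls with R
total_cost = production_cost + support_cost

best = R[np.argmin(total_cost)]        # reliability level at the minimum total cost
print(f"optimum reliability ~ {best:.3f}, minimum total cost ~ {total_cost.min():.0f}")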
11. Guidance regarding corrective action decisions to minimize failures and reduce maintenance and repair times,
which will eliminate overdesign as well as underdesign.
12. Help provide guidelines for quality control practices.
13. Optimization of the reliability goal that should be designed into products and systems for minimum total cost to
own, operate and maintain for their lifetime.
14. The ability to conduct trade-off studies among parameters such as reliability, maintainability, availability, cost,
weight, volume, operability, serviceability and safety to obtain the optimum design.
15. Reduction of warranty costs or, for the same cost, increase in the length and the coverage of warranty.
16. Establishment of guidelines for evaluating suppliers from the point of view of their product reliability.
17. Promotion of sales on the basis of reliability indexes and metrics through sales and marketing departments.
18. Increase of customer satisfaction and an increase of sales as a result of customer satisfaction.
19. Increase of profits or, for the same profit, provision of even more reliable products and systems.
20. Promotion of positive image and company reputation.
Chapter 2
Basic Statistical Background
Random Variables
In general, most problems in reliability engineering deal with quantitative measures, such as the time-to-failure of a
component, or qualitative measures, such as whether a component is defective or non-defective. We can then use a
random variable to denote these possible measures.
In the case of times-to-failure, our random variable $X$ is the time-to-failure of the component and can take on an
infinite number of possible values in a range from 0 to infinity (since we do not know the exact time a priori). Our
component can be found failed at any time after time 0 (e.g., at 12 hours or at 100 hours and so forth), thus $X$ can
take on any value in this range. In this case, our random variable $X$ is said to be a continuous random variable. In
this reference, we will deal almost exclusively with continuous random variables.
In judging a component to be defective or non-defective, only two outcomes are possible. That is, $X$ is a random
variable that can take on one of only two values (let's say defective = 0 and non-defective = 1). In this case, the
variable $X$ is said to be a discrete random variable.
The Probability Density Function and the Cumulative Distribution Function
If $X$ is a continuous random variable, then the pdf of $X$ is a function, $f(x)$, such that for any two numbers, $a$ and $b$ with $a \leq b$:

$$P(a \leq X \leq b) = \int_{a}^{b} f(x)\,dx$$

That is, the probability that $X$ takes on a value in the interval $[a, b]$ is the area under the density function from $a$ to $b$. The pdf represents the relative frequency of failure times as a function of time.
The cdf is a function, $F(x)$, of a random variable $X$, and is defined for a number $x$ by:

$$F(x) = P(X \leq x) = \int_{0}^{x} f(s)\,ds$$
That is, for a number $x$, $F(x)$ is the probability that the observed value of $X$ will be at most $x$. The cdf represents
the cumulative values of the pdf. That is, the value of a point on the curve of the cdf represents the area under the
curve to the left of that point on the pdf. In reliability, the cdf is used to measure the probability that the item in
question will fail before the associated time value, $t$, and is also called unreliability.
Note that depending on the density function, denoted by $f(x)$, the limits will vary based on the region over which
the distribution is defined. For example, for the life distributions considered in this reference, with the exception of
the normal distribution, this range would be $[0, +\infty)$.
The cdf is the area under the probability density function up to a value of $x$. The total area under the pdf is always
equal to 1, or mathematically:

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$
The well-known normal (or Gaussian) distribution is an example of a probability density function. The pdf for this
distribution is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation. The normal distribution has two parameters, $\mu$ and $\sigma$.
Another is the lognormal distribution, whose pdf is given by:

$$f(t) = \frac{1}{t\,\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln t - \mu'}{\sigma'}\right)^{2}}$$

where $\mu'$ is the mean of the natural logarithms of the times-to-failure and $\sigma'$ is the standard deviation of the natural logarithms of the times-to-failure. Again, this is a 2-parameter distribution.
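These relationships are easy to verify numerically. The sketch below integrates a normal pdf with hypothetical parameters (mu = 100, sigma = 15) to confirm that the total area under the pdf is 1 and that the cdf evaluated at the mean is 0.5.

import math
from scipy import integrate

mu, sigma = 100.0, 15.0  # hypothetical mean and standard deviation

def pdf(x):
    # Normal pdf: f(x) = (1 / (sigma * sqrt(2*pi))) * exp(-0.5 * ((x - mu)/sigma)^2)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Total area under the pdf is always 1
area, _ = integrate.quad(pdf, -math.inf, math.inf)

# The cdf at x is the area under the pdf to the left of x
cdf_at_mu, _ = integrate.quad(pdf, -math.inf, mu)

print(area)       # ~1.0
print(cdf_at_mu)  # ~0.5, since half the area lies to the left of the mean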
Reliability Function
The reliability function can be derived using the previous definition of the cumulative distribution function,
$F(t)$. From our definition of the cdf, the probability of an event occurring by time $t$ is given by:

$$F(t) = \int_{0}^{t} f(s)\,ds$$

Or, one could equate this event to the probability of a unit failing by time $t$.
Since this function defines the probability of failure by a certain time, we could consider this the unreliability
function. Subtracting this probability from 1 will give us the reliability function, one of the most important functions
in life data analysis. The reliability function gives the probability of success of a unit undertaking a mission of a
given time duration. The following figure illustrates this.
To show this mathematically, we first define the unreliability function, $Q(t)$, which is the probability of failure, or
the probability that our time-to-failure is in the region of 0 and $t$. This is the same as the cdf. So from $F(t) = \int_{0}^{t} f(s)\,ds$:

$$Q(t) = F(t) = \int_{0}^{t} f(s)\,ds$$

Reliability and unreliability are the only two events being considered and they are mutually exclusive; hence, the
sum of these probabilities is equal to unity:

$$Q(t) + R(t) = 1$$

Then:

$$R(t) = 1 - Q(t) = 1 - \int_{0}^{t} f(s)\,ds$$

Conversely:

$$f(t) = -\frac{dR(t)}{dt}$$
The Failure Rate Function
The failure rate function is given by the ratio of the pdf to the reliability function:

$$\lambda(t) = \frac{f(t)}{R(t)}$$

This gives the instantaneous failure rate, also known as the hazard function. It is useful in characterizing the failure
behavior of a component, determining maintenance crew allocation, planning for spares provisioning, etc. Failure
rate is denoted as failures per unit time.
The Mean Life Function
The mean life function, which provides a measure of the average time of operation to failure, is given by:

$$\bar{T} = m = \int_{0}^{\infty} t \cdot f(t)\,dt$$

This is the expected or average time-to-failure and is denoted as the MTTF (Mean Time To Failure).
The MTTF, even though an index of reliability performance, does not give any information on the failure
distribution of the component in question when dealing with most lifetime distributions. Because vastly different
distributions can have identical means, it is unwise to use the MTTF as the sole measure of the reliability of a
component.
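The point about identical means is easy to demonstrate. The sketch below evaluates the MTTF integral for an exponential distribution and for a Weibull distribution with beta = 3 whose eta is chosen (hypothetically) so that both means equal 1,000 hours, even though the two failure behaviors differ sharply.

import math
from scipy import integrate
from scipy.special import gamma as gamma_fn

mean_target = 1000.0
beta = 3.0
eta = mean_target / gamma_fn(1.0 + 1.0 / beta)  # Weibull mean = eta * Gamma(1 + 1/beta)

exp_pdf = lambda t: (1.0 / mean_target) * math.exp(-t / mean_target)
wbl_pdf = lambda t: (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

# MTTF = integral of t * f(t) dt over [0, infinity) for each distribution
mttf_exp, _ = integrate.quad(lambda t: t * exp_pdf(t), 0, math.inf)
mttf_wbl, _ = integrate.quad(lambda t: t * wbl_pdf(t), 0, math.inf)
print(mttf_exp, mttf_wbl)  # both ~1000 hours, yet the distributions are very different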
Median Life
Median life, $\breve{T}$, is the value of the random variable that has exactly one-half of the area under the pdf to its left and
one-half to its right. (The mean, by contrast, represents the centroid of the distribution.) The median is obtained by solving the following
equation for $\breve{T}$. (For individual data, the median is the midpoint value.)

$$\int_{0}^{\breve{T}} f(t)\,dt = 0.5$$
Modal Life
For a continuous distribution, the mode is that value of $t$ that corresponds to the maximum probability density (the
value at which the pdf has its maximum value, or the peak of the curve).
Lifetime Distributions
A statistical distribution is fully described by its pdf. In the previous sections, we used the definition of the pdf to
show how all other functions most commonly used in reliability engineering and life data analysis can be derived.
The reliability function, failure rate function, mean time function, and median life function can be determined
directly from the pdf definition, or $f(t)$. Different distributions exist, such as the normal (Gaussian), exponential,
Weibull, etc., and each has a predefined form of $f(t)$ that can be found in many references. In fact, there are certain
references that are devoted exclusively to different types of statistical distributions. These distributions were
formulated by statisticians, mathematicians and engineers to mathematically model or represent certain behavior. For
example, the Weibull distribution was formulated by Waloddi Weibull and thus it bears his name. Some distributions
tend to better represent life data and are most commonly called "lifetime distributions".
A more detailed introduction to this topic is presented in Life Distributions.
Chapter 3
Life Distributions
We use the term life distributions to describe the collection of statistical probability distributions that we use in
reliability engineering and life data analysis. A statistical distribution is fully described by its pdf (or probability
density function). In the previous sections, we used the definition of the pdf to show how all other functions most
commonly used in reliability engineering and life data analysis can be derived; namely, the reliability function,
failure rate function, mean time function and median life function, etc. All of these can be determined directly from
the pdf definition, or $f(t)$. Different distributions exist, such as the normal, exponential, etc., and each one of them
has a predefined form of $f(t)$. These distribution definitions can be found in many references. In fact, entire texts
have been dedicated to defining families of statistical distributions. These distributions were formulated by
statisticians, mathematicians and engineers to mathematically model or represent certain behavior. For example, the
Weibull distribution was formulated by Waloddi Weibull, and thus it bears his name. Some distributions tend to
better represent life data and are commonly called lifetime distributions. One of the simplest and most commonly
used distributions (and often erroneously overused due to its simplicity) is the exponential distribution. The pdf of
the exponential distribution is mathematically defined as:

$$f(t) = \lambda e^{-\lambda t}$$

In this definition, note that $t$ is our random variable, which represents time, and the Greek letter $\lambda$ (lambda)
represents what is commonly referred to as the parameter of the distribution. Depending on the value of $\lambda$, $f(t)$ will
be scaled differently. For any distribution, the parameter or parameters of the distribution are estimated from the
data. For example, the well-known normal (or Gaussian) distribution is given by:

$$f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^{2}}$$

$\mu$, the mean, and $\sigma$, the standard deviation, are its parameters. Both of these parameters are estimated from the data
(i.e., the mean and standard deviation of the data). Once these parameters have been estimated, our function $f(t)$ is
fully defined and we can obtain any value for $f(t)$ given any value of $t$.
Given the mathematical representation of a distribution, we can also derive all of the functions needed for life data
analysis, which again will depend only on the value of $t$ after the value of the distribution parameter or parameters
have been estimated from data. For example, we know that the exponential distribution pdf is given by:

$$f(t) = \lambda e^{-\lambda t}$$

Thus, the exponential reliability function can be derived as:

$$R(t) = 1 - Q(t) = 1 - \int_{0}^{t} \lambda e^{-\lambda s}\,ds = e^{-\lambda t}$$

The exponential failure rate function is:

$$\lambda(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda$$

This exact same methodology can be applied to any distribution given its pdf, with various degrees of difficulty
depending on the complexity of $f(t)$.
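The derivation chain pdf → unreliability → reliability → failure rate can also be carried out numerically for any pdf. The sketch below does so for the exponential pdf with a hypothetical lambda = 0.002 failures/hour; as expected for the exponential, the computed failure rate comes out constant at lambda.

import math
from scipy import integrate

lam = 0.002  # hypothetical constant failure rate, failures per hour
pdf = lambda t: lam * math.exp(-lam * t)

def unreliability(t):
    # Q(t) = integral of the pdf from 0 to t
    q, _ = integrate.quad(pdf, 0, t)
    return q

def reliability(t):
    # R(t) = 1 - Q(t)
    return 1.0 - unreliability(t)

def failure_rate(t):
    # lambda(t) = f(t) / R(t)
    return pdf(t) / reliability(t)

print(reliability(500))                      # matches exp(-lam * 500) analytically
print(failure_rate(100), failure_rate(900))  # constant at lam for the exponential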
Parameter Types
Distributions can have any number of parameters. Do note that as the number of parameters increases, so does the
amount of data required for a proper fit. In general, the lifetime distributions used for reliability and life data
analysis are usually limited to a maximum of three parameters. These three parameters are usually known as the
scale parameter, the shape parameter and the location parameter.
Scale Parameter The scale parameter is the most common type of parameter. All distributions in this reference have
a scale parameter. In the case of one-parameter distributions, the sole parameter is the scale parameter. The scale
parameter defines where the bulk of the distribution lies, or how stretched out the distribution is. In the case of the
normal distribution, the scale parameter is the standard deviation.
Shape Parameter The shape parameter, as the name implies, helps define the shape of a distribution. Some
distributions, such as the exponential or normal, do not have a shape parameter since they have a predefined shape
that does not change. In the case of the normal distribution, the shape is always the familiar bell shape. The effect of
the shape parameter on a distribution is reflected in the shapes of the pdf, the reliability function and the failure rate
function.
Location Parameter The location parameter is used to shift a distribution in one direction or another. The location
parameter, usually denoted as $\gamma$, defines the location of the origin of a distribution and can be either positive or
negative. In terms of lifetime distributions, the location parameter represents a time shift.
This means that the inclusion of a location parameter for a distribution whose domain is normally $[0, +\infty)$ will change
the domain to $[\gamma, +\infty)$, where $\gamma$ can either be positive or negative. This can have some profound effects in terms of
reliability. For a positive location parameter, this indicates that the reliability for that particular distribution is always
100% up to that point. In other words, a failure cannot occur before this time $\gamma$. Many engineers feel uncomfortable
in saying that a failure will absolutely not happen before any given time. On the other hand, the argument can be
made that almost all life distributions have a location parameter, although many of them may be negligibly small.
Similarly, many people are uncomfortable with the concept of a negative location parameter, which states that
failures theoretically occur before time zero. Realistically, the calculation of a negative location parameter is
indicative of quiescent failures (failures that occur before a product is used for the first time) or of problems with the
manufacturing, packaging or shipping process. More attention will be given to the concept of the location parameter
in subsequent discussions of the exponential and Weibull distributions, which are the lifetime distributions that most
frequently employ the location parameter.
For example, the pdf of the 2-parameter exponential distribution is given by:

$$f(t) = \lambda e^{-\lambda (t - \gamma)}$$

where $\lambda$ is the constant failure rate in failures per unit of measurement (e.g., failures per hour, per cycle, etc.) and $\gamma$
is the location parameter. In addition, $\lambda = \frac{1}{m}$, where $m$ is the mean time between failures (or to failure).
If the location parameter, $\gamma$, is assumed to be zero, then the distribution becomes the 1-parameter exponential or:

$$f(t) = \lambda e^{-\lambda t}$$

One additional form is the 1-parameter Weibull distribution, which assumes that the location parameter, $\gamma$, is zero,
and the shape parameter is a known constant, or $\beta$ = constant = $C$, so:

$$f(t) = \frac{C}{\eta}\left(\frac{t}{\eta}\right)^{C - 1} e^{-\left(\frac{t}{\eta}\right)^{C}}$$
Bayesian-Weibull Analysis
Another approach is the Weibull-Bayesian analysis method, which assumes that the analyst has some prior
knowledge about the distribution of the shape parameter of the Weibull distribution (beta). There are many practical
applications for this model, particularly when dealing with small sample sizes and/or when some prior knowledge for
the shape parameter is available. For example, when a test is performed, there is often a good understanding about
the behavior of the failure mode under investigation, primarily through historical data or physics-of-failure.
Note that this is not the same as the so-called "WeiBayes model," which is really a one-parameter Weibull
distribution that assumes a fixed value (constant) for the shape parameter and solves for the scale parameter. The
Bayesian-Weibull feature in Weibull++ is actually a true Bayesian model and offers an alternative to the
one-parameter Weibull by including the variation and uncertainty that is present in the prior estimation of the shape
parameter.
This analysis method and its characteristics are presented in detail in Bayesian-Weibull Analysis.
The pdf of the normal distribution is given by:

$$f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^{2}}$$

where $\mu$ is the mean of the normal times to failure and $\sigma$ is the standard deviation of the times to failure.
The normal distribution and its characteristics are presented in The Normal Distribution.
The pdf of the lognormal distribution is given by:

$$f(t) = \frac{1}{t\,\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln t - \mu'}{\sigma'}\right)^{2}}$$

where $\mu'$ is the mean of the natural logarithms of the times-to-failure and $\sigma'$ is the standard deviation of the natural
logarithms of the times to failure.
For a detailed discussion of this distribution, see The Lognormal Distribution.
Other Distributions
In addition to the distributions mentioned above, which are more frequently used in life data analysis, the following
distributions also have a variety of applications and can be found in many statistical references. They are included in
Weibull++, as well as discussed in this reference.
The pdf of the mixed Weibull distribution is given by:

$$f(t) = \sum_{i=1}^{S} \frac{N_i}{N}\, \frac{\beta_i}{\eta_i} \left(\frac{t}{\eta_i}\right)^{\beta_i - 1} e^{-\left(\frac{t}{\eta_i}\right)^{\beta_i}}$$

where the value of $S$ is equal to the number of subpopulations. Note that this results in a total of $3S - 1$
parameters. In other words, each population has a portion or mixing weight, $\frac{N_i}{N}$, for the $i^{th}$ population, a $\beta_i$, or shape
parameter for the $i^{th}$ population and an $\eta_i$ or scale parameter for the $i^{th}$ population. Note that the parameters are reduced
to $3S - 1$, given the fact that the following condition can also be used:

$$\sum_{i=1}^{S} \frac{N_i}{N} = 1$$
The mixed Weibull distribution and its characteristics are presented in The Mixed Weibull Distribution.
This distribution behaves as do other distributions based on the values of the parameters. For example, if $\lambda = 1$,
then the distribution is identical to the Weibull distribution. If both $\lambda = 1$ and $\sigma = 1$, then the distribution is
identical to the exponential distribution, and for $\lambda = 0$ it is identical to the lognormal distribution. While the
generalized gamma distribution is not often used to model life data by itself, its ability to behave like other more
commonly-used life distributions is sometimes used to determine which of those life distributions should be used to
model a particular set of data.
The generalized gamma distribution and its characteristics are presented in The Generalized Gamma Distribution.
The pdf of the gamma distribution is given by:

$$f(t) = \frac{e^{kz - e^{z}}}{t\,\Gamma(k)}$$

where:

$$z = \ln t - \mu$$

where $0 < t < \infty$, $-\infty < \mu < \infty$, and $k > 0$.
The gamma distribution and its characteristics are presented in The Gamma Distribution.
The pdf of the logistic distribution is given by:

$$f(t) = \frac{e^{z}}{\sigma (1 + e^{z})^{2}}, \qquad z = \frac{t - \mu}{\sigma}$$
The logistic distribution and its characteristics are presented in The Logistic Distribution.
The pdf of the loglogistic distribution is given by:

$$f(t) = \frac{e^{z}}{\sigma\, t\, (1 + e^{z})^{2}}, \qquad z = \frac{\ln t - \mu}{\sigma}$$
The loglogistic distribution and its characteristics are presented in The Loglogistic Distribution.
The pdf of the Gumbel (smallest extreme value) distribution is given by:

$$f(t) = \frac{1}{\sigma}\, e^{z - e^{z}}, \qquad z = \frac{t - \mu}{\sigma}$$
The Gumbel distribution and its characteristics are presented in The Gumbel/SEV Distribution.
Chapter 4
Parameter Estimation
The term parameter estimation refers to the process of using sample data (in reliability engineering, usually
times-to-failure or success data) to estimate the parameters of the selected distribution. Several parameter estimation
methods are available. This section presents an overview of the available methods used in life data analysis. More
specifically, we start with the relatively simple method of Probability Plotting and continue with the more
sophisticated methods of Rank Regression (or Least Squares), Maximum Likelihood Estimation and Bayesian
Estimation Methods.
Probability Plotting
The least mathematically intensive method for parameter estimation is the method of probability plotting. As the
term implies, probability plotting involves a physical plot of the data on specially constructed probability plotting
paper. This method is easily implemented by hand, given that one can obtain the appropriate probability plotting
paper.
The method of probability plotting takes the cdf of the distribution and attempts to linearize it by employing a
specially constructed paper. The following sections illustrate the steps in this method using the 2-parameter Weibull
distribution as an example. This includes:
• Linearize the unreliability function
• Construct the probability plotting paper
• Determine the X and Y positions of the plot points
And then using the plot to read any particular time or reliability/unreliability value of interest.
The unreliability (cdf) of the 2-parameter Weibull distribution is given by:

$$F(t) = Q(t) = 1 - e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

This function can then be linearized (i.e., put in the common form of $y = mx + b$ format) as follows:

$$\ln(-\ln(1 - Q(t))) = \beta \ln t - \beta \ln \eta$$

Then by setting:

$$y = \ln(-\ln(1 - Q(t)))$$

and:

$$x = \ln t$$

the equation becomes $y = \beta x - \beta \ln \eta$, a linear equation with a slope of $\beta$ and an intercept of $-\beta \ln \eta$.
To illustrate, consider the following probability plot on a slightly different type of Weibull probability paper.
This paper is constructed based on the mentioned y and x transformations, where the y-axis represents unreliability
and the x-axis represents time. Both of these values must be known for each time-to-failure point we want to plot.
Then, given the $x$ and $y$ value for each point, the points can easily be put on the plot. Once the points have been
placed on the plot, the best possible straight line is drawn through these points. Once the line has been drawn, the
slope of the line can be obtained (some probability papers include a slope indicator to simplify this calculation). This
is the parameter $\beta$, which is the value of the slope. To determine the scale parameter, $\eta$ (also called the
characteristic life), one reads the time from the x-axis corresponding to $Q(t) = 63.2\%$.
Note that at:

$$Q(t = \eta) = 1 - e^{-\left(\frac{\eta}{\eta}\right)^{\beta}} = 1 - e^{-1} = 0.632 = 63.2\%$$

Thus, if we enter the y axis at $Q(t) = 63.2\%$, the corresponding value of $t$ will be equal to $\eta$. Thus, using this
simple methodology, the parameters of the Weibull distribution can be estimated.
Median Ranks
The Median Ranks method is used to obtain an estimate of the unreliability for each failure. The median rank is the
value that the true probability of failure, $Q(T_j)$, should have at the $j^{th}$ failure out of a sample of $N$ units at the
50% confidence level.
The rank can be found for any percentage point, $P$, greater than zero and less than one, by solving the cumulative
binomial equation for $Z$. This represents the rank, or unreliability estimate, for the $j^{th}$ failure in the following
equation for the cumulative binomial:

$$P = \sum_{k=j}^{N} \binom{N}{k} Z^{k} (1 - Z)^{N - k}$$

where $N$ is the sample size and $j$ the order number; the median rank is the value of $Z$ that satisfies this equation at $P = 0.50$ (50%).
For example, if $N = 4$ and we have four failures, we would solve the median rank equation for the value of $Z$ four
times; once for each failure with $j = 1, 2, 3$ and $4$. This result can then be used as the unreliability estimate for
each failure or the plotting position. (See also The Weibull Distribution for a step-by-step example of this method.)
The solution of the cumulative binomial equation for $Z$ requires the use of numerical methods. Alternatively, the median rank can be expressed in terms of the F distribution:

$$MR = \frac{1}{1 + \frac{N - j + 1}{j}\, F_{0.50;\, m;\, n}}$$

where $m = 2(N - j + 1)$, $n = 2j$, and $F_{0.50; m; n}$ denotes the F distribution at the 0.50 point, with $m$ and $n$ degrees of freedom, for the $j^{th}$ failure out of $N$ units.
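In practice, a median rank can be computed either by solving the cumulative binomial equation numerically or by Benard's well-known approximation, MR ≈ (j − 0.3)/(N + 0.4). The sketch below does both for a hypothetical sample of N = 4 complete failures.

from math import comb
from scipy.optimize import brentq

def median_rank(j, N, P=0.50):
    # Solve P = sum_{k=j}^{N} C(N,k) * Z^k * (1-Z)^(N-k) for Z
    f = lambda Z: sum(comb(N, k) * Z**k * (1.0 - Z)**(N - k)
                      for k in range(j, N + 1)) - P
    return brentq(f, 1e-9, 1.0 - 1e-9)

def benard(j, N):
    # Benard's approximation to the median rank
    return (j - 0.3) / (N + 0.4)

N = 4
for j in range(1, N + 1):
    print(j, round(median_rank(j, N), 4), round(benard(j, N), 4))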
Kaplan-Meier
The Kaplan-Meier estimator (also known as the product limit estimator) is used as an alternative to the median ranks
method for calculating the estimates of the unreliability for probability plotting purposes. The equation of the
estimator is given by:

$$\hat{F}(t_i) = 1 - \prod_{j=1}^{i} \frac{n_j - r_j}{n_j}, \qquad i = 1, \ldots, m$$

where $m$ is the total number of data points, $n_j$ is the number of units operating just prior to time $t_j$, and $r_j$ is the number of failures at $t_j$.
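A minimal sketch of the product limit calculation follows, using hypothetical failure (F) and suspension (S) times. Both event types shrink the risk set, but only failures change the survival product.

data = [(7, 'F'), (12, 'F'), (19, 'S'), (29, 'F'), (41, 'S'), (67, 'F')]  # hypothetical

n_at_risk = len(data)
survival = 1.0
for time, state in sorted(data):
    if state == 'F':
        # Multiply by the fraction of the risk set surviving this failure time
        survival *= (n_at_risk - 1) / n_at_risk
        print(f"t = {time:3d}  unreliability F(t) = {1.0 - survival:.3f}")
    # Failures and suspensions both leave the risk set
    n_at_risk -= 1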
To illustrate the process with an example, consider a small set of observed times-to-failure and obtain their median rank plotting positions. Median rank positions are used instead of other ranking methods
because median ranks are at a specific confidence level (50%).
Each time-to-failure is then paired with its corresponding median rank for plotting.
On an exponential probability paper, plot the times on the x-axis and their corresponding rank value on the y-axis.
The next figure displays an example of an exponential probability paper. The paper is simply a log-linear paper.
Draw the best possible straight line that goes through the $t = 0$ and $R(t) = 100\%$ point and through the plotted
points (as shown in the plot below).
At the $Q(t) = 63.2\%$ or $R(t) = 36.8\%$ ordinate point, draw a straight horizontal line until this line intersects
the fitted straight line. Draw a vertical line through this intersection until it crosses the abscissa. The value at the
intersection of the abscissa is the estimate of the mean, $\hat{m}$, in hours, which means that $\hat{\lambda} = \frac{1}{\hat{m}}$.
(This is always at 63.2% because $Q(t = m) = 1 - e^{-m/m} = 1 - e^{-1} = 0.632 = 63.2\%$.)
Now any reliability value for any mission time can be obtained. For example, the reliability for a mission of 15
hours, or any other time, can now be obtained either from the plot or analytically.
To obtain the value from the plot, draw a vertical line from the abscissa, at $t = 15$ hours, to the fitted line. Draw a
horizontal line from this intersection to the ordinate and read the value of $R(t = 15)$. This can
also be obtained analytically, from the exponential reliability function:

$$R(t) = e^{-\lambda t}$$
The method of least squares requires that a straight line be fitted to a set of data points, such that the sum of the
squares of the distance of the points to the fitted line is minimized. This minimization can be performed in either the
vertical or horizontal direction. If the regression is on X, then the line is fitted so that the horizontal deviations from
the points to the line are minimized. If the regression is on Y, then this means that the distance of the vertical
deviations from the points to the line is minimized. This is illustrated in the following figure.
Rank Regression on Y
Assume that a set of data pairs $(x_1, y_1)$, $(x_2, y_2)$, ..., $(x_N, y_N)$ were obtained and plotted, and that the $x$-values
are known exactly. Then, according to the least squares principle, which minimizes the vertical distance between the
data points and the straight line fitted to the data, the best fitting straight line to these data is the straight line $\hat{y} = \hat{a} + \hat{b}x$
(where the recently introduced $\hat{\ }$ symbol indicates that this value is an estimate) such that:

$$\sum_{i=1}^{N} \left(\hat{a} + \hat{b} x_i - y_i\right)^2 = \min_{a, b} \sum_{i=1}^{N} \left(a + b x_i - y_i\right)^2$$

where $\hat{a}$ and $\hat{b}$ are the least squares estimates of $a$ and $b$, and $N$ is the number of data points. These equations
are minimized by estimates of $\hat{a}$ and $\hat{b}$ such that:

$$\hat{a} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - \hat{b}\bar{x}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)}{N}}{\sum_{i=1}^{N} x_i^{2} - \frac{\left(\sum_{i=1}^{N} x_i\right)^{2}}{N}}$$
Rank Regression on X
Assume that a set of data pairs $(x_1, y_1)$, $(x_2, y_2)$, ..., $(x_N, y_N)$ were obtained and plotted, and that the $y$-values are known
exactly. The same least squares principle is applied, but this time, minimizing the horizontal distance between the
data points and the straight line fitted to the data. The best fitting straight line to these data is the straight line $\hat{x} = \widehat{a} + \widehat{b} y$
such that:

$$\sum_{i=1}^{N} \left(\widehat{a} + \widehat{b} y_i - x_i\right)^2 = \min_{a, b} \sum_{i=1}^{N} \left(a + b y_i - x_i\right)^2$$

Again, $\widehat{a}$ and $\widehat{b}$ are the least squares estimates of $a$ and $b$, and $N$ is the number of data points. These equations are
minimized by estimates of $\widehat{a}$ and $\widehat{b}$ such that:

$$\widehat{a} = \frac{\sum_{i=1}^{N} x_i}{N} - \widehat{b}\,\frac{\sum_{i=1}^{N} y_i}{N} = \bar{x} - \widehat{b}\bar{y}$$

and:

$$\widehat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)}{N}}{\sum_{i=1}^{N} y_i^{2} - \frac{\left(\sum_{i=1}^{N} y_i\right)^{2}}{N}}$$
The corresponding relations for determining the parameters for specific distributions (i.e., Weibull, exponential,
etc.), are presented in the chapters covering that distribution.
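Putting the pieces together, the sketch below performs rank regression on Y for a 2-parameter Weibull: Benard's approximation supplies the median ranks, the Weibull linearization supplies the (x, y) transform, and the least squares estimates given above yield beta and eta. The failure times are hypothetical complete data.

import math

times = sorted([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])  # hypothetical failure times, hours
N = len(times)

# Transform: x = ln(t), y = ln(-ln(1 - median rank))
x = [math.log(t) for t in times]
y = [math.log(-math.log(1.0 - (j - 0.3) / (N + 0.4))) for j in range(1, N + 1)]

xbar, ybar = sum(x) / N, sum(y) / N
b_hat = (sum(xi * yi for xi, yi in zip(x, y)) - N * xbar * ybar) / \
        (sum(xi * xi for xi in x) - N * xbar * xbar)
a_hat = ybar - b_hat * xbar

beta_hat = b_hat                    # the slope is the shape parameter
eta_hat = math.exp(-a_hat / b_hat)  # from the intercept a = -beta * ln(eta)
print(beta_hat, eta_hat)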
Correlation Coefficient
The correlation coefficient is a measure of how well the linear regression model fits the data and is usually denoted
by $\rho$. In the case of life data analysis, it is a measure of the strength of the linear relation (correlation) between the
median ranks and the data. The population correlation coefficient is defined as follows:

$$\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

where $\sigma_{xy}$ is the covariance of $x$ and $y$, $\sigma_x$ is the standard deviation of $x$, and $\sigma_y$ is the standard deviation of $y$. The estimator of $\rho$ is the sample correlation coefficient, $\hat{\rho}$.
The range of $\hat{\rho}$ is $-1 \leq \hat{\rho} \leq +1$.
The closer the value is to $\pm 1$, the better the linear fit. Note that +1 indicates a perfect fit (the paired values $(x_i, y_i)$
lie on a straight line) with a positive slope, while -1 indicates a perfect fit with a negative slope. A correlation
coefficient value of zero would indicate that the data are randomly scattered and have no pattern or correlation in
relation to the regression line model.
Consider five units, of which three failed (F1, F2, F3) and two were suspended (S1, S2):

Position  State  Life (hours)
1         F1     5,100
2         S1     9,500
3         F2     15,000
4         S2     22,000
5         F3     40,000
The methodology for plotting suspended items involves adjusting the rank positions and plotting the data based on
new positions, determined by the location of the suspensions. If we consider these five units, the following
methodology would be used: The first item must be the first failure; hence, it is assigned failure order number 1.
The actual failure order number (or position) of the second failure, $F_2$, is in doubt. It could either be in position 2 or
in position 3. Had $S_1$ not been withdrawn from the test at 9,500 hours, it could have operated successfully past
15,000 hours, thus placing $F_2$ in position 2. Alternatively, $S_1$ could also have failed before 15,000 hours, thus
placing $F_2$ in position 3. In this case, the failure order number for $F_2$ will be some number between 2 and 3. To
determine this number, consider the following:
We can find the number of ways the second failure can occur in either order number 2 (position 2) or order number 3
(position 3) by enumerating the possible failure orders of the withdrawn unit.
It can be seen that $F_2$ can occur in the second position six ways and in the third position two ways. The most
probable position is the average of these possible ways, or the mean order number (MON), given by:

$$MON_{F_2} = \frac{(6 \times 2) + (2 \times 3)}{6 + 2} = 2.25$$
Using the same logic on the third failure, it can be located in position numbers 3, 4 and 5: in two, three and three
possible ways, respectively. Then, the mean order number for the third failure, $F_3$ (item 5), is:

$$MON_{F_3} = \frac{(2 \times 3) + (3 \times 4) + (3 \times 5)}{2 + 3 + 3} = \frac{33}{8} = 4.125$$
Once the mean order number for each failure has been established, we obtain the median rank positions for these
failures at their mean order number. Specifically, we obtain the median rank of the order numbers 1, 2.25 and 4.125
out of a sample size of 5, as given next.
Failure    MON      Median Rank Position (%)
F1         1        13%
F2         2.25     36%
F3         4.125    71%
Once the median rank values have been obtained, the probability plotting analysis is identical to that presented
before. As you might have noticed, this methodology is rather laborious. Other techniques and shortcuts have been
developed over the years to streamline this procedure. For more details on this method, see Kececioglu [20]. Here,
we will introduce one of these methods. This method calculates MON using an increment, $I$, which is defined by:

$$I_i = \frac{(N + 1) - PMON}{1 + NIBPSS}$$
where:
• $N$ = the sample size, or total number of items in the test
• $PMON$ = previous mean order number
• $NIBPSS$ = the number of items beyond the present suspended set; it is the number of units (including all the failures and suspensions) at the current failure time
• $i$ = the $i^{th}$ failure item
MON is given as:

$$MON_i = PMON + I_i$$

For $F_2$:

$$I_2 = \frac{(5 + 1) - 1}{1 + 3} = 1.25, \qquad MON_{F_2} = 1 + 1.25 = 2.25$$

For $F_3$:

$$I_3 = \frac{(5 + 1) - 2.25}{1 + 1} = 1.875, \qquad MON_{F_3} = 2.25 + 1.875 = 4.125$$
The MON obtained for each failure item via this method is the same as that obtained from the first method, so the median rank values
will also be the same.
For grouped data, the increment at each failure group is multiplied by the number of failures in that group.
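The increment method lends itself to a short computation. The following sketch (illustrative code, assuming ungrouped data with one failure per row; not ReliaSoft code) reproduces the MON values for the five-unit example above and converts them to approximate median ranks using Benard's approximation:

```python
def mean_order_numbers(data):
    """data: list of (time, state) tuples, state 'F' or 'S'.
    Returns (time, MON) for each failure, via the increment method."""
    data = sorted(data)                  # order all items by time
    n = len(data)
    result, prev_mon = [], 0.0
    for idx, (time, state) in enumerate(data):
        if state == 'F':
            nibpss = n - idx             # items at/beyond this failure time
            increment = ((n + 1) - prev_mon) / (1 + nibpss)
            prev_mon += increment        # MON_i = PMON + I_i
            result.append((time, prev_mon))
    return result

data = [(5100, 'F'), (9500, 'S'), (15000, 'F'), (22000, 'S'), (40000, 'F')]
n = len(data)
for t, mon in mean_order_numbers(data):
    mr = (mon - 0.3) / (n + 0.4)         # Benard's approximation
    print(t, mon, round(100 * mr, 1))    # MONs 1, 2.25, 4.125 as above
```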
Case 1                                            Case 2
Item Number   State, F or S   Life of item, hr    Item Number   State, F or S   Life of item, hr
1             F               1,000               1             F               1,000
2             S               1,100               2             S               9,700
3             S               1,200               3             S               9,800
4             S               1,300               4             S               9,900
5             F               10,000              5             F               10,000
This shortfall is significant when the number of failures is small and the number of suspensions is large and not
spread uniformly between failures, as with these data. In cases like this, it is highly recommended to use maximum
likelihood estimation (MLE) to estimate the parameters instead of using least squares, because MLE does not look at
ranks or plotting positions, but rather considers each unique time-to-failure or suspension. For the data given above,
the results are as follows. The estimated parameters using the method just described are the same for both cases (1
and 2):
As we can see, there is a sizable difference in the results of the two sets calculated using MLE and the results using
regression with the SRM. The results for both cases are identical when using the regression estimation technique
with SRM, as SRM considers only the positions of the suspensions. The MLE results are quite different for the two
cases, with the second case having a much larger value of $\eta$, which is due to the higher values of the suspension
times in Case 2. This is because the maximum likelihood technique, unlike rank regression with SRM, considers the
values of the suspensions when estimating the parameters. This is illustrated in the discussion of MLE given below.
One alternative to improve the regression method is to use the following ReliaSoft Ranking Method (RRM) to
calculate the rank. RRM does consider the effect of the censoring time.
where $\theta_1, \theta_2, \ldots, \theta_k$ are $k$ unknown parameters which need to be estimated, with $R$ independent observations,
$x_1, x_2, \ldots, x_R$, which correspond in the case of life data analysis to failure times. The likelihood function is given
by:

$$L(\theta_1, \theta_2, \ldots, \theta_k \mid x_1, x_2, \ldots, x_R) = \prod_{i=1}^{R} f(x_i; \theta_1, \theta_2, \ldots, \theta_k)$$

The logarithmic likelihood function is $\Lambda = \ln L$. The maximum likelihood estimators (or parameter values) of $\theta_1, \theta_2, \ldots, \theta_k$ are obtained by maximizing $L$ or $\Lambda$.
By maximizing $\Lambda$, which is much easier to work with than $L$, the maximum likelihood estimators (MLE) of
$\theta_1, \theta_2, \ldots, \theta_k$ are the simultaneous solutions of $k$ equations such that:

$$\frac{\partial \Lambda}{\partial \theta_j} = 0, \quad j = 1, 2, \ldots, k$$
Even though it is common practice to plot the MLE solutions using median ranks (points are plotted according to
median ranks and the line according to the MLE solutions), this is not completely representative. As can be seen
from the equations above, the MLE method is independent of any kind of ranks. For this reason, the MLE solution
often appears not to track the data on the probability plot. This is perfectly acceptable because the two methods are
independent of each other, and in no way suggests that the solution is wrong.
where $\theta_1, \theta_2, \ldots, \theta_k$ are the unknown parameters which need to be estimated from $R$ observed failures at
$T_1, T_2, \ldots, T_R$, and $M$ observed suspensions at $S_1, S_2, \ldots, S_M$, then the likelihood function is formulated as
follows:

$$L(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_R, S_1, \ldots, S_M) = \prod_{i=1}^{R} f(T_i; \theta_1, \ldots, \theta_k) \cdot \prod_{j=1}^{M} \left[1 - F(S_j; \theta_1, \ldots, \theta_k)\right]$$

The parameters are solved by maximizing this equation. In most cases, no closed-form solution exists for this
maximum or for the parameters. Solutions specific to each distribution utilizing MLE are presented in Appendix D.

For interval data, each unit known only to have failed between the times $I_{l_L}$ and $I_{l_U}$ contributes a term of the form $F(I_{l_U}) - F(I_{l_L})$ to the likelihood. Note that if only interval data are present, this term will represent the entire likelihood function for the MLE
solution. Combining all possible censoring schemes yields the complete likelihood function:

$$L = K \cdot \prod_{i=1}^{R} f(T_i; \theta_1, \ldots, \theta_k) \cdot \prod_{j=1}^{M} \left[1 - F(S_j; \theta_1, \ldots, \theta_k)\right] \cdot \prod_{l=1}^{P} \left[F(I_{l_U}; \theta_1, \ldots, \theta_k) - F(I_{l_L}; \theta_1, \ldots, \theta_k)\right]$$

where:
• $K$ is a constant
• $R$ is the number of units with exact failures
• $M$ is the number of suspended units
• $P$ is the number of units with left censored or interval times-to-failure
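To make the formulation concrete, the following is a minimal sketch (assumed illustrative data, not the text's example and not Weibull++ code) of maximizing a 2-parameter Weibull log-likelihood with failures and right-censored suspensions; failures contribute $\ln f(T_i)$ and suspensions contribute $\ln[1 - F(S_j)]$:

```python
import numpy as np
from scipy.optimize import minimize

failures = np.array([1000.0, 10000.0])            # exact failure times
suspensions = np.array([9700.0, 9800.0, 9900.0])  # right-censored times

def neg_log_likelihood(params):
    beta, eta = params
    if beta <= 0 or eta <= 0:
        return np.inf
    # failures: ln f(T) for the Weibull pdf
    ll = np.sum(np.log(beta / eta) + (beta - 1) * np.log(failures / eta)
                - (failures / eta) ** beta)
    # suspensions: ln R(S) = -(S/eta)^beta
    ll += np.sum(-(suspensions / eta) ** beta)
    return -ll

result = minimize(neg_log_likelihood, x0=[1.0, float(np.mean(failures))],
                  method="Nelder-Mead")
beta_hat, eta_hat = result.x
print(beta_hat, eta_hat)
```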
The Bayesian methods presented next are also available in Weibull++. Extensive coverage of the subject can be found in numerous books dealing with Bayesian statistics.
Bayes’s Rule
Bayes’s rule provides the framework for combining prior information with sample data. In this reference, we apply
Bayes’s rule for combining prior information on the assumed distribution's parameter(s) with sample data in order to
make inferences based on the model. The prior knowledge about the parameter(s) is expressed in terms of a distribution
called the prior distribution. The posterior distribution of the parameter(s), given the sample data, using Bayes's rule, provides the
updated information about the parameters. This is expressed with the following posterior pdf:

$$f(\theta \mid Data) = \frac{L(Data \mid \theta)\,\varphi(\theta)}{\int_{\varsigma} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}$$
where:
• $\theta$ is a vector of the parameters of the chosen distribution
• $\varsigma$ is the range of $\theta$
• $L(Data \mid \theta)$ is the likelihood function based on the chosen distribution and data
• $\varphi(\theta)$ is the prior distribution for each of the parameters
The integral in the Bayes's rule equation is often referred to as the marginal probability, which is a constant number
that can be interpreted as the probability of obtaining the sample data given a prior distribution. Generally, the
integral in the Bayes's rule equation does not have a closed form solution and numerical methods are needed for its
solution.
As can be seen from the Bayes's rule equation, there is a significant difference between classical and Bayesian
statistics. First, the idea of prior information does not exist in classical statistics. All inferences in classical statistics
are based on the sample data. On the other hand, in the Bayesian framework, prior information constitutes the basis
of the theory. Another difference is in the overall approach of making inferences and their interpretation. For
example, in Bayesian analysis, the parameters of the distribution to be fitted are the random variables. In reality,
there is no distribution fitted to the data in the Bayesian case.
For instance, consider the case where data is obtained from a reliability test. Based on prior experience on a similar
product, the analyst believes that the shape parameter of the Weibull distribution has a value between $\beta_1$ and $\beta_2$ and
wants to utilize this information. This can be achieved by using the Bayes theorem. At this point, the analyst is
automatically forcing the Weibull distribution as a model for the data, with a shape parameter between $\beta_1$ and $\beta_2$.
In this example, the range of values for the shape parameter is the prior distribution, which in this case is uniform.
By applying Bayes's rule, the posterior distribution of the shape parameter will be obtained. Thus, we end up with a
distribution for the parameter rather than an estimate of the parameter, as in classical statistics.
To better illustrate the example, assume that a set of failure data was provided along with a distribution for the shape
parameter (i.e., uniform prior) of the Weibull (automatically assuming that the data are Weibull distributed). Based
on that, a new distribution (the posterior) for that parameter is then obtained using Bayes's rule. This posterior
distribution of the parameter may or may not resemble in form the assumed prior distribution. In other words, in this
example the prior distribution of $\beta$ was assumed to be uniform, but the posterior is most likely not a uniform
distribution.
The question now becomes: what is the value of the shape parameter? What about the reliability and other results of
interest? In order to answer these questions, we have to remember that in the Bayesian framework all of these
metrics are random variables. Therefore, in order to obtain an estimate, a probability needs to be specified or we can
use the expected value of the posterior distribution.
In order to demonstrate the procedure of obtaining results from the posterior distribution, we will rewrite the Bayes's
rule equation for a single parameter $\theta$:

$$f(\theta \mid Data) = \frac{L(Data \mid \theta)\,\varphi(\theta)}{\int_{\varsigma} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}$$
The expected value (or mean value) of the parameter $\theta$ can be obtained using the equation for the mean and the
Bayes's rule equation for a single parameter:

$$E(\theta) = \int_{\varsigma} \theta \cdot f(\theta \mid Data)\,d\theta$$

An alternative result for $\theta$ would be the median value. Using the equation for the median and the Bayes's rule
equation for a single parameter:

$$\int_{-\infty}^{\theta_{0.5}} f(\theta \mid Data)\,d\theta = 0.5$$

The equation for the median is solved for $\theta_{0.5}$, the median value of $\theta$.
Similarly, any other percentile of the posterior pdf can be calculated and reported. For example, one could calculate
the 90th percentile of $\theta$'s posterior pdf:

$$\int_{-\infty}^{\theta_{0.9}} f(\theta \mid Data)\,d\theta = 0.9$$
This calculation will be used in Confidence Bounds and The Weibull Distribution for obtaining confidence bounds
on the parameter(s).
The next step will be to make inferences on the reliability. Since the parameter $\theta$ is a random variable described by
the posterior pdf, all subsequent functions of $\theta$ are distributed random variables as well and are entirely based on the
posterior pdf of $\theta$. Therefore, the expected value, median or other percentile values will also need to be calculated. For
example, the expected reliability at time $T$ is:

$$E\big[R(T \mid Data)\big] = \int_{\varsigma} R(T; \theta)\, f(\theta \mid Data)\,d\theta$$

In other words, at a given time $T$, there is a distribution that governs the reliability value at that time, $R(T)$, and by
using Bayes's rule, the expected (or mean) value of the reliability is obtained. Other percentiles of this distribution
can also be obtained. A similar procedure is followed for other functions of $\theta$, such as failure rate, reliable life, etc.
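The single-parameter results above can be reproduced numerically. The following sketch (assumed data and a flat prior, purely illustrative) computes the posterior pdf of an exponential failure rate $\lambda$ on a grid, then the posterior mean, median, and expected reliability at a given time:

```python
import numpy as np

times = np.array([120.0, 190.0, 260.0, 310.0, 450.0])  # complete failure data
lam = np.linspace(1e-5, 0.02, 2000)                     # uniform prior support

likelihood = lam ** len(times) * np.exp(-lam * times.sum())
posterior = likelihood / np.trapz(likelihood, lam)      # Bayes's rule, flat prior

mean_lam = np.trapz(lam * posterior, lam)               # E(lambda)
cdf = np.cumsum(posterior) * (lam[1] - lam[0])          # posterior cdf on the grid
median_lam = lam[np.searchsorted(cdf, 0.5)]             # 50th percentile

T = 100.0
expected_R = np.trapz(np.exp(-lam * T) * posterior, lam)  # E[R(T)]
print(mean_lam, median_lam, expected_R)
```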
Prior Distributions
Prior distributions play a very important role in Bayesian Statistics. They are essentially the basis in Bayesian
analysis. Different types of prior distributions exist, namely informative and non-informative. Non-informative prior
distributions (a.k.a. vague, flat and diffuse) are distributions that have no population basis and play a minimal role in
the posterior distribution. The idea behind the use of non-informative prior distributions is to make inferences that
are not greatly affected by external information or when external information is not available. The uniform
distribution is frequently used as a non-informative prior.
On the other hand, informative priors have a stronger influence on the posterior distribution. The influence of the
prior distribution on the posterior is related to the sample size of the data and the form of the prior. Generally
speaking, large sample sizes are required to modify strong priors, whereas weak priors are overwhelmed by even
relatively small sample sizes. Informative priors are typically obtained from past data.
Chapter 5
Life Data Classification
Complete Data
Complete data means that the value of each sample unit is observed or known. For example, if we had to compute
the average test score for a sample of ten students, complete data would consist of the known score for each student.
Likewise in the case of life data analysis, our data set (if complete) would be composed of the times-to-failure of all
units in our sample. For example, if we tested five units and they all failed (and their times-to-failure were recorded),
we would then have complete information as to the time of each failure in the sample.
Censored Data
In many cases, all of the units in the sample may not have failed (i.e., the event of interest was not observed) or the
exact times-to-failure of all the units are not known. This type of data is commonly called censored data. There are
three types of possible censoring schemes, right censored (also called suspended data), interval censored and left
censored.
It is generally recommended to avoid interval censored data because they are less informative compared to complete
data. However, there are cases when interval data are unavoidable due to the nature of the product, the test and the
test equipment. In those cases, caution must be taken to set the inspection intervals to be short enough to observe the
spread of the failures. For example, if the inspection interval is too long, all the units in the test may fail within that
interval, and thus no failure distribution could be obtained.
If the objective of the analysis is to determine the probability of failure of the product, regardless of the mode
responsible for the failure, we would analyze the data with all data entries classified as failures (complete data).
However, if the objective of the analysis is to determine the probability of failure of the product due to Mode A only,
we would then choose to treat failures due to Modes B or C as suspension (right censored) data. Those data points
would be treated as suspension data with respect to Mode A because the product operated until the recorded time
without failure due to Mode A.
Fractional Failures
After the completion of a reliability test or after failures are observed in the field, a redesign can be implemented to
improve a product’s reliability. After the redesign, and before new failure data become available, it is often
desirable to “adjust” the reliability that was calculated from the previous design and take “credit” for this theoretical
improvement. This can be achieved with fractional failures. Using past experience to estimate the effectiveness of a
corrective action or redesign, an analysis can take credit for this improvement by adjusting the failure count.
Therefore, if a corrective action on a failure mode is believed to be 70% effective, then the failure count can be
reduced from 1 to 0.3 to reflect the effectiveness of the corrective action.
For example, consider the following data set.
In this case, a design change has been implemented for the failure mode that occurred at 168 hours and is assumed to
be 60% effective. In the background, Weibull++ converts this data set to:
If Rank Regression is used to estimate distribution parameters, the median ranks for the previous data set are
calculated as follows:
Given this information, the standard Rank Regression procedure is then followed to estimate parameters.
If Maximum Likelihood Estimation (MLE) is used to estimate distribution parameters, the grouped data likelihood
function is used with the number in group being a non-integer value.
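As a sketch of how such a likelihood can be written (assumed values; the 0.4/0.6 split mirrors a 60%-effective fix applied to one failure, as in the example above), each group simply multiplies its log-likelihood term by its possibly non-integer count:

```python
import numpy as np

# (time, count, state): count may be fractional after a corrective action
data = [(168.0, 0.4, 'F'), (168.0, 0.6, 'S'), (300.0, 1.0, 'F')]

def weibull_grouped_log_likelihood(beta, eta):
    ll = 0.0
    for t, n, state in data:
        if state == 'F':   # n failures contribute n * ln f(t)
            ll += n * (np.log(beta / eta) + (beta - 1) * np.log(t / eta)
                       - (t / eta) ** beta)
        else:              # n suspensions contribute n * ln R(t)
            ll += n * (-(t / eta) ** beta)
    return ll

print(weibull_grouped_log_likelihood(1.5, 500.0))
```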
Example
A component underwent a reliability test. 12 samples were run to failure. The following figure shows the failures
and the analysis in a Weibull++ standard folio.
The analysts believe that the planned design improvements will yield 50% effectiveness. To estimate the reliability
of the product based on the assumptions about the repair effectiveness, they enter the data in groups, counting a 0.5
failure for each group. The following figure shows the adjusted data set and the calculated parameters.
The following overlay plot of unreliability vs. time shows that by using fractional failures the estimated unreliability
of the component has decreased, while the B10 life has increased from 2,566 hours to 3,564 hours.
Chapter 6
Confidence Bounds
What Are Confidence Bounds?
One of the most confusing concepts to a novice reliability engineer is estimating the precision of an estimate. This is
an important concept in the field of reliability engineering, leading to the use of confidence intervals (or bounds). In
this section, we will try to briefly present the concept in relatively simple terms but based on solid common sense.
If you put the ten marbles back in the pool and repeat this example again, you might get six black marbles, changing
your estimate to 60% black marbles. Which of the two is correct? Both estimates are correct! As you repeat this
experiment over and over again, you might find that the estimate usually falls between a lower and an upper limit, and you
can assign a percentage to the number of times your estimate falls between these limits. For example, you might notice that
90% of the time the estimate falls between those two limits.
Taking a Larger Sample of Marbles
If you now repeat the experiment and pick out 1,000 marbles, you might get results for the number of black marbles
such as 545, 570, 530, etc., for each trial. The range of the estimates in this case will be much narrower than before.
For example, you might observe that 90% of the time the number of black marbles falls within a band far tighter than
the one obtained with the ten-marble samples, thus giving you a narrower estimate interval. The same principle is true for
confidence intervals; the larger the sample size, the narrower the confidence intervals.
Back to Reliability
We will now look at how this phenomenon relates to reliability. Overall, the reliability engineer's task is to determine
the probability of failure, or reliability, of the population of units in question. However, one will never know the
exact reliability value of the population unless one is able to obtain and analyze the failure data for every single unit
in the population. Since this usually is not a realistic situation, the task then is to estimate the reliability based on a
sample, much like estimating the number of black marbles in the pool. If we perform ten different reliability tests for
our units, and analyze the results, we will obtain slightly different parameters for the distribution each time, and thus
slightly different reliability results. However, by employing confidence bounds, we obtain a range within which
these reliability values are likely to occur a certain percentage of the time. This helps us gauge the utility of the data
and the accuracy of the resulting estimates. Plus, it is always useful to remember that each parameter is an estimate
of the true parameter, one that is unknown to us. This range of plausible values is called a confidence interval.
When we use two-sided confidence bounds (or intervals), we are looking at a closed interval where a certain
percentage of the population is likely to lie. That is, we determine the values, or bounds, between which lies a
specified percentage of the population. For example, when dealing with 90% two-sided confidence bounds of
$(X, Y)$, we are saying that 90% of the population lies between $X$ and $Y$, with 5% less than $X$ and 5% greater than
$Y$.
One-Sided Bounds
One-sided confidence bounds are essentially an open-ended version of two-sided bounds. A one-sided bound defines
the point where a certain percentage of the population is either higher or lower than the defined point. This means
that there are two types of one-sided bounds: upper and lower. An upper one-sided bound defines a point that a
certain percentage of the population is less than. Conversely, a lower one-sided bound defines a point that a specified
percentage of the population is greater than.
For example, if $X$ is a 95% upper one-sided bound, this would imply that 95% of the population is less than $X$. If
$X$ is a 95% lower one-sided bound, this would indicate that 95% of the population is greater than $X$. Care must be
taken to differentiate between one- and two-sided confidence bounds, as these bounds can take on identical values at
different percentage levels. For example, in the figures above, we see bounds on a hypothetical distribution.
Assuming that this is the same distribution in all of the figures, we see that $X_1$ marks the spot below which 5% of the
distribution's population lies. Similarly, $X_2$ represents the point above which 5% of the population lies. Therefore,
$X_1$ and $X_2$ represent the 90% two-sided bounds, since 90% of the population lies between the two points. However,
$X_1$ also represents the lower one-sided 95% confidence bound, since 95% of the population lies above that point; and $X_2$
represents the upper one-sided 95% confidence bound, since 95% of the population is below $X_2$. It is important to be
sure of the type of bounds you are dealing with, particularly as both one-sided bounds can be displayed
simultaneously in Weibull++. In Weibull++, we use upper to represent the higher limit and lower to represent the
lower limit, regardless of their position, but based on the value of the results. So if obtaining the confidence bounds
on the reliability, we would identify the lower value of reliability as the lower limit and the higher value of reliability
as the higher limit. If obtaining the confidence bounds on probability of failure we will again identify the lower
numeric value for the probability of failure as the lower limit and the higher value as the higher limit.
where $\hat{G}$ is some function of $\hat\theta$, such as the reliability function, and $G(\theta)$ is the population parameter, where
$\hat{G} \to G(\theta)$ as $N \to \infty$. The term $f(N)$ is a function of $N$, the sample size, and tends to zero, as fast as $\frac{1}{N}$, as $N \to \infty$.
For example, in the case of $\hat\theta = \bar{x}$ and $G(\hat\theta) = (\bar{x})^2$, then $E\big(G(\bar{x})\big) = \mu^2 + f(N)$ where
$f(N) = \frac{\sigma^2}{N}$. Thus as $N \to \infty$, $E\big(G(\bar{x})\big) \to \mu^2$, where $\mu$ and $\sigma$ are the mean and standard deviation,
respectively. Using the same one-parameter distribution, the variance of the function $G(\hat\theta)$ can then be estimated
by:

$$Var\big(G(\hat\theta)\big) \approx \left(\frac{\partial G}{\partial \theta}\right)^2 Var(\hat\theta)$$
Two-Parameter Case
Consider a Weibull distribution with two parameters $\beta$ and $\eta$. For a given value of $t$,
$R(t) = G(\beta, \eta) = e^{-(t/\eta)^{\beta}}$. Repeating the previous method for the case of a two-parameter distribution, it is
generally true that for a function $G$, which is a function of two parameter estimators, say $G(\hat\theta_1, \hat\theta_2)$, that:

$$E\big(G(\hat\theta_1, \hat\theta_2)\big) = G(\theta_1, \theta_2) + O\!\left(\frac{1}{N}\right)$$

and:

$$Var\big(G(\hat\theta_1, \hat\theta_2)\big) \approx \left(\frac{\partial G}{\partial \theta_1}\right)^2 Var(\hat\theta_1) + \left(\frac{\partial G}{\partial \theta_2}\right)^2 Var(\hat\theta_2) + 2\,\frac{\partial G}{\partial \theta_1}\frac{\partial G}{\partial \theta_2}\,Cov(\hat\theta_1, \hat\theta_2)$$

Note that the derivatives in the above equation are evaluated at $\hat\theta_1$ and $\hat\theta_2$, where $E(\hat\theta_1) \simeq \theta_1$ and $E(\hat\theta_2) \simeq \theta_2$.
The log-likelihood function for all censoring schemes is:

$$\Lambda = \sum_{i=1}^{R} \ln f(T_i; \theta_1, \theta_2) + \sum_{j=1}^{M} \ln\big[1 - F(S_j; \theta_1, \theta_2)\big] + \sum_{l=1}^{P} \ln\big[F(I_{l_U}; \theta_1, \theta_2) - F(I_{l_L}; \theta_1, \theta_2)\big]$$

In the equation above, the first summation is for complete data, the second summation is for right censored data and
the third summation is for interval or left censored data.
Then the Fisher information matrix is given by:

$$F = \begin{bmatrix} -E\!\left[\dfrac{\partial^2 \Lambda}{\partial \theta_1^2}\right] & -E\!\left[\dfrac{\partial^2 \Lambda}{\partial \theta_1 \partial \theta_2}\right] \\[2ex] -E\!\left[\dfrac{\partial^2 \Lambda}{\partial \theta_2 \partial \theta_1}\right] & -E\!\left[\dfrac{\partial^2 \Lambda}{\partial \theta_2^2}\right] \end{bmatrix}$$

The subscript 0 indicates that the quantity is evaluated at $\theta_1 = \theta_{1_0}$ and $\theta_2 = \theta_{2_0}$, the true values of the parameters.
So for a sample of $N$ units where $R$ units have failed, $M$ have been suspended, and $P$ have failed within a time
interval, one could obtain the sample local information matrix by:

$$F \simeq \begin{bmatrix} -\dfrac{\partial^2 \Lambda}{\partial \theta_1^2} & -\dfrac{\partial^2 \Lambda}{\partial \theta_1 \partial \theta_2} \\[2ex] -\dfrac{\partial^2 \Lambda}{\partial \theta_2 \partial \theta_1} & -\dfrac{\partial^2 \Lambda}{\partial \theta_2^2} \end{bmatrix}$$

Substituting the values of the estimated parameters, in this case $\hat\theta_1$ and $\hat\theta_2$, and then inverting the matrix, one can
then obtain the local estimate of the covariance matrix, or:

$$\begin{bmatrix} \widehat{Var}(\hat\theta_1) & \widehat{Cov}(\hat\theta_1, \hat\theta_2) \\ \widehat{Cov}(\hat\theta_1, \hat\theta_2) & \widehat{Var}(\hat\theta_2) \end{bmatrix} = F^{-1}$$
Then the variance of a function, $Var(\hat{G})$, can be estimated using the equation for the variance. Values for the
variance and covariance of the parameters are obtained from the Fisher matrix equation. Once they have been obtained,
the approximate confidence bounds on the function are given as:

$$CB = \hat{G} \pm K_\alpha\sqrt{Var(\hat{G})}$$

which is the estimated value plus or minus a certain number of standard deviations. We address finding $K_\alpha$ next.
for large $N$. We now place confidence bounds on $\hat\theta$, at some confidence level $\delta$, bounded by the two end points
$C_1$ and $C_2$, where:

$$P(C_1 < \hat\theta < C_2) = \delta$$

Now by simplifying the equation for the confidence level, one can obtain the approximate two-sided confidence
bounds on the parameter $\theta$ at a confidence level $\delta$, or:

$$\left(\hat\theta - K_\alpha\sqrt{Var(\hat\theta)}\;,\;\; \hat\theta + K_\alpha\sqrt{Var(\hat\theta)}\right)$$

where $K_\alpha$ is the $(1 - \alpha)$ quantile of the standard normal distribution and $\alpha = \frac{1 - \delta}{2}$. If $\theta$ must be positive, then $\ln\hat\theta$ is treated as normally distributed. The two-sided approximate confidence bounds on
the parameter $\theta$, at confidence level $\delta$, then become:

$$\theta_U = \hat\theta \cdot e^{\frac{K_\alpha\sqrt{Var(\hat\theta)}}{\hat\theta}} \quad \text{(upper bound)}, \qquad \theta_L = \frac{\hat\theta}{e^{\frac{K_\alpha\sqrt{Var(\hat\theta)}}{\hat\theta}}} \quad \text{(lower bound)}$$

The one-sided approximate confidence bounds on the parameter $\theta$, at confidence level $\delta$, can be found from the same expressions by setting $\alpha = 1 - \delta$.
The same procedure can be extended for the case of a two or more parameter distribution. Lloyd and Lipow [24]
further elaborate on this procedure.
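As a small numeric sketch of the bounds just described (assumed estimate and variance, not values from the text), for a positive parameter the logarithm is treated as normal:

```python
import math
from scipy.stats import norm

theta_hat, var_theta = 100.0, 144.0    # assumed MLE and Fisher-matrix variance
delta = 0.90                           # two-sided confidence level
k_alpha = norm.ppf(1 - (1 - delta) / 2)

w = math.exp(k_alpha * math.sqrt(var_theta) / theta_hat)
theta_lower, theta_upper = theta_hat / w, theta_hat * w
print(theta_lower, theta_upper)
```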
Bounds on time (Type 1) return the confidence bounds around this time value by determining the confidence
intervals around $\hat{T}$ and substituting these values into the above equation. The bounds on $\hat{T}$ are determined using the
method for the bounds on parameters, with its variance obtained from the Fisher matrix. Note that the procedure is
slightly more complicated for distributions with more than one parameter.
Reliability bounds (Type 2) return the confidence bounds by determining the confidence intervals around $\hat{R}$ and
substituting these values into the above equation. The bounds on $\hat{R}$ are determined using the method for the bounds
on parameters, with its variance obtained from the Fisher matrix. Once again, the procedure is more complicated for
distributions with more than one parameter.
The same methodology can then be repeated by changing $P$ from 0.50 (50%) to our desired confidence level. For
$P = 90\%$ one would formulate the equation accordingly.
Keep in mind that one must be careful to select the appropriate values for $P$ based on the type of confidence bounds
desired. For example, if two-sided 80% confidence bounds are to be calculated, one must solve the equation twice
(once with $P = 10\%$ and once with $P = 90\%$) in order to place the bounds around 80% of the population.
Using this methodology, the appropriate ranks are obtained and plotted based on the desired confidence level. These
points are then joined by a smooth curve to obtain the corresponding confidence bound.
In Weibull++, this non-parametric methodology is used only when plotting bounds on the mixed Weibull
distribution. Full details on this methodology can be found in Kececioglu [20]. These binomial equations can again
be transformed using the beta and F distributions, thus the name beta binomial confidence bounds.
The likelihood ratio bounds are based on the likelihood ratio equation:

$$-2\ln\left(\frac{L(\theta)}{L(\hat\theta)}\right) \le \chi^2_{\alpha;k}$$

where $L(\theta)$ is the likelihood function for the unknown parameter vector $\theta$, $L(\hat\theta)$ is the likelihood function calculated at the estimated vector $\hat\theta$, and $\chi^2_{\alpha;k}$ is the chi-squared statistic with probability $\alpha$ and $k$ degrees of freedom, where $k$ is the number of quantities jointly estimated. Consider a distribution where $\theta_1, \theta_2, \ldots, \theta_k$ are unknown constant parameters that need to be estimated; one can conduct an experiment
and obtain $R$ independent observations, $t_1, t_2, \ldots, t_R$, which correspond in the case of life data analysis to failure
times. The likelihood function is given by:

$$L(t_1, \ldots, t_R \mid \theta_1, \ldots, \theta_k) = \prod_{i=1}^{R} f(t_i; \theta_1, \ldots, \theta_k)$$
The maximum likelihood estimators (MLE) of $\theta_1, \ldots, \theta_k$ are obtained by maximizing $L$. These are
represented by the $L(\hat\theta)$ term in the denominator of the ratio in the likelihood ratio equation. Since the values of the
data points are known, and the values of the parameter estimates have been calculated using MLE methods, the
only unknown term in the likelihood ratio equation is the $L(\theta)$ term in the numerator of the ratio. It remains to find
the values of the unknown parameter vector $\theta$ that satisfy the likelihood ratio equation. For distributions that have
two parameters, the values of these two parameters can be varied in order to satisfy the likelihood ratio equation. The
values of the parameters that satisfy this equation will change based on the desired confidence level, but at a given
confidence level there is only a certain region of values for $\theta_1$ and $\theta_2$ for which the likelihood ratio equation holds true.
This region can be represented graphically as a contour plot, an example of which is given in the following graphic.
The region of the contour plot essentially represents a cross-section of the likelihood function surface that satisfies
the conditions of the likelihood ratio equation.
Note on Contour Plots in Weibull++ and ALTA
Contour plots can be used for comparing data sets. Consider two data sets, one for an old product design and another
for a new design. The engineer would like to determine if the two designs are significantly different and at what
confidence. By plotting the contour plots of each data set in an overlay plot (the same distribution must be fitted to
each data set), one can determine the confidence at which the two sets are significantly different. If, for example,
there is no overlap (i.e., the two plots do not intersect) between the two 90% contours, then the two data sets are
significantly different with a 90% confidence. If the two 95% contours overlap, then the two designs are NOT
significantly different at the 95% confidence level. An example of non-intersecting contours is shown next. For
details on comparing data sets, see Comparing Life Data Sets.
The task now is to find the values of the parameters $\theta_1$ and $\theta_2$ so that the equality in the likelihood ratio equation
shown above is satisfied. Unfortunately, there is no closed-form solution; therefore, these values must be arrived at
numerically. One way to do this is to hold one parameter constant and iterate on the other until an acceptable
solution is reached. This can prove to be rather tricky, since there will be two solutions for one parameter if the other
is held constant. In situations such as these, it is best to begin the iterative calculations with values close to those of
the MLE values, so as to ensure that one is not attempting to perform calculations outside of the region of the
contour plot where no solution exists.
Example 1: Likelihood Ratio Bounds on Parameters
Five units were put on a reliability test and experienced failures at 10, 20, 30, 40 and 50 hours. Assuming a Weibull
distribution, the MLE parameter estimates are calculated to be $\hat\beta = 2.2938$ and $\hat\eta = 33.9428$. Calculate the 90%
two-sided confidence bounds on these parameters using the likelihood ratio method.
Solution
The first step is to calculate the likelihood function for the parameter estimates:

$$L(\hat\beta, \hat\eta) = \prod_{i=1}^{5} f(x_i; \hat\beta, \hat\eta) = \prod_{i=1}^{5} \frac{\hat\beta}{\hat\eta}\left(\frac{x_i}{\hat\eta}\right)^{\hat\beta - 1} e^{-\left(\frac{x_i}{\hat\eta}\right)^{\hat\beta}}$$

where $x_i$ are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:

$$L(\beta, \eta) - L(\hat\beta, \hat\eta)\, e^{\frac{-\chi^2_{\alpha;1}}{2}} = 0$$

Since our specified confidence level, $\delta$, is 90%, we can calculate the value of the chi-squared statistic,
$\chi^2_{0.90;1} = 2.705543$. We then substitute this information into the equation:

$$L(\beta, \eta) - L(\hat\beta, \hat\eta)\, e^{\frac{-2.705543}{2}} = 0$$
The next step is to find the set of values of $\beta$ and $\eta$ that satisfy this equation. The solution is an iterative process
that requires setting the value of $\beta$ and finding the appropriate values of $\eta$, and
vice versa. The following table gives values of $\eta$ based on given values of $\beta$.
(Note that this plot is generated with degrees of freedom $k = 1$, as we are only determining bounds on one
parameter. The contour plots generated in Weibull++ are done with degrees of freedom $k = 2$, for use in comparing
both parameters simultaneously.) As can be determined from the table, the lowest calculated value for $\beta$ is 1.142,
while the highest is 3.950. These represent the two-sided 90% confidence limits on this parameter. Since solutions
for the equation do not exist for values of $\eta$ below 23 or above 50, these can be considered the 90% confidence
limits for this parameter. In order to obtain more accurate values for the confidence limits on $\eta$, we can perform the
same procedure as before, but finding the two values of $\eta$ that correspond with a given value of $\beta$. Using this
method, we find that the 90% confidence limits on $\eta$ are 22.474 and 49.967, which are close to the initial estimates
of 23 and 50.
Note that the points where $\beta$ is maximized and minimized do not necessarily correspond with the points where $\eta$
is maximized and minimized. This is due to the fact that the contour plot is not symmetrical, so that the parameters
will have their extremes at different points.
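The iterative search described above can also be done with a root finder. The following sketch (not the text's table-based procedure) profiles the likelihood over $\beta$, maximizing over $\eta$ in closed form for complete data, and finds where the log-likelihood drops by $\chi^2_{0.90;1}/2$ from its maximum:

```python
import numpy as np
from scipy.optimize import brentq, minimize_scalar
from scipy.stats import chi2

t = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

def loglik(beta, eta):
    return np.sum(np.log(beta / eta) + (beta - 1) * np.log(t / eta)
                  - (t / eta) ** beta)

def profile(beta):
    # For complete data, the eta maximizing the likelihood at fixed beta
    eta = np.mean(t ** beta) ** (1.0 / beta)
    return loglik(beta, eta)

res = minimize_scalar(lambda b: -profile(b), bounds=(0.1, 10.0), method="bounded")
beta_hat, ll_max = res.x, -res.fun
target = ll_max - chi2.ppf(0.90, 1) / 2.0

beta_lower = brentq(lambda b: profile(b) - target, 0.1, beta_hat)
beta_upper = brentq(lambda b: profile(b) - target, beta_hat, 10.0)
print(beta_lower, beta_upper)   # approximately the 1.142 and 3.950 shown above
```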
This can then be substituted into the $\eta$ term in the likelihood ratio equation to form a likelihood equation in terms of
$t$ and $\beta$, or:

$$L(\beta, t) = \prod_{i=1}^{5} f\!\left(x_i;\, \beta,\, \frac{t}{\left(-\ln R\right)^{\frac{1}{\beta}}}\right)$$

where $x_i$ are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:

$$L(\beta, t) - L(\hat\beta, \hat\eta)\, e^{\frac{-\chi^2_{\alpha;1}}{2}} = 0$$

Since our specified confidence level, $\delta$, is 90%, we can calculate the value of the chi-squared statistic,
$\chi^2_{0.90;1} = 2.705543$. We can now substitute this information into the equation:

$$L(\beta, t) - L(\hat\beta, \hat\eta)\, e^{\frac{-2.705543}{2}} = 0$$
Note that the likelihood value for $L(\hat\beta, \hat\eta)$ is the same as it was for Example 1. This is because we are dealing with
the same data and parameter estimates or, in other words, the maximum value of the likelihood function did not
change. It now remains to find the values of $\beta$ and $t$ which satisfy this equation. This is an iterative process that
requires setting the value of $\beta$ and finding the appropriate values of $t$. The following table gives the values of $t$
based on given values of $\beta$.
As can be determined from the table, the lowest calculated value for $t$ is 17.389, while the highest is 41.714. These
represent the 90% two-sided confidence limits on the time at which reliability is equal to 50%.
where $x_i$ are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:

$$L(\beta, R) - L(\hat\beta, \hat\eta)\, e^{\frac{-\chi^2_{\alpha;1}}{2}} = 0$$

Since our specified confidence level, $\delta$, is 90%, we can calculate the value of the chi-squared statistic,
$\chi^2_{0.90;1} = 2.705543$. We can now substitute this information into the equation:

$$L(\beta, R) - L(\hat\beta, \hat\eta)\, e^{\frac{-2.705543}{2}} = 0$$

It now remains to find the values of $\beta$ and $R$ that satisfy this equation. This is an iterative process that requires
setting the value of $\beta$ and finding the appropriate values of $R$. The following table gives the values of $R$ based on
given values of $\beta$.
As can be determined from the table, the lowest calculated value for $R$ is 2.38%, while the highest is 44.26%. These
represent the 90% two-sided confidence limits on the reliability at the specified time.
where:

$$f(\theta \mid Data) = \frac{L(Data \mid \theta)\,\varphi(\theta)}{\int_{\varsigma} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}$$

In other words, we now have the distribution of $\theta$, and we can now make statistical inferences on this parameter,
such as calculating probabilities. Specifically, the probability that $\theta$ is less than or equal to a value $x$, $P(\theta \le x)$,
can be obtained by integrating the posterior probability density function (pdf), or:

$$P(\theta \le x) = \int_{-\infty}^{x} f(\theta \mid Data)\,d\theta$$

The above equation is the posterior cdf, which essentially calculates a confidence bound on the parameter, where
$P(\theta \le x)$ is the confidence level and $x$ is the confidence bound. Substituting the posterior pdf into the above
posterior cdf yields:

$$CL = \frac{\int_{-\infty}^{x} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}{\int_{\varsigma} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}$$
The only question at this point is, what do we use as a prior distribution of ? For the confidence bounds
calculation application, non-informative prior distributions are utilized. Non-informative prior distributions are
distributions that have no population basis and play a minimal role in the posterior distribution. The idea behind the
use of non-informative prior distributions is to make inferences that are not affected by external information, or
when external information is not available. In the general case of calculating confidence bounds using Bayesian
methods, the method should be independent of external information and it should only rely on the current data.
Therefore, non-informative priors are used. Specifically, the uniform distribution is used as a prior distribution for
the different parameters of the selected fitted distribution. For example, if the Weibull distribution is fitted to the
data, the prior distributions for beta and eta are assumed to be uniform. The above equation can be generalized for
any distribution having a vector of parameters yielding the general equation for calculating Bayesian confidence
bounds:
$$CL = \frac{\int_{\xi} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}{\int_{\varsigma} L(Data \mid \theta)\,\varphi(\theta)\,d\theta}$$

where:
• $CL$ is the confidence level
• $\theta$ is the parameter vector
• $L(\bullet)$ is the likelihood function
• $\varphi(\theta)$ is the prior pdf of the parameter vector $\theta$
• $\varsigma$ is the range of $\theta$
• $\xi$ is the range in which $\theta$ changes from $\Psi(T, R)$ till $\theta$'s maximum value, or from $\theta$'s minimum value till $\Psi(T, R)$
• $\Psi(T, R)$ is a function such that if $T$ is given, then the bounds are calculated for $R$; if $R$ is given, then the bounds are calculated for $T$

If $T$ is given, then from the above equation and $\Psi$, and for a given $CL$, the bounds on $R$ are calculated. If $R$ is
given, then from the above equation and $\Psi$, and for a given $CL$, the bounds on $T$ are calculated.
For a given reliability, the Bayesian one-sided upper bound estimate for $T_U(R)$ is:

$$CL = P(T \le T_U) = \int_{0}^{T_U} f(T \mid Data, R)\,dT$$

where $f(T \mid Data, R)$ is the posterior distribution of time $T$. Applying Bayes's rule, and assuming that the priors of
the parameters are independent, yields a relationship that can be solved for $T_U$. Using the same method, the one-sided lower and two-sided bounds on time can be solved.
For a given time, the corresponding expressions for the bounds on reliability are obtained in the same manner; using the same method, the one-sided lower and two-sided bounds on reliability can be solved.
Create another SimuMatic folio and generate a second data set using the same settings, but this time, select the RRY
analysis method on the Analysis tab. The following plot shows the result.
The following plot shows the results using the MLE analysis method.
The results clearly demonstrate that the median RRX estimate provides the least deviation from the truth for this
sample size and data type. However, the MLE outputs are grouped more closely together, as evidenced by the
bounds.
This experiment can be repeated in SimuMatic using multiple censoring schemes (including Type I and Type II right
censoring as well as random censoring) with various distributions. Multiple experiments can be performed with this
utility to evaluate assumptions about the appropriate parameter estimation method to use for data sets.
Chapter 7
The Exponential Distribution
The 2-parameter exponential pdf is given by:

$$f(t) = \lambda e^{-\lambda(t - \gamma)}, \qquad f(t) \ge 0, \; \lambda > 0, \; t \ge \gamma$$

where $\gamma$ is the location parameter. Some of the characteristics of the 2-parameter exponential distribution are
discussed in Kececioglu [19]:
• The location parameter, $\gamma$, if positive, shifts the beginning of the distribution by a distance of $\gamma$ to the right of the
origin, signifying that the chance failures start to occur only after $\gamma$ hours of operation, and cannot occur before.
• The scale parameter is $\frac{1}{\lambda}$.
• The exponential pdf has no shape parameter, as it has only one shape.
• The distribution starts at $t = \gamma$ at the level of $f(t = \gamma) = \lambda$ and decreases thereafter exponentially and
monotonically as $t$ increases beyond $\gamma$, and is convex.
• As $t \to \infty$, $f(t) \to 0$.
The 1-parameter exponential pdf is obtained by setting $\gamma = 0$:

$$f(t) = \lambda e^{-\lambda t} = \frac{1}{m} e^{-\frac{t}{m}}, \qquad t \ge 0$$

where:
$\lambda$ = constant rate, in failures per unit of measurement (e.g., failures per hour, per cycle, etc.), and $\lambda = \frac{1}{m}$, where $m$ is the mean time between failures (or to failure).
• The distribution starts at $t = 0$ at the level of $f(t = 0) = \lambda$ and decreases thereafter exponentially and
monotonically as $t$ increases, and is convex.
• As $t \to \infty$, $f(t) \to 0$.
• The pdf can be thought of as a special case of the Weibull pdf with $\gamma = 0$ and $\beta = 1$.
Note that when $\gamma = 0$, the MTTF is the inverse of the exponential distribution's constant failure rate. This is only
true for the exponential distribution. Most other distributions do not have a constant failure rate. Consequently, the
inverse relationship between failure rate and MTTF does not hold for these other distributions.
The Median
The median, $\breve{T}$, is:

$$\breve{T} = \gamma + \frac{1}{\lambda}\ln 2 \approx \gamma + \frac{0.693}{\lambda}$$

The Mode
The mode, $\tilde{T}$, is:

$$\tilde{T} = \gamma$$
Recalling that the reliability function of a distribution is simply one minus the cdf, the reliability function of the
2-parameter exponential distribution is given by:

$$R(t) = 1 - Q(t) = e^{-\lambda(t - \gamma)}$$

The 2-parameter exponential conditional reliability is:

$$R(t \mid T) = \frac{R(T + t)}{R(T)} = \frac{e^{-\lambda(T + t - \gamma)}}{e^{-\lambda(T - \gamma)}} = e^{-\lambda t}$$

which says that the reliability for a mission of duration $t$, undertaken after the component or equipment has already
accumulated $T$ hours of operation from age zero, is only a function of the mission duration, and not a function of the
age at the beginning of the mission. This is referred to as the memoryless property.
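As a quick numerical illustration (with assumed values): if $\lambda = 0.001$ failures/hour, a 100-hour mission has reliability $R = e^{-0.001 \times 100} = e^{-0.1} \approx 90.5\%$, regardless of whether the unit is brand new or has already accumulated 1,000 failure-free hours; the accumulated age $T$ cancels out of the conditional reliability above.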
The exponential failure rate function is:

$$\lambda(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda(t - \gamma)}}{e^{-\lambda(t - \gamma)}} = \lambda = \text{constant}$$

Once again, note that the constant failure rate is a characteristic of the exponential distribution, and special cases of
other distributions only. Most other distributions have failure rates that are functions of time.
• The exponential pdf has no shape parameter, as it has only one shape.
• The exponential pdf is always convex and is stretched to the right as $\lambda$ decreases in value.
• The value of the pdf function is always equal to the value of $\lambda$ at $t = 0$ (or $t = \gamma$).
• The location parameter, $\gamma$, if positive, shifts the beginning of the distribution by a distance of $\gamma$ to the right of
the origin, signifying that the chance failures start to occur only after $\gamma$ hours of operation, and cannot occur
before this time.
• The scale parameter is $\frac{1}{\lambda}$.
• As $t \to \infty$, $f(t) \to 0$.
• The 1-parameter exponential reliability function starts at the value of 100% at $t = 0$, decreases thereafter
monotonically and is convex.
• The 2-parameter exponential reliability function remains at the value of 100% for $t$ up to $t = \gamma$, and
decreases thereafter monotonically and is convex.
• As $t \to \infty$, $R(t) \to 0$.
• The reliability for a mission duration of $t = m = \frac{1}{\lambda}$, or of one MTTF duration, is always equal to $e^{-1} = 0.3679$, or
36.79%. This means that the reliability for a mission which is as long as one MTTF is relatively low and is not
recommended because only 36.8% of the missions will be completed successfully. In other words, of the
equipment undertaking such a mission, only 36.8% will survive their mission.
Probability Plotting
Estimation of the parameters for the exponential distribution via probability plotting is very similar to the process
used when dealing with the Weibull distribution. Recall, however, that the appearance of the probability plotting
paper and the methods by which the parameters are estimated vary from distribution to distribution, so there will be
some noticeable differences. In fact, due to the nature of the exponential cdf, the exponential probability plot is the
only one with a negative slope. This is because the y-axis of the exponential probability plotting paper represents the
reliability, whereas the y-axis for most of the other life distributions represents the unreliability.
This is illustrated in the process of linearizing the cdf, which is necessary to construct the exponential probability
plotting paper. For the two-parameter exponential distribution, the cumulative distribution function is given by:

$$F(t) = 1 - e^{-\lambda(t - \gamma)}$$

Taking the natural logarithm of both sides of the above equation yields:

$$\ln\big[1 - F(t)\big] = -\lambda(t - \gamma)$$

or:

$$\ln\big[1 - F(t)\big] = \lambda\gamma - \lambda t$$

Now, let:

$$y = \ln\big[1 - F(t)\big], \qquad a = \lambda\gamma \qquad \text{and} \qquad b = -\lambda$$

which results in the linear equation of:

$$y = a + bt$$
Note that with the exponential probability plotting paper, the y-axis scale is logarithmic and the x-axis scale is linear.
This means that the zero value is present only on the x-axis. For $t = 0$, $R = 1$ and $F(t) = 0$. So if we were to
use $F(t)$ for the y-axis, we would have to plot the point $(0, 0)$. However, since the y-axis is logarithmic, there is no
place to plot this on the exponential paper. Also, the failure rate, $\lambda$, is the negative of the slope of the line, but there
is an easier way to determine the value of $\lambda$ from the probability plot, as will be illustrated in the following example.
Plotting Example
1-Parameter Exponential Probability Plot Example
6 units are put on a life test and tested to failure. The failure times are 7, 12, 19, 29, 41, and 67 hours. Estimate the
failure rate for a 1-parameter exponential distribution using the probability plotting method.
In order to plot the points for the probability plot, the appropriate reliability estimate values must be obtained. These
will be equivalent to $\hat{R} = 1 - MR$, since the y-axis represents the reliability and the $MR$ values represent
unreliability estimates.
Next, these points are plotted on an exponential probability plotting paper. A sample of this type of plotting paper is
shown next, with the sample points in place. Notice how these points describe a line with a negative slope.
Once the points are plotted, draw the best possible straight line through these points. The time value at which this
line intersects with a horizontal line drawn at the 36.8% reliability mark is the mean life, and the reciprocal of this is
the failure rate $\lambda$. This is because at $t = m = \frac{1}{\lambda}$:

$$R(t = m) = e^{-\lambda m} = e^{-1} = 0.368 = 36.8\%$$

The following plot shows that the best-fit line through the data points crosses the $R = 36.8\%$ line at the mean life
$\bar{T}$; and because $\frac{1}{\lambda} = \bar{T}$, the failure rate is estimated as $\hat\lambda = \frac{1}{\bar{T}}$ failures/hour.
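The graphical estimate can be cross-checked numerically. This sketch (generic code, not Weibull++) builds the same plot data, median ranks turned into reliability estimates, and fits the line $\ln R = -\lambda t$ through the origin by least squares:

```python
import numpy as np

t = np.array([7.0, 12.0, 19.0, 29.0, 41.0, 67.0])
n = len(t)
mr = (np.arange(1, n + 1) - 0.3) / (n + 0.4)   # Benard's median ranks
R = 1.0 - mr                                    # reliability estimates

# ln R = -lambda * t for the 1-parameter exponential; slope through origin
lam = -np.sum(t * np.log(R)) / np.sum(t ** 2)
print(lam, 1.0 / lam)                           # failure rate and mean life
```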
Rank Regression on Y
Performing a rank regression on Y requires that a straight line be fitted to the set of available data points such that
the sum of the squares of the vertical deviations from the points to the line is minimized. The least squares parameter
estimation method (regression analysis) was discussed in Parameter Estimation, and the following equations for rank
regression on Y (RRY) were derived:
$$\hat{a} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - \hat{b}\bar{x}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}$$

In this case, $y_i = \ln[1 - F(t_i)]$ and $x_i = t_i$, and the $F(t_i)$ is estimated from the median ranks. Once $\hat{a}$ and $\hat{b}$ are obtained, then $\hat\lambda$ and $\hat\gamma$ can easily be obtained
from the above equations:

$$\hat\lambda = -\hat{b}, \qquad \hat\gamma = -\frac{\hat{a}}{\hat{b}}$$

For the one-parameter exponential, the equations for estimating $a$ and $b$ become:

$$\hat{a} = 0, \qquad \hat{b} = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2}$$
RRY Example
2-Parameter Exponential RRY Example
14 units were being reliability tested and the following life test data were obtained. Assuming that the data follow a
2-parameter exponential distribution, estimate the parameters and determine the correlation coefficient, $\hat\rho$, using
rank regression on Y (RRY).
Data Point Index    Time-to-Failure (hr)
1                   5
2                   10
3                   15
4                   20
5                   25
6                   30
7                   35
8                   40
9                   50
10                  60
11                  70
12                  80
13                  90
14                  100
Solution
Construct the following table, as shown next.
The median rank values ($MR$) can be found in rank tables or they can be estimated using the Quick Statistical
Reference in Weibull++. Given the values in the table above, calculate $\hat{a}$ and $\hat{b}$ from the least squares
equations given above; then:

$$\hat\lambda = -\hat{b} \qquad \text{and} \qquad \hat\gamma = -\frac{\hat{a}}{\hat{b}}$$
The correlation coefficient can be estimated using the equation for the correlation coefficient given in the Parameter Estimation chapter.
This example can be repeated using Weibull++, choosing 2-parameter exponential and rank regression on Y (RRY),
as shown next.
The estimated parameters and the correlation coefficient using Weibull++ were found to be:
Please note that the user must deselect the Reset if location parameter > T1 on Exponential option on the
Calculations page of the Application Setup window.
The probability plot can be obtained simply by clicking the Plot icon.
Rank Regression on X
Similar to rank regression on Y, performing a rank regression on X requires that a straight line be fitted to a set of
data points such that the sum of the squares of the horizontal deviations from the points to the line is minimized.
Again the first task is to bring our exponential cdf function into a linear form. This step is exactly the same as in
regression on Y analysis. The deviation from the previous analysis begins on the least squares fit step, since in this
case we treat $x$ as the dependent variable and $y$ as the independent variable. The best-fitting straight line to the data,
for regression on X (see Parameter Estimation), is the straight line:

$$\hat{x} = \hat{a} + \hat{b}y$$

with:

$$\hat{a} = \frac{\sum_{i=1}^{N} x_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} y_i}{N} = \bar{x} - \hat{b}\bar{y}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}}$$

where $y_i = \ln[1 - F(t_i)]$ and $x_i = t_i$. The values of $F(t_i)$ are estimated from the median ranks. Once $\hat{a}$ and $\hat{b}$ are obtained, solve the fitted line for $y$, which
corresponds to:

$$y = -\frac{\hat{a}}{\hat{b}} + \frac{1}{\hat{b}}\,x$$

so that:

$$\hat\lambda = -\frac{1}{\hat{b}}, \qquad \hat\gamma = \hat{a}$$

For the one-parameter exponential case, the equations for estimating $a$ and $b$ become:

$$\hat{a} = 0, \qquad \hat{b} = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} y_i^2}$$
RRX Example
2-Parameter Exponential RRX Example
Using the same data set from the RRY example above and assuming a 2-parameter exponential distribution, estimate
the parameters and determine the correlation coefficient estimate, $\hat\rho$, using rank regression on X.
Solution
The table constructed for the RRY analysis applies to this example also. Using the values from this table and the
RRX equations above, $\hat{a}$ and $\hat{b}$ are calculated; therefore:

$$\hat\lambda = -\frac{1}{\hat{b}} \qquad \text{and} \qquad \hat\gamma = \hat{a}$$
Note that the solution for regression on Y is not necessarily the same as that for the regression on X. The only time
when the two regression methods yield identical results is when the data lie perfectly on a line. If this were the case,
the correlation coefficient would be $-1$. The negative value of the correlation coefficient is due to the fact that the
slope of the exponential probability plot is negative.
This example can be repeated using Weibull++, choosing two-parameter exponential and rank regression on X
(RRX) methods for analysis, as shown below. The estimated parameters and the correlation coefficient using
Weibull++ were found to be:
The probability plot can be obtained simply by clicking the Plot icon.
MLE Example
MLE for the Exponential Distribution
Using the same data set from the RRY and RRX examples above and assuming a 2-parameter exponential
distribution, estimate the parameters using the MLE method.
Solution
In this example, we have complete data only. The partial derivative of the log-likelihood function, $\Lambda$, is given by:

$$\frac{\partial \Lambda}{\partial \lambda} = \sum_{i=1}^{14}\left[\frac{1}{\lambda} - (t_i - \gamma)\right] = 0$$

Complete descriptions of the partial derivatives can be found in Appendix D. Recall that when using the MLE
method for the exponential distribution, the value of $\gamma$ is equal to that of the first failure time. The first failure
occurred at 5 hours, thus $\hat\gamma = 5$ hours. Substituting the values for $N$ and $\hat\gamma$ we get:

$$\frac{14}{\hat\lambda} = \sum_{i=1}^{14}(t_i - 5) = 560$$

or:

$$\hat\lambda = \frac{14}{560} = 0.025 \text{ failures/hour}$$
Using Weibull++:
Confidence Bounds
In this section, we present the methods used in the application to estimate the different types of confidence bounds
for exponentially distributed data. The complete derivations were presented in detail (for a general function) in the
chapter for Confidence Bounds. At this time we should point out that exact confidence bounds for the exponential
distribution have been derived, and exist in a closed form, utilizing the $\chi^2$ distribution. These are described in detail
in Kececioglu [20], and are covered in the test design chapter.
Weibull++ will use the approximate confidence bounds, provided from the Fisher information matrix or the
likelihood ratio, in order to stay consistent with all of the other available distributions in the application. The
confidence bounds for the exponential distribution are discussed in more detail in the test design chapter.
For the failure rate $\hat\lambda$, the upper ($\lambda_U$) and lower ($\lambda_L$) bounds are estimated by Nelson [30]:

$$\lambda_U = \hat\lambda \cdot e^{\frac{K_\alpha \sqrt{Var(\hat\lambda)}}{\hat\lambda}}, \qquad \lambda_L = \frac{\hat\lambda}{e^{\frac{K_\alpha \sqrt{Var(\hat\lambda)}}{\hat\lambda}}}$$

If $\delta$ is the confidence level, then $\alpha = \frac{1 - \delta}{2}$ for the two-sided bounds, and $\alpha = 1 - \delta$ for the one-sided bounds.
The variance of $\hat\lambda$ is estimated from the Fisher matrix, as follows:

$$Var(\hat\lambda) = \left(-\frac{\partial^2 \Lambda}{\partial \lambda^2}\right)^{-1}$$
Bounds on Reliability
The reliability of the two-parameter exponential distribution is:

$$\hat{R}(t; \hat\lambda) = e^{-\hat\lambda(t - \hat\gamma)}$$

The corresponding confidence bounds are estimated from:

$$R_L = e^{-\lambda_U(t - \hat\gamma)}, \qquad R_U = e^{-\lambda_L(t - \hat\gamma)}$$

These equations hold true for the 1-parameter exponential distribution, with $\hat\gamma = 0$.
Bounds on Time
The bounds around time for a given exponential percentile, or reliability value, are estimated by first solving the
reliability equation with respect to time, or reliable life:

$$\hat{t}_R = \hat\gamma - \frac{\ln R}{\hat\lambda}$$

The corresponding confidence bounds are estimated from:

$$t_L = \hat\gamma - \frac{\ln R}{\lambda_U}, \qquad t_U = \hat\gamma - \frac{\ln R}{\lambda_L}$$
Bounds on Parameters
For one-parameter distributions such as the exponential, the likelihood confidence bounds are calculated by finding
values for $\lambda$ that satisfy:

$$-2\ln\left(\frac{L(\lambda)}{L(\hat\lambda)}\right) = \chi^2_{\alpha;1}$$

For complete data, the likelihood function for the exponential distribution is given by:

$$L(\lambda) = \prod_{i=1}^{N} f(t_i; \lambda) = \prod_{i=1}^{N} \lambda\, e^{-\lambda t_i}$$
where the $t_i$ values represent the original time-to-failure data. For a given value of $\alpha$, values for $\lambda$ can be found
which represent the maximum and minimum values that satisfy the above likelihood ratio equation. These represent
the confidence bounds for the parameters at a confidence level $\delta$, where $\alpha = \delta$ for two-sided bounds and
$\alpha = 2\delta - 1$ for one-sided.
where $t_i$ are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:

$$L(\lambda) - L(\hat\lambda)\, e^{\frac{-\chi^2_{\alpha;1}}{2}} = 0$$

Since our specified confidence level, $\delta$, is 85%, we can calculate the value of the chi-squared statistic,
$\chi^2_{0.85;1} = 2.072251$. We can now substitute this information into the equation:

$$L(\lambda) - L(\hat\lambda)\, e^{\frac{-2.072251}{2}} = 0$$

It now remains to find the values of $\lambda$ which satisfy this equation. Since there is only one parameter, there are only
two values of $\lambda$ that will satisfy the equation. These values represent the 85% two-sided confidence limits of
the parameter estimate $\hat\lambda$.
Rearranging the exponential reliability equation $R = e^{-\lambda t}$ with respect to $\lambda$ gives $\lambda = \frac{-\ln R}{t}$. This
can now be substituted into the likelihood function to produce a likelihood equation in terms of $t$ and $R$:

$$L(t, R) = \prod_{i=1}^{N} \left(\frac{-\ln R}{t}\right) e^{\left(\frac{\ln R}{t}\right) t_i}$$

The unknown parameter depends on what type of bounds are being determined. If one is trying to determine the
bounds on time for a given reliability, then $R$ is a known constant and $t$ is the unknown parameter. Conversely, if one is trying to determine the bounds on
reliability for a given time, then $t$ is a known constant and $R$ is the unknown parameter. Either way, the likelihood
ratio function can be solved for the values of interest.
Bounds on Parameters
From Confidence Bounds, we know that the posterior distribution of $\lambda$ can be written as:

$$f(\lambda \mid Data) = \frac{L(Data \mid \lambda)\,\varphi(\lambda)}{\int_{\varsigma} L(Data \mid \lambda)\,\varphi(\lambda)\,d\lambda}$$

For bounds on time, the posterior cdf equation is solved w.r.t. $T_U$; the same method is applied for one-sided lower and two-sided bounds on
time. For bounds on reliability, the corresponding equation is solved w.r.t. $R_U$; the same method can be used to calculate one-sided lower and two-sided
bounds on reliability.
Grouped Data
20 units were reliability tested with the following results:
Number of Units in Group    Time-to-Failure (hr)
7                           100
5                           200
3                           300
2                           400
1                           500
2                           600
1. Assuming a 2-parameter exponential distribution, estimate the parameters by hand using the MLE analysis
method.
2. Repeat the above using Weibull++. (Enter the data as grouped data to duplicate the results.)
3. Show the Probability plot for the analysis results.
4. Show the Reliability vs. Time plot for the results.
5. Show the pdf plot for the results.
6. Show the Failure Rate vs. Time plot for the results.
7. Estimate the parameters using the rank regression on Y (RRY) analysis method (and using grouped ranks).
Solution
1. For the 2-parameter exponential distribution and for $\hat\gamma = 100$ hours (first failure), the partial of the log-likelihood
function, $\Lambda$, becomes:

$$\frac{\partial \Lambda}{\partial \lambda} = \sum_{i=1}^{6} N_i\left[\frac{1}{\lambda} - (t_i - 100)\right] = 0$$

which, upon substituting the data, yields:

$$\hat\lambda = \frac{20}{\sum_{i=1}^{6} N_i\,(t_i - 100)} = \frac{20}{3{,}100} \approx 0.0065 \text{ failures/hour}$$
2. Enter the data in a Weibull++ standard folio and calculate it as shown next.
3. On the Plot page of the folio, the exponential Probability plot will appear as shown next.
Note that, as described at the beginning of this chapter, the failure rate for the exponential distribution is constant.
Also note that the Failure Rate vs. Time plot does show values for times before the location parameter, $\gamma$, at 100
hours.
7. In the case of grouped data, one must be cautious when estimating the parameters using a rank regression method.
This is because the median rank values are determined from the total number of failures observed by time $t_i$, where
$i$ indicates the group number. In this example, the total number of groups is $n = 6$ and the total number of units is
$N = 20$. Thus, the median rank values will be estimated for 20 units and for the total number of failed units up to
the $i^{th}$ group, for the $i^{th}$ rank value. The median rank values can be found from rank tables or they can be estimated
using ReliaSoft's Quick Statistical Reference tool.
For example, the median rank value of the fourth group will be the $17^{th}$ rank out of a sample size of twenty units (or
81.945%), since $7 + 5 + 3 + 2 = 17$ units have failed by the fourth group time of 400 hours.
The following table is then constructed. Given the values in the table, calculate $\hat{a}$ and $\hat{b}$ from the least
squares equations, and then:

$$\hat\lambda = -\hat{b} \qquad \text{and} \qquad \hat\gamma = -\frac{\hat{a}}{\hat{b}}$$

The small difference in the values from Weibull++ is due to rounding. In the application, the calculations and the
rank values are carried out at full precision.
Time (weeks)    Number of Patients    Treatment
1               2                     placebo
2               2                     placebo
3               1                     placebo
4               2                     placebo
5               2                     placebo
7               1                     6MP
8               4                     placebo
11              2                     placebo
12              2                     placebo
13              1                     6MP
15              1                     placebo
16              1                     6MP
17              1                     placebo
22              1                     placebo
22              1                     6MP
23              1                     placebo
23              1                     6MP
Create a new Weibull++ standard folio that's configured for grouped times-to-failure data with suspensions. In the
first column, enter the number of patients. Whenever there are uncompleted tests, enter the number of patients who
completed the test separately from the number of patients who did not (e.g., if 4 patients had symptoms return after 6
weeks and only 3 of them completed the test, then enter 1 in one row and 3 in another). In the second column enter F
if the patients completed the test and S if they didn't. In the third column enter the time, and in the fourth column
(Subset ID) specify whether the 6MP drug or a placebo was used.
Next, open the Batch Auto Run utility and select to separate the 6MP drug from the placebo, as shown next.
The software will create two data sheets, one for each subset ID, as shown next.
Calculate both data sheets using the 2-parameter exponential distribution and the MLE analysis method, then insert
an additional plot and select to show the analysis results for both data sheets on that plot, which will appear as shown
next.
Chapter 8
The Weibull Distribution
The 3-parameter Weibull pdf is given by:

$$f(t) = \frac{\beta}{\eta}\left(\frac{t - \gamma}{\eta}\right)^{\beta - 1} e^{-\left(\frac{t - \gamma}{\eta}\right)^{\beta}}$$

where:

$$f(t) \ge 0, \quad t \ge \gamma, \quad \beta > 0, \quad \eta > 0, \quad -\infty < \gamma < +\infty$$

and:
• $\eta$ = scale parameter, or characteristic life
• $\beta$ = shape parameter (or slope)
• $\gamma$ = location parameter (or failure free life)
The mean, $\bar{T}$, of the 3-parameter Weibull pdf is given by:

$$\bar{T} = \gamma + \eta \cdot \Gamma\left(\frac{1}{\beta} + 1\right)$$

where $\Gamma\left(\frac{1}{\beta} + 1\right)$ is the gamma function evaluated at the value of $\frac{1}{\beta} + 1$.
Note that some practitioners erroneously assume that $\eta$ is equal to the MTTF, $\bar{T}$. This is only true for the case of $\beta = 1$,
or:

$$\bar{T} = \gamma + \eta \cdot \Gamma(2) = \gamma + \eta$$
The Median
The median, $\breve{T}$, of the Weibull distribution is given by:

$$\breve{T} = \gamma + \eta\left(\ln 2\right)^{\frac{1}{\beta}}$$

The Mode
The mode, $\tilde{T}$, is given by:

$$\tilde{T} = \gamma + \eta\left(1 - \frac{1}{\beta}\right)^{\frac{1}{\beta}}$$
The Weibull conditional reliability function is given by:

$$R(t \mid T) = \frac{R(T + t)}{R(T)} = e^{-\left[\left(\frac{T + t - \gamma}{\eta}\right)^{\beta} - \left(\frac{T - \gamma}{\eta}\right)^{\beta}\right]}$$
This gives the reliability for a new mission of $t$ duration, having already accumulated $T$ time of operation up to the
start of this new mission, and the units are checked out to assure that they will start the next mission successfully. It
is called conditional because you can calculate the reliability of a new mission based on the fact that the unit or units
have already accumulated $T$ hours of operation successfully.
The reliable life, $T_R$, of a unit for a specified reliability, starting the mission at age zero, is given by:

$$T_R = \gamma + \eta\left[-\ln R(T_R)\right]^{\frac{1}{\beta}}$$

This is the life for which the unit/item will be functioning successfully with a reliability of $R(T_R)$. If $R(T_R) = 0.50$, then
$T_R = \breve{T}$, the median life, or the life by which half of the units will survive.
The Weibull failure rate function is given by:

$$\lambda(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\eta}\left(\frac{t - \gamma}{\eta}\right)^{\beta - 1}$$

where $\lambda(t)$ is the failure rate. The parameter $\beta$ is a pure number (i.e., it is dimensionless). The following figure
shows the effect of different values of the shape parameter, $\beta$, on the shape of the pdf. As you can see, the shape can
take on a variety of forms based on the value of $\beta$.
For $0 < \beta \le 1$:
• As $t \to 0$ (or $\gamma$), $f(t) \to \infty$.
• As $t \to \infty$, $f(t) \to 0$.
The above figure shows the effect of the value of $\beta$ on the cdf, as manifested in the Weibull probability plot. It is
easy to see why this parameter is sometimes referred to as the slope. Note that the models represented by the three
lines all have the same value of $\eta$. The following figure shows the effects of these varied values of $\beta$ on the
reliability plot, which is a linear analog of the probability plot.
As indicated by the above figure, populations with $\beta < 1$ exhibit a failure rate that decreases with time, populations
with $\beta = 1$ have a constant failure rate (consistent with the exponential distribution) and populations with $\beta > 1$
have a failure rate that increases with time. All three life stages of the bathtub curve can be modeled with the
Weibull distribution and varying values of $\beta$. The Weibull failure rate for $0 < \beta < 1$ is unbounded at $t = 0$ (or
$\gamma$). The failure rate, $\lambda(t)$, decreases thereafter monotonically and is convex, approaching the value of zero as
$t \to \infty$, or $\lambda(\infty) = 0$. This behavior makes it suitable for representing the failure rate of units exhibiting
early-type failures, for which the failure rate decreases with age. When encountering such behavior in a
manufactured product, it may be indicative of problems in the production process, inadequate burn-in, substandard
parts and components, or problems with packaging and shipping. For $\beta = 1$, $\lambda(t)$ yields a constant value of $\frac{1}{\eta}$, or:

$$\lambda(t) = \lambda = \frac{1}{\eta}$$
This makes it suitable for representing the failure rate of chance-type failures and the useful life period failure rate of
units.
For , increases as increases and becomes suitable for representing the failure rate of units exhibiting
wear-out type failures. For the curve is concave, consequently the failure rate increases at a
decreasing rate as increases.
For there emerges a straight line relationship between and , starting at a value of at ,
and increasing thereafter with a slope of . Consequently, the failure rate increases at a constant rate as increases.
Furthermore, if the slope becomes equal to 2, and when , becomes a straight line which passes
through the origin with a slope of 2. Note that at , the Weibull distribution equations reduce to that of the
Rayleigh distribution.
When the curve is convex, with its slope increasing as increases. Consequently, the failure rate
increases at an increasing rate as increases, indicating wearout life.
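The three regimes described above are easy to see numerically. The short sketch below (illustrative parameter values, not taken from any example in this chapter) evaluates the failure rate for a decreasing, constant and increasing case:

    # lambda(t) = (beta/eta) * ((t - gamma)/eta)**(beta - 1)
    import numpy as np

    def weibull_failure_rate(t, beta, eta=100.0, gamma=0.0):
        return (beta / eta) * ((t - gamma) / eta) ** (beta - 1.0)

    t = np.array([10.0, 50.0, 100.0, 200.0])
    for beta in (0.5, 1.0, 3.0):
        print(beta, weibull_failure_rate(t, beta))
    # beta = 0.5: rates decrease with t (early failures)
    # beta = 1.0: constant rate 1/eta = 0.01 (useful life)
    # beta = 3.0: rates increase with t (wear-out)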
A change in the scale parameter \(\eta\) has the same effect on the distribution as a change of the abscissa scale. Increasing the value of \(\eta\) while holding \(\beta\) constant has the effect of stretching out the pdf. Since the area under a pdf curve is a constant value of one, the "peak" of the pdf curve will also decrease with the increase of \(\eta\), as indicated in the above figure.
• If \(\eta\) is increased while \(\beta\) and \(\gamma\) are kept the same, the distribution gets stretched out to the right and its height decreases, while maintaining its shape and location.
• If \(\eta\) is decreased while \(\beta\) and \(\gamma\) are kept the same, the distribution gets pushed in towards the left (i.e., towards its beginning or towards 0 or \(\gamma\)), and its height increases.
• \(\eta\) has the same units as \(t\), such as hours, miles, cycles, actuations, etc.
Probability Plotting
One method of calculating the parameters of the Weibull distribution is by using probability plotting. To better
illustrate this procedure, consider the following example from Kececioglu [20].
Assume that six identical units are being reliability tested at the same application and operation stress levels. All of
these units fail during the test after operating the following number of hours: 93, 34, 16, 120, 53 and 75. Estimate the
values of the parameters for a 2-parameter Weibull distribution and determine the reliability of the units at a time of
15 hours.
Solution
The steps for determining the parameters of the Weibull representing the data, using probability plotting, are outlined
in the following instructions. First, rank the times-to-failure in ascending order as shown next.
Time-to-failure (hours)   Failure order number (out of sample size of 6)
16                        1
34                        2
53                        3
75                        4
93                        5
120                       6
Obtain their median rank plotting positions. Median rank positions are used instead of other ranking methods because median ranks are at a specific confidence level (50%). Median ranks can be found tabulated in many reliability books. They can also be estimated using Benard's approximation:

\[ MR \approx \frac{i - 0.3}{N + 0.4} \times 100 \]

where \(i\) is the failure order number and \(N\) is the total sample size. The exact median ranks are found in Weibull++ by solving the cumulative binomial equation:

\[ \sum_{k=i}^{N} \binom{N}{k} MR^k \left(1 - MR\right)^{N-k} = 0.5 \]

for \(MR\), where \(N\) is the sample size and \(i\) the order number. The times-to-failure, with their corresponding median ranks, are shown next.
Time-to-failure (hours)   Median rank (%)
16                        10.91
34                        26.44
53                        42.14
75                        57.86
93                        73.56
120                       89.10
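These ranks can be reproduced with a few lines of code. The sketch below (not ReliaSoft code) computes both Benard's approximation and the exact median ranks by solving the cumulative binomial equation numerically:

    from scipy.optimize import brentq
    from scipy.stats import binom

    N = 6
    for i in range(1, N + 1):
        approx = (i - 0.3) / (N + 0.4) * 100.0
        # binom.sf(i - 1, N, p) = P(X >= i); solve P(X >= i) = 0.5 for p
        exact = brentq(lambda p: binom.sf(i - 1, N, p) - 0.5,
                       1e-9, 1 - 1e-9) * 100.0
        print(i, round(approx, 2), round(exact, 2))
    # e.g., order 1 gives 10.94 (approximation) vs. 10.91 (exact), as in the table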
On Weibull probability paper, plot the times and their corresponding median ranks. A sample of Weibull probability paper is given in the following figure.
The points of the data in the example are shown in the figure below. Draw the best possible straight line through these points, as shown below, then obtain the slope of this line by drawing a line, parallel to the one just obtained, through the slope indicator. This value is the estimate of the shape parameter \(\hat{\beta}\), in this case \(\hat{\beta} = 1.4\).
At the \(Q(t) = 63.2\%\) ordinate point, draw a straight horizontal line until this line intersects the fitted straight line. Draw a vertical line through this intersection until it crosses the abscissa. The value at the intersection of the abscissa is the estimate of \(\hat{\eta}\). For this case, \(\hat{\eta} = 76\) hours. This is always at 63.2% since:

\[ Q(t = \eta) = 1 - e^{-\left(\frac{\eta}{\eta}\right)^{\beta}} = 1 - e^{-1} = 0.632 = 63.2\% \]
Now any reliability value for any mission time \(t\) can be obtained. For example, the reliability for a mission of 15 hours, or any other time, can now be obtained either from the plot or analytically. To obtain the value from the plot, draw a vertical line from the abscissa, at \(t = 15\) hours, to the fitted line. Draw a horizontal line from this intersection to the ordinate and read \(Q(t)\), in this case \(Q(t = 15) = 9.8\%\). Thus, \(R(t = 15) = 1 - Q(t) = 90.2\%\). This can also be obtained analytically from the Weibull reliability function, since the estimates of both of the parameters are known, or:

\[ R(t = 15) = e^{-\left(\frac{15}{\hat{\eta}}\right)^{\hat{\beta}}} = e^{-\left(\frac{15}{76}\right)^{1.4}} \approx 90.2\% \]
• Case 1: If the curve for MR versus \(t_j\) is concave down and the curve for MR versus \(\ln(t_j)\) is concave up, then there exists a \(\gamma\) such that \(0 < \gamma < t_1\), or \(\gamma\) has a positive value.
• Case 2: If the curves for MR versus \(t_j\) and MR versus \(\ln(t_j)\) are both concave up, then there exists a negative \(\gamma\) which will straighten out the curve of MR versus \(\ln(t_j)\).
• Case 3: If neither one of the previous two cases prevails, then either reject the Weibull distribution as one capable of representing the data, or proceed with the multiple population (mixed Weibull) analysis. To obtain the location parameter, \(\hat{\gamma}\):
• Subtract the same arbitrary value, \(\hat{\gamma}\), from all the times to failure and replot the data.
• If the initial curve is concave up, subtract a negative \(\hat{\gamma}\) from each failure time.
• If the initial curve is concave down, subtract a positive \(\hat{\gamma}\) from each failure time.
• Repeat until the data plot on an acceptable straight line.
• The value of \(\hat{\gamma}\) is the subtracted (positive or negative) value that places the points in an acceptable straight line.
The other two parameters are then obtained using the techniques previously described. Also, it is important to note that we used the term subtract a positive or negative gamma, where subtracting a negative gamma is equivalent to adding it. Note that when adjusting for gamma, the x-axis scale for the straight line becomes \((t - \gamma)\). A rough automated version of this search is sketched below.
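The sketch scores each candidate \(\gamma\) by how straight it makes the plot, using the correlation coefficient of the linearized points as the straightness measure (the failure times and search range below are hypothetical, for illustration only):

    import numpy as np

    times = np.array([35.0, 41.0, 52.0, 70.0, 95.0, 140.0])  # hypothetical data
    n = len(times)
    mr = (np.arange(1, n + 1) - 0.3) / (n + 0.4)             # Benard's median ranks
    y = np.log(-np.log(1.0 - mr))

    # keep the shift that maximizes the correlation of ln(t - gamma) with y
    best = max((abs(np.corrcoef(np.log(times - g), y)[0, 1]), g)
               for g in np.arange(0.0, 30.0, 0.5))
    print("best gamma ~", best[1], "with rho ~", round(best[0], 4))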
Rank Regression on Y
Performing rank regression on Y requires that a straight line be mathematically fitted to a set of data points such that the sum of the squares of the vertical deviations from the points to the line is minimized. This is in essence the same methodology as the probability plotting method, except that we use the principle of least squares to determine the line through the points, as opposed to just eyeballing it. The first step is to bring our function into a linear form. For the 2-parameter Weibull distribution, the cumulative distribution function (cdf) is:

\[ F(t) = 1 - e^{-\left(\frac{t}{\eta}\right)^{\beta}} \]

or, taking the logarithm of both sides twice:

\[ \ln\left(-\ln\left(1 - F(t)\right)\right) = \beta \ln t - \beta \ln \eta \]

Now let:

\[ y = \ln\left(-\ln\left(1 - F(t)\right)\right) \quad \text{and} \quad a = -\beta \ln \eta \]

and:

\[ b = \beta, \qquad x = \ln t \]

which results in the linear equation of:

\[ y = a + bx \]

The least squares parameter estimation method (also known as regression analysis) was discussed in Parameter Estimation, and the following equations for regression on Y were derived:

\[ \hat{a} = \bar{y} - \hat{b}\bar{x} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} x_i}{N} \]

and:

\[ \hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}} \]

so that the Weibull parameter estimates follow as:

\[ \hat{\beta} = \hat{b} \quad \text{and} \quad \hat{\eta} = e^{-\frac{\hat{a}}{\hat{b}}} \]

where \(\sigma_{xy}\) = covariance of \(x\) and \(y\), \(\sigma_x\) = standard deviation of \(x\), and \(\sigma_y\) = standard deviation of \(y\). The estimator of \(\rho\) is the sample correlation coefficient, \(\hat{\rho}\), given by:

\[ \hat{\rho} = \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}} \]
RRY Example
Consider the same data set from the probability plotting example given above (with six failures at 16, 34, 53, 75, 93
and 120 hours). Estimate the parameters and the correlation coefficient using rank regression on Y, assuming that the
data follow the 2-parameter Weibull distribution.
Solution
Construct a table as shown next.
Utilizing the values from the table, calculate \(\hat{a}\) and \(\hat{b}\) using the regression-on-Y equations given above. The Weibull parameter estimates then follow directly:

\[ \hat{\beta} = \hat{b} \approx 1.43 \]

and:

\[ \hat{\eta} = e^{-\frac{\hat{a}}{\hat{b}}} \approx 76.3 \text{ hours} \]

The correlation coefficient, computed from the same table, is approximately \(\hat{\rho} = 0.996\).
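The whole RRY computation fits in a short script. The sketch below uses Benard's approximation for the median ranks, so the last decimals differ slightly from results based on exact ranks:

    import numpy as np

    t = np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])
    n = len(t)
    mr = (np.arange(1, n + 1) - 0.3) / (n + 0.4)

    x = np.log(t)
    y = np.log(-np.log(1.0 - mr))

    b = (np.sum(x * y) - x.sum() * y.sum() / n) / (np.sum(x**2) - x.sum()**2 / n)
    a = y.mean() - b * x.mean()

    beta_hat = b                    # shape
    eta_hat = np.exp(-a / b)        # scale, from a = -beta * ln(eta)
    rho = np.corrcoef(x, y)[0, 1]   # correlation coefficient
    print(beta_hat, eta_hat, rho)   # ~1.43, ~76.3 hours, rho ~ 0.996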
This example can be repeated in the Weibull++ software. The following plot shows the Weibull probability plot for
the data set (with 90% two-sided confidence bounds).
If desired, the Weibull pdf representing the data set can be written as:

\[ f(t) = \frac{\hat{\beta}}{\hat{\eta}}\left(\frac{t}{\hat{\eta}}\right)^{\hat{\beta}-1} e^{-\left(\frac{t}{\hat{\eta}}\right)^{\hat{\beta}}} \]

or, with the estimated parameter values substituted:

\[ f(t) \approx \frac{1.43}{76.3}\left(\frac{t}{76.3}\right)^{0.43} e^{-\left(\frac{t}{76.3}\right)^{1.43}} \]
You can also plot this result in Weibull++, as shown next. From this point on, different results, reports and plots can
be obtained.
Rank Regression on X
Performing a rank regression on X is similar to the process for rank regression on Y, with the difference being that the horizontal deviations from the points to the line are minimized rather than the vertical. Again, the first task is to bring the reliability function into a linear form. This step is exactly the same as in the regression on Y analysis, and all the equations apply in this case too. The derivation from the previous analysis begins at the least squares fit part, where in this case we treat \(x\) as the dependent variable and \(y\) as the independent variable. The best-fitting straight line to the data, for regression on X (see Parameter Estimation), is the straight line:

\[ \hat{x} = \hat{a} + \hat{b}y \]

where:

\[ \hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}} \]

and:

\[ \hat{a} = \bar{x} - \hat{b}\bar{y} \]

and the \(F(t_i)\) values are again obtained from the median ranks.

Once \(\hat{a}\) and \(\hat{b}\) are obtained, solve the linear equation for \(y\), which corresponds to:

\[ y = -\frac{\hat{a}}{\hat{b}} + \frac{1}{\hat{b}}x \]

so that, by comparison with the regression-on-Y line:

\[ \hat{\beta} = \frac{1}{\hat{b}} \quad \text{and} \quad \hat{\eta} = e^{\hat{a}} \]
RRX Example
Again using the same data set from the probability plotting and RRY examples (with six failures at 16, 34, 53, 75, 93
and 120 hours), calculate the parameters using rank regression on X.
Solution
The same table constructed above for the RRY example can also be applied for RRX. Using the values from this table, calculate \(\hat{a}\) and \(\hat{b}\) using the regression-on-X equations given above. Therefore:

\[ \hat{\beta} = \frac{1}{\hat{b}} \approx 1.44 \]

and:

\[ \hat{\eta} = e^{\hat{a}} \approx 76.1 \text{ hours} \]
The results and the associated graph using Weibull++ are shown next. Note that the slight variation in the results is due to the number of significant figures used in the estimation of the median ranks. Weibull++ by default uses double precision accuracy when computing the median ranks.
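For comparison, the companion RRX computation simply swaps the roles of the variables; the lines below reuse the x, y and n arrays from the RRY sketch above:

    # regress x = ln(t) on y: x = ln(eta) + (1/beta) * y
    b_x = (np.sum(x * y) - x.sum() * y.sum() / n) / (np.sum(y**2) - y.sum()**2 / n)
    a_x = x.mean() - b_x * y.mean()

    beta_hat_rrx = 1.0 / b_x          # reciprocal slope
    eta_hat_rrx = np.exp(a_x)         # exponentiated intercept
    print(beta_hat_rrx, eta_hat_rrx)  # close to, but not identical to, the RRY values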
The results and the associated graph for the previous example using the 3-parameter Weibull case are shown next:
MLE Example
One last time, use the same data set from the probability plotting, RRY and RRX examples (with six failures at 16,
34, 53, 75, 93 and 120 hours) and calculate the parameters using MLE.
Solution
In this case, we have non-grouped data with no suspensions or intervals (i.e., complete data). The equations for the partial derivatives of the log-likelihood function are derived in an appendix and given next:

\[ \frac{\partial \Lambda}{\partial \beta} = \frac{N}{\beta} + \sum_{i=1}^{N}\ln\left(\frac{t_i}{\eta}\right) - \sum_{i=1}^{N}\left(\frac{t_i}{\eta}\right)^{\beta}\ln\left(\frac{t_i}{\eta}\right) = 0 \]

And:

\[ \frac{\partial \Lambda}{\partial \eta} = \frac{\beta}{\eta}\left[\sum_{i=1}^{N}\left(\frac{t_i}{\eta}\right)^{\beta} - N\right] = 0 \]
The results and the associated plot using Weibull++ (MLE) are shown next.
You can view the variance/covariance matrix directly by clicking the Analysis Summary table in the control panel.
Note that the decimal accuracy displayed and used is based on your individual Application Setup.
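For complete data such as this, the MLE solution can also be sketched with SciPy's built-in fitter (shape c corresponds to \(\beta\) and scale to \(\eta\); the location is fixed at zero for the 2-parameter form):

    from scipy.stats import weibull_min

    times = [16, 34, 53, 75, 93, 120]
    beta_hat, loc, eta_hat = weibull_min.fit(times, floc=0)
    print(beta_hat, eta_hat)   # roughly beta ~ 1.93, eta ~ 73.5 hours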
Unbiased MLE
It is well known that the MLE \(\hat{\beta}\) is biased. The bias will affect the accuracy of reliability predictions, especially when the number of failures is small. Weibull++ provides a simple way to correct the bias of the MLE \(\hat{\beta}\).
When there are no right-censored observations in the data, the following equation, provided by Hirose [39], is used to calculate the unbiased \(\hat{\beta}\).
For an example on how you might correct biased estimates, see also:
Unbiasing Parameters in Weibull++ [1]
Bounds on Parameters

Since \(\hat{\beta}\) and \(\hat{\eta}\) must be positive, \(\ln(\hat{\beta})\) and \(\ln(\hat{\eta})\) are treated as normally distributed, and the bounds are estimated from:

\[ \beta_U = \hat{\beta}\cdot e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\beta})}}{\hat{\beta}}}, \qquad \beta_L = \frac{\hat{\beta}}{e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\beta})}}{\hat{\beta}}}} \]

and:

\[ \eta_U = \hat{\eta}\cdot e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\eta})}}{\hat{\eta}}}, \qquad \eta_L = \frac{\hat{\eta}}{e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\eta})}}{\hat{\eta}}}} \]

If \(\delta\) is the confidence level, then \(\alpha = \frac{1-\delta}{2}\) for the two-sided bounds and \(\alpha = 1-\delta\) for the one-sided bounds. The variances and covariances of \(\hat{\beta}\) and \(\hat{\eta}\) are estimated from the inverse local Fisher matrix, as follows:

\[ \begin{pmatrix} \widehat{Var}(\hat{\beta}) & \widehat{Cov}(\hat{\beta},\hat{\eta}) \\ \widehat{Cov}(\hat{\beta},\hat{\eta}) & \widehat{Var}(\hat{\eta}) \end{pmatrix} = \begin{pmatrix} -\frac{\partial^2 \Lambda}{\partial \beta^2} & -\frac{\partial^2 \Lambda}{\partial \beta\,\partial \eta} \\ -\frac{\partial^2 \Lambda}{\partial \beta\,\partial \eta} & -\frac{\partial^2 \Lambda}{\partial \eta^2} \end{pmatrix}^{-1} \]
Bounds on Reliability

The bounds on reliability can easily be derived by first looking at the general extreme value distribution (EVD). Its reliability function is given by:

\[ R(T) = e^{-e^{\frac{T - p_1}{p_2}}} \]

By transforming \(T = \ln t\) and converting \(p_1 = \ln \eta\), \(p_2 = \frac{1}{\beta}\), the above equation becomes the Weibull reliability function:

\[ R(t) = e^{-e^{\beta\left(\ln t - \ln \eta\right)}} = e^{-\left(\frac{t}{\eta}\right)^{\beta}} \]

with:

\[ -\ln R(t) = e^{\beta\left(\ln t - \ln \eta\right)} \]

set:

\[ u = \beta\left(\ln t - \ln \eta\right) = \ln\left(-\ln R(t)\right) \]

The next step is to find the upper and lower bounds on \(u\). Using the equations derived in Confidence Bounds, the bounds on \(u\) are then estimated from Nelson [30]:

\[ u_U = \hat{u} + K_\alpha\sqrt{\widehat{Var}(\hat{u})}, \qquad u_L = \hat{u} - K_\alpha\sqrt{\widehat{Var}(\hat{u})} \]

where:

\[ \widehat{Var}(\hat{u}) = \left(\frac{\partial u}{\partial \beta}\right)^2 \widehat{Var}(\hat{\beta}) + \left(\frac{\partial u}{\partial \eta}\right)^2 \widehat{Var}(\hat{\eta}) + 2\,\frac{\partial u}{\partial \beta}\frac{\partial u}{\partial \eta}\,\widehat{Cov}(\hat{\beta},\hat{\eta}) \]

or, in terms of reliability:

\[ R_U = e^{-e^{u_L}}, \qquad R_L = e^{-e^{u_U}} \]

• For the 3-parameter case, substitute \(t' = t - \hat{\gamma}\) (and by definition \(\gamma < t\)) instead of \(t\). (Note that this is an approximation, since it eliminates the third parameter and assumes that \(Var(\hat{\gamma}) = 0\).)
• For the 1-parameter case, \(Var(\hat{\beta}) = 0\), thus:

\[ \widehat{Var}(\hat{u}) = \left(\frac{\partial u}{\partial \eta}\right)^2 \widehat{Var}(\hat{\eta}) \]
Also note that the time axis (x-axis) in the three-parameter Weibull plot in Weibull++ is not \(t\) but \(t - \hat{\gamma}\). This means that one must be cautious when obtaining confidence bounds from the plot. If one desires to estimate the confidence bounds on reliability for a given time \(t_0\) from the adjusted plotted line, then these bounds should be obtained for a \(t_0 - \hat{\gamma}\) entry on the time axis.
Bounds on Time
The bounds around the time estimate or reliable life estimate, for a given Weibull percentile (unreliability), are estimated by first solving the reliability equation with respect to time, as discussed in Lloyd and Lipow [24] and in Nelson [30]:

\[ \hat{u} = \ln \hat{t}_R = \ln \hat{\eta} + \frac{1}{\hat{\beta}}\ln\left(-\ln R\right) \]

where \(R = 1 - Q\) and \(Q\) is the unreliability at the given percentile.
The upper and lower bounds on \(u\) are estimated from:

\[ u_U = \hat{u} + K_\alpha\sqrt{\widehat{Var}(\hat{u})}, \qquad u_L = \hat{u} - K_\alpha\sqrt{\widehat{Var}(\hat{u})} \]

where:

\[ \widehat{Var}(\hat{u}) = \left(\frac{\partial u}{\partial \beta}\right)^2 \widehat{Var}(\hat{\beta}) + \left(\frac{\partial u}{\partial \eta}\right)^2 \widehat{Var}(\hat{\eta}) + 2\,\frac{\partial u}{\partial \beta}\frac{\partial u}{\partial \eta}\,\widehat{Cov}(\hat{\beta},\hat{\eta}) \]

The upper and lower bounds on time are then found by:

\[ t_U = e^{u_U}, \qquad t_L = e^{u_L} \]
For complete data, the likelihood function for the Weibull distribution is given by:

\[ L(\beta, \eta) = \prod_{i=1}^{N} \frac{\beta}{\eta}\left(\frac{t_i}{\eta}\right)^{\beta-1} e^{-\left(\frac{t_i}{\eta}\right)^{\beta}} \]

For a given value of \(\alpha\), values for \(\beta\) and \(\eta\) can be found which represent the maximum and minimum values that satisfy the above equation. These represent the confidence bounds for the parameters at a confidence level \(\delta\), where \(\alpha = \delta\) for two-sided bounds and \(\alpha = 2\delta - 1\) for one-sided.
Similarly, the bounds on time and reliability can be found by substituting the Weibull reliability equation into the likelihood function so that it is in terms of \(\beta\) and time or reliability, as discussed in Confidence Bounds. Since \(R = e^{-(t/\eta)^{\beta}}\) implies \(\eta = \frac{t}{(-\ln R)^{1/\beta}}\), the likelihood ratio equation used to solve for bounds on time (Type 1) treats \(t\) as the unknown with \(R\) fixed, and the likelihood ratio equation used to solve for bounds on reliability (Type 2) treats \(R\) as the unknown with \(t\) fixed.
Bounds on Parameters
Bayesian bounds use non-informative prior distributions for both parameters. From Confidence Bounds, we know that if the prior distributions of \(\eta\) and \(\beta\) are independent, the posterior joint distribution of \(\eta\) and \(\beta\) can be written as:
Bounds on Reliability
The above equation is solved numerically for \(R_U\). The same method can be used to calculate the one-sided lower bound and the two-sided bounds on reliability.
Bounds on Time
From Confidence Bounds, we know that:
The above equation is solved numerically for \(T_U\). The same method can be applied to calculate the one-sided lower bound and the two-sided bounds on time.
Bayesian-Weibull Analysis
The Bayesian methods presented next are for the 2-parameter Weibull distribution. Bayesian concepts were
introduced in Parameter Estimation. This model considers prior knowledge on the shape ( ) parameter of the
Weibull distribution when it is chosen to be fitted to a given set of data. There are many practical applications for
this model, particularly when dealing with small sample sizes and some prior knowledge for the shape parameter is
available. For example, when a test is performed, there is often a good understanding about the behavior of the
failure mode under investigation, primarily through historical data. At the same time, most reliability tests are
performed on a limited number of samples. Under these conditions, it would be very useful to use this prior
knowledge with the goal of making more accurate predictions. A common approach for such scenarios is to use the
1-parameter Weibull distribution, but this approach is too deterministic, too absolute you may say (and you would be
right). The Bayesian-Weibull model in Weibull++ (which is actually a true "WeiBayes" model, unlike the
1-parameter Weibull that is commonly referred to as such) offers an alternative to the 1-parameter Weibull, by
including the variation and uncertainty that might have been observed in the past on the shape parameter. Applying Bayes's rule on the 2-parameter Weibull distribution and assuming the prior distributions of \(\beta\) and \(\eta\) are independent, we obtain the following posterior pdf:
In this model, \(\eta\) is assumed to follow a noninformative prior distribution with the density function \(\varphi(\eta) = \frac{1}{\eta}\). This is called Jeffreys' prior, and is obtained by performing a logarithmic transformation on \(\eta\). Specifically, since \(\eta\) is always positive, we can assume that \(\ln(\eta)\) follows a uniform distribution, \(U(-\infty, +\infty)\). Applying Jeffreys' rule as given in Gelman et al. [9], which says that "in general, an approximate non-informative prior is taken proportional to the square root of Fisher's information," yields \(\varphi(\eta) = \frac{1}{\eta}\).
The prior distribution of \(\beta\), denoted as \(\varphi(\beta)\), can be selected from the following distributions: normal, lognormal, exponential and uniform. The procedure of performing a Bayesian-Weibull analysis is as follows:
• Collect the times-to-failure data.
• Specify a prior distribution for \(\beta\) (the prior for \(\eta\) is assumed to be \(\frac{1}{\eta}\)).
• Obtain the posterior pdf from the above equation.
In other words, a distribution (the posterior pdf) is obtained, rather than a point estimate as in classical statistics (i.e.,
as in the parameter estimation methods described previously in this chapter). Therefore, if a point estimate needs to
be reported, a point of the posterior pdf needs to be calculated. Typical points of the posterior distribution used are
the mean (expected value) or median. In Weibull++, both options are available and can be chosen from the Analysis
page, under the Results As area, as shown next.
The median points are obtained by solving the following equations for \(\breve{\beta}\) and \(\breve{\eta}\), respectively:
and:
Of course, other points of the posterior distribution can be calculated as well. For example, one may want to
calculate the 10th percentile of the joint posterior distribution (w.r.t. one of the parameters). The procedure for
obtaining other points of the posterior distribution is similar to the one for obtaining the median values, where
instead of 0.5 the percentage of interest is given. This procedure actually provides the confidence bounds on the parameters, which in the Bayesian framework are called "credible bounds." However, since the engineering interpretation is the same, and to avoid confusion, we refer to them as confidence bounds in this reference and in Weibull++.
where:
For the pdf of the times-to-failure, only the expected value is calculated and reported in Weibull++.
Reliability
In order to calculate the median value of the reliability function, we first need to obtain the posterior pdf of the reliability. Since \(R(T)\) is a function of \(\beta\), the density functions of \(\beta\) and \(R(T)\) have the following relationship:
The median value of the reliability is obtained by solving the following equation w.r.t. \(\breve{R}\):
where:
Failure Rate
The failure rate at time \(T\) is given by:
where:
The above equation can be solved for \(R_U(t)\). The Bayesian one-sided lower bound estimate for \(R(t)\) is given by:
The above equation can be solved for \(R_L(t)\). The Bayesian two-sided bounds estimate for \(R(t)\) is given by:
and:
Using the same method for one-sided bounds, \(R_U(t)\) and \(R_L(t)\) can be computed.
The above equation can be solved for \(T_U\). The Bayesian one-sided lower bound estimate for \(T(R)\) is given by:
or:
The above equation can be solved for \(T_L\). The Bayesian two-sided bounds estimate for \(T(R)\) is:
and:
Bayesian-Weibull Example
A manufacturer has tested prototypes of a modified product. The test was terminated at 2,000 hours, with only 2
failures observed from a sample size of 18. The following table shows the data.
Number in State   State (F or S)   State End Time (hours)
1                 F                1180
1                 F                1842
16                S                2000
Because of the lack of failure data in the prototype testing, the manufacturer decided to use information gathered
from prior tests on this product to increase the confidence in the results of the prototype testing. This decision was
made because failure analysis indicated that the failure mode of the two failures is the same as the one that was
observed in previous tests. In other words, it is expected that the shape of the distribution (beta) hasn't changed, but
hopefully the scale (eta) has, indicating longer life. The 2-parameter Weibull distribution was used to model all prior test results. The estimated beta (\(\beta\)) parameters of the prior test results are as follows:
Beta values estimated from the prior tests:
1.7
2.1
2.4
3.1
3.5
Solution
First, in order to fit the data to a Bayesian-Weibull model, a prior distribution for beta needs to be determined. Based on the beta values in the prior tests, the prior distribution for beta is found to be a lognormal distribution. (The values of its parameters can be obtained by entering the beta values into a Weibull++ standard folio and analyzing it using the lognormal distribution and the RRX analysis method.)
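A rough stand-in for this prior-fitting step is shown below. Note that the book fits the lognormal by rank regression (RRX) in Weibull++, so the simple log-moment estimates here will differ somewhat, especially for the spread:

    import numpy as np

    betas = np.array([1.7, 2.1, 2.4, 3.1, 3.5])
    log_b = np.log(betas)
    print("log-mean ~", log_b.mean())        # ~0.906
    print("log-std  ~", log_b.std(ddof=1))   # ~0.29 (the RRX estimate differs)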
Next, enter the data from the prototype testing into a standard folio. On the control panel, choose the
Bayesian-Weibull > B-W Lognormal Prior distribution. Click Calculate and enter the parameters of the
lognormal distribution, as shown next.
Click OK. The result is Beta (Median) = 2.361219 and Eta (Median) = 5321.631912 (by default Weibull++ returns
the median values of the posterior distribution). Suppose that the reliability at 3,000 hours is the metric of interest in
this example. Using the QCP, the reliability is calculated to be 76.97% at 3,000 hours. The following picture depicts
the posterior pdf plot of the reliability at 3,000, with the corresponding median value as well as the 10th percentile
value. The 10th percentile constitutes the 90% lower 1-sided bound on the reliability at 3,000 hours, which is
calculated to be 50.77%.
The pdf of the times-to-failure data can be plotted in Weibull++, as shown next:
Use the QSR to calculate the value of \(F_{0.50;10;12} = 0.9886\), as shown next:
Consequently:

\[ MR = \frac{1}{1 + \frac{N - i + 1}{i} F_{0.50;\,2(N-i+1);\,2i}} = \frac{1}{1 + \frac{5}{6}(0.9886)} \approx 0.5483 = 54.83\% \]

Another method is to use the Median Ranks option directly, which yields MR(%) = 54.8305%, as shown next:
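The F-distribution route can be cross-checked directly; the sketch below assumes the degrees of freedom deduced above (order number i = 6 out of N = 10):

    from scipy.stats import f

    i, N = 6, 10
    F = f.ppf(0.50, 2 * (N - i + 1), 2 * i)
    mr = 1.0 / (1.0 + (N - i + 1) / i * F)
    print(F, mr * 100.0)   # ~0.9886 and ~54.83%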
Note that the original data points, on the curved line, were adjusted by subtracting 30.92 hours to yield a straight line
as shown above.
Data point index   State (F or S)   State End Time
1                  F                2
2                  S                3
3                  F                5
4                  S                7
5                  F                11
6                  S                13
7                  S                17
8                  S                19
9                  F                23
10                 F                29
11                 S                31
12                 F                37
13                 S                41
14                 F                43
15                 S                47
16                 S                53
17                 F                59
18                 S                61
19                 S                67
Solution
In this example, we see that the number of failures is less than the number of suspensions. This is a very common
situation, since reliability tests are often terminated before all units fail due to financial or time constraints.
Furthermore, some suspensions will be recorded when a failure occurs that is not due to a legitimate failure mode,
such as operator error. In cases such as this, a suspension is recorded, since the unit under test cannot be said to have
had a legitimate failure.
Enter the data into a Weibull++ standard folio that is configured for times-to-failure data with suspensions. The folio
will appear as shown next:
We will use the 2-parameter Weibull to solve this problem. The parameters using maximum likelihood are:
Using RRX:
Using RRY:
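A censored MLE fit of this kind can be sketched in a few lines. This is not ReliaSoft code; it simply maximizes the right-censored Weibull log-likelihood for the data above, where failures contribute ln f(t) and suspensions contribute ln R(t):

    import numpy as np
    from scipy.optimize import minimize

    fail = np.array([2.0, 5.0, 11.0, 23.0, 29.0, 37.0, 43.0, 59.0])
    susp = np.array([3.0, 7.0, 13.0, 17.0, 19.0, 31.0,
                     41.0, 47.0, 53.0, 61.0, 67.0])

    def neg_log_like(params):
        beta, eta = np.exp(params)            # log space keeps both positive
        ll_f = np.sum(np.log(beta / eta) + (beta - 1) * np.log(fail / eta)
                      - (fail / eta) ** beta)
        ll_s = -np.sum((susp / eta) ** beta)  # ln R(t) for each suspension
        return -(ll_f + ll_s)

    res = minimize(neg_log_like, x0=np.log([1.0, 50.0]), method="Nelder-Mead")
    beta_hat, eta_hat = np.exp(res.x)
    print(beta_hat, eta_hat)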
Data point index   Last Inspection   State End Time
1                  30                32
2                  32                35
3                  35                37
4                  37                40
5                  42                42
6                  45                45
7                  50                50
8                  55                55
Analyze the data using several different parameter estimation techniques and compare the results.
Solution
Enter the data into a Weibull++ standard folio that is configured for interval data. The data is entered as follows:
The plot of the MLE solution with the two-sided 90% confidence bounds is:
Computed Results in Weibull++
Weibull++ computed parameters for rank regression on X are:
The small difference between the published results and the ones obtained from Weibull++ is due to the difference in the estimation method. In the publication the parameters were estimated using probability plotting (i.e., the fitted line was "eye-balled"). In Weibull++, the parameters were estimated using non-linear regression (a more accurate, mathematically fitted line). Note that γ in this example is negative. This means that the line, unadjusted for γ, is concave up, as shown next.
Using this first method, enter the plot (either the screen plot in Weibull++ or the printed plot) with T = 30 hours, go up vertically to the straight line fitted to the data, then go horizontally to the ordinate, and read off the result. A good estimate of the unreliability is 23%. (Also, the reliability estimate is 1.0 - 0.23 = 0.77 or 77%.)
The second method involves the use of the Quick Calculation Pad (QCP).
Select the Prob. of Failure calculation option and enter 30 hours in the Mission End Time field.
Note that the results in QCP vary according to the parameter estimation method used. The above results are obtained
using RRX.
2. The conditional reliability is given by:
or:
Again, the QCP can provide this result directly and more accurately than the plot.
3. To use the QCP to solve for the longest mission that this product should undertake for a reliability of 90%, choose
Reliable Life and enter 0.9 for the required reliability. The result is 15.9933 hours.
The small difference between the published results and the ones obtained from Weibull++ is due to the difference in
the median rank values between the two (in the publication, median ranks are obtained from tables to 3 decimal
places, whereas in Weibull++ they are calculated and carried out up to the 15th decimal point).
You will also notice that in the examples that follow, a small difference may exist between the published results and
the ones obtained from Weibull++. This can be attributed to the difference between the computer numerical
precision employed by Weibull++ and the lower number of significant digits used by the original authors. In most of
these publications, no information was given as to the numerical precision used.
Suspension Data MLE Example
From Wayne Nelson, Fan Example, Applied Life Data Analysis, page 317 [30].
70 diesel engine fans accumulated 344,440 hours in service and 12 of them failed. A table of their life data is shown
next (+ denotes non-failed units or suspensions, using Dr. Nelson's nomenclature). Evaluate the parameters with their
two-sided 95% confidence bounds, using MLE for the 2-parameter Weibull distribution.
Published Results:
Weibull parameters (2P-Weibull, MLE):
Note that Nelson expresses the results as multiples of 1,000 (e.g., \(\hat{\eta} = 26.297\) rather than 26,297 hours). The published results were adjusted by this factor to correlate with Weibull++ results.
Computed Results in Weibull++
This same data set can be entered into a Weibull++ standard folio, using 2-parameter Weibull and MLE to calculate
the parameter estimates.
You can also enter the data as given in the table, without grouping them, by opening a data sheet configured for suspension data. Then click the Group Data icon and choose Group exactly identical values.
The data will be automatically grouped and put into a new grouped data sheet.
Weibull++ computed parameters for maximum likelihood are:
The two-sided 95% bounds on the parameters can be determined from the QCP. Calculate and then click Report to
see the results.
Published results (using MLE):
Note that you must select the Use True 3-P MLE option in the Weibull++ Application Setup to replicate these results.
3-P Probability Plot Example
Suppose we want to model a left censored, right censored, interval, and complete data set, consisting of 274 units
under test of which 185 units fail. The following table contains the data.
Data point index   Number in State   Last Inspection   State (S or F)   State End Time
1                  2                 5                 F                5
2                  23                5                 S                5
3                  28                0                 F                7
4                  4                 10                F                10
5                  7                 15                F                15
6                  8                 20                F                20
7                  29                20                S                20
8                  32                0                 F                22
9                  6                 25                F                25
10                 4                 27                F                30
11                 8                 30                F                35
12                 5                 30                F                40
13                 9                 27                F                45
14                 7                 25                F                50
15                 5                 20                F                55
16                 3                 15                F                60
17                 6                 10                F                65
18                 3                 5                 F                70
19                 37                100               S                100
20                 48                0                 F                102
Solution
Since standard ranking methods for dealing with these different data types are inadequate, we will want to use the
ReliaSoft ranking method. This option is the default in Weibull++ when dealing with interval data. The filled-out
standard folio is shown next:
Using RRX:
Using RRY:
The plot with the two-sided 90% confidence bounds for the rank regression on X solution is:
References
[1] http://www.weibull.com/hotwire/issue109/relbasics109.htm
Chapter 9: The Normal Distribution
The normal pdf is given by:

\[ f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^2} \]

where:

\(\mu\) = mean of the normal times-to-failure, also noted as \(\bar{T}\)
\(\sigma\) = standard deviation of the times-to-failure

It is a 2-parameter distribution with parameters \(\mu\) (or \(\bar{T}\)) and \(\sigma\) (i.e., the mean and the standard deviation, respectively).
There is no closed-form solution for the normal reliability function. Solutions can be obtained via the use of standard
normal tables. Since the application automatically solves for the reliability, we will not discuss manual solution
methods. For interested readers, full explanations can be found in the references.
Once again, the use of standard normal tables for the calculation of the normal conditional reliability is necessary, as
there is no closed form solution.
• The normal pdf has a mean, \(\bar{T}\), which is equal to the median, \(\tilde{T}\), and also equal to the mode, \(\hat{T}\), or \(\bar{T} = \tilde{T} = \hat{T}\). This is because the normal distribution is symmetrical about its mean.
• The mean, \(\mu\), or the mean life or the MTTF, is also the location parameter of the normal pdf, as it locates the pdf along the abscissa. It can assume values of \(-\infty < \bar{T} < +\infty\).
• The normal pdf has no shape parameter. This means that the normal pdf has only one shape, the bell shape, and this shape does not change.
• The standard deviation, \(\sigma\), is also the distance between the mean and the point of inflection of the pdf, on each side of the mean. The point of inflection is that point of the pdf where the slope changes its value from a decreasing to an increasing one, or where the second derivative of the pdf has a value of zero.
• The normal pdf starts at \(t = -\infty\) with an \(f(t) = 0\). As \(t\) increases, \(f(t)\) also increases, goes through its point of inflection and reaches its maximum value at \(t = \bar{T}\). Thereafter, \(f(t)\) decreases, goes through its point of inflection, and assumes a value of \(f(t) = 0\) at \(t = +\infty\).
Weibull++ Notes on Negative Time Values
One of the disadvantages of using the normal distribution for reliability calculations is the fact that the normal
distribution starts at negative infinity. This can result in negative values for some of the results. Negative values for
time are not accepted in most of the components of Weibull++, nor are they implemented. Certain components of the
application reserve negative values for suspensions, or will not return negative results. For example, the Quick
Calculation Pad will return a null value (zero) if the result is negative. Only the Free-Form (Probit) data sheet can
accept negative values for the random variable (x-axis values).
Probability Plotting
As described before, probability plotting involves plotting the failure times and associated unreliability estimates on specially constructed probability plotting paper. The form of this paper is based on a linearization of the cdf of the specific distribution. For the normal distribution, the cumulative distribution function can be written as:

\[ F(t) = \Phi\left(\frac{t - \mu}{\sigma}\right) \]

or:

\[ \Phi^{-1}\left(F(t)\right) = -\frac{\mu}{\sigma} + \frac{1}{\sigma}t \]

where:

\[ \Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-\frac{x^2}{2}}\,dx \]

Now, let:

\[ y = \Phi^{-1}\left(F(t)\right) \quad \text{and} \quad a = -\frac{\mu}{\sigma} \]

and:

\[ b = \frac{1}{\sigma} \]

which results in the linear equation of:

\[ y = a + bt \]

The normal probability paper resulting from this linearized cdf function is shown next.
Since the normal distribution is symmetrical, the area under the pdf curve from \(-\infty\) to \(\mu\) is 0.5, as is the area from \(\mu\) to \(+\infty\). Consequently, the value of \(\mu\) is said to be the point where \(R(t) = Q(t) = 50\%\). This means that the estimate of \(\mu\) can be read from the point where the plotted line crosses the 50% unreliability line.
To determine the value of \(\sigma\) from the probability plot, it is first necessary to understand that the area under the pdf curve that lies between one standard deviation in either direction from the mean (or two standard deviations total) represents 68.3% of the area under the curve. This is represented graphically in the following figure.
That is: the value of \(\hat{\sigma}\) is obtained by subtracting the time value where the plotted line crosses the 15.85% unreliability line from the time value where the plotted line crosses the 84.15% unreliability line and dividing the result by two:

\[ \hat{\sigma} = \frac{t\left(Q = 84.15\%\right) - t\left(Q = 15.85\%\right)}{2} \]

This process is illustrated in the following example.
These points can now be plotted on a normal probability plotting paper as shown in the next figure.
Draw the best possible line through the plot points. The time values where the line intersects the 15.85%, 50%, and
84.15% unreliability values should be projected down to the abscissa, as shown in the following plot.
The estimate of \(\hat{\mu}\) is determined from the time value at the 50% unreliability level, which in this case is 100 hours. The value of the estimator of \(\hat{\sigma}\) is determined by reading the times at the 15.85% and 84.15% unreliability levels and dividing their difference by two, as described above.
Rank Regression on Y
Performing rank regression on Y requires that a straight line be fitted to a set of data points such that the sum of the
squares of the vertical deviations from the points to the line is minimized.
The least squares parameter estimation method (regression analysis) was discussed in Parameter Estimation, and the
following equations for regression on Y were derived:
and:
In the case of the normal distribution, the equations for \(y_i\) and \(x_i\) are:

\[ y_i = \Phi^{-1}\left(F(t_i)\right) \]

and:

\[ x_i = t_i \]

where the values for \(F(t_i)\) are estimated from the median ranks. Once \(\hat{a}\) and \(\hat{b}\) are obtained, \(\hat{\sigma}\) and \(\hat{\mu}\) can easily be obtained from the above equations:

\[ \hat{\sigma} = \frac{1}{\hat{b}}, \qquad \hat{\mu} = -\frac{\hat{a}}{\hat{b}} \]
The Correlation Coefficient
The estimator of the sample correlation coefficient, \(\hat{\rho}\), is given by:

\[ \hat{\rho} = \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}} \]
RRY Example
Normal Distribution RRY Example
14 units were reliability tested and the following life test data were obtained. Assuming the data follow a normal
distribution, estimate the parameters and determine the correlation coefficient, , using rank regression on Y.
Data point index   Time-to-failure (hours)
1                  5
2                  10
3                  15
4                  20
5                  25
6                  30
7                  35
8                  40
9                  50
10                 60
11                 70
12                 80
13                 90
14                 100
Solution
Construct a table like the one shown next.
• The median rank values (\(MR_i\)) can be found in rank tables, available in many statistical texts, or they can be estimated by using the Quick Statistical Reference in Weibull++.
• The \(y_i\) values were obtained from the standardized normal distribution's area tables by entering the median rank \(MR_i\) for the area and reading the corresponding \(z\) value. As with the median rank values, these standard normal values can be obtained with the Quick Statistical Reference.
Given the values in the table above, calculate \(\hat{a}\) and \(\hat{b}\) using the regression-on-Y equations given previously. Therefore:

\[ \hat{\sigma} = \frac{1}{\hat{b}} \]

and:

\[ \hat{\mu} = -\frac{\hat{a}}{\hat{b}} \]

which for this data set give \(\hat{\mu} = 45\) hours and \(\hat{\sigma} \approx 33.6\) hours. The correlation coefficient can be estimated using the sample correlation coefficient equation given above.
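The computation can be verified with a short script; it uses Benard's approximation for the median ranks, so results based on exact ranks will differ in the later decimals:

    import numpy as np
    from scipy.stats import norm

    t = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100], float)
    n = len(t)
    mr = (np.arange(1, n + 1) - 0.3) / (n + 0.4)

    y = norm.ppf(mr)                 # y_i = inverse standard normal of median rank
    b = np.sum((t - t.mean()) * y) / np.sum((t - t.mean()) ** 2)
    a = y.mean() - b * t.mean()

    sigma_hat = 1.0 / b
    mu_hat = -a / b
    rho = np.corrcoef(t, y)[0, 1]
    print(mu_hat, sigma_hat, rho)    # ~45 hours, ~33.6 hours, rho ~ 0.98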
Rank Regression on X
As was mentioned previously, performing a rank regression on X requires that a straight line be fitted to a set of data points such that the sum of the squares of the horizontal deviations from the points to the fitted line is minimized.
Again, the first task is to bring our function, the probability of failure function for the normal distribution, into a linear form. This step is exactly the same as in the regression on Y analysis. All other equations apply in this case as they did for the regression on Y. The deviation from the previous analysis begins at the least squares fit step, where in this case we treat \(x\) as the dependent variable and \(y\) as the independent variable. The best-fitting straight line for the data, for regression on X, is the straight line:

\[ \hat{x} = \hat{a} + \hat{b}y \]

where:

\[ \hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}} \]

and:

\[ \hat{a} = \bar{x} - \hat{b}\bar{y} \]

and the \(F(t_i)\) values are estimated from the median ranks. Once \(\hat{a}\) and \(\hat{b}\) are obtained, solve the above linear equation for the unknown value of \(t\), which corresponds to:

\[ \hat{\mu} = \hat{a} \quad \text{and} \quad \hat{\sigma} = \hat{b} \]
RRX Example
Normal Distribution RRX Example
Using the same data set from the RRY example given above, and assuming a normal distribution, estimate the
parameters and determine the correlation coefficient, , using rank regression on X.
Solution
The table constructed for the RRY analysis applies to this example also. Using the values from this table, calculate \(\hat{a}\) and \(\hat{b}\) using the regression-on-X equations given above. Therefore:

\[ \hat{\mu} = \hat{a} \]

and:

\[ \hat{\sigma} = \hat{b} \]
Note that the results for regression on X are not necessarily the same as the results for regression on Y. The only
time when the two regressions are the same (i.e., will yield the same equation for a line) is when the data lie perfectly
on a straight line.
The plot of the Weibull++ solution for this example is shown next.
Note that \(E(X)\) denotes the expected value of \(X\) and is defined (for continuous distributions) by:

\[ E(X) = \int_{-\infty}^{+\infty} x\, f(x)\, dx \]
It can be shown in Dudewicz and Mishra [7] and in Dietrich [5] that the MLE estimator for the mean of the normal (and lognormal) distribution does satisfy the unbiasedness criterion, or \(E(\hat{\mu}) = \mu\). The same is not true for the estimate of the variance, \(\hat{\sigma}^2\). The maximum likelihood estimate for the variance for the normal distribution is given by:

\[ \hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - \hat{\mu}\right)^2 \]

This estimate, however, has been shown to be biased. It can be shown in Dudewicz and Mishra [7] and in Dietrich [5] that the unbiased estimate of the variance for complete data is obtained by dividing by \(N - 1\) instead of \(N\):

\[ s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(t_i - \hat{\mu}\right)^2 \]
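The difference between the two estimates is just the divisor, as the quick check below illustrates for the 14-point data set used in this chapter's examples:

    import numpy as np

    t = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100], float)
    print(t.var(ddof=0))   # MLE variance, divides by N
    print(t.var(ddof=1))   # unbiased sample variance, divides by N - 1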
MLE Example
Normal Distribution MLE Example
Using the same data set from the RRY and RRX examples given above and assuming a normal distribution, estimate
the parameters using the MLE method.
Solution
In this example we have non-grouped data without suspensions and without interval data. The partial derivatives of the normal log-likelihood function, \(\Lambda\), are given by:

\[ \frac{\partial \Lambda}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{N}\left(t_i - \mu\right) = 0 \]

\[ \frac{\partial \Lambda}{\partial \sigma} = \sum_{i=1}^{N}\left[\frac{\left(t_i - \mu\right)^2}{\sigma^3} - \frac{1}{\sigma}\right] = 0 \]

(The derivations of these equations are presented in the appendix.) Substituting the values of \(t_i\) and solving the above system simultaneously, we get \(\hat{\mu} = 45\) hours and \(\hat{\sigma} \approx 29.58\) hours.
The Fisher matrix is:
The plot of the Weibull++ solution for this example is shown next.
Confidence Bounds
The method used by the application in estimating the different types of confidence bounds for normally distributed
data is presented in this section. The complete derivations were presented in detail (for a general function) in
Confidence Bounds.
The lower and upper bounds on the mean, \(\hat{\mu}\), are estimated from:

\[ \mu_U = \hat{\mu} + K_\alpha\sqrt{\widehat{Var}(\hat{\mu})}, \qquad \mu_L = \hat{\mu} - K_\alpha\sqrt{\widehat{Var}(\hat{\mu})} \]

Since the standard deviation, \(\hat{\sigma}\), must be positive, \(\ln(\hat{\sigma})\) is treated as normally distributed, and the bounds are estimated from:

\[ \sigma_U = \hat{\sigma}\cdot e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\sigma})}}{\hat{\sigma}}}, \qquad \sigma_L = \frac{\hat{\sigma}}{e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\sigma})}}{\hat{\sigma}}}} \]
If \(\delta\) is the confidence level, then \(\alpha = \frac{1-\delta}{2}\) for the two-sided bounds and \(\alpha = 1-\delta\) for the one-sided bounds.
The variances and covariances of \(\hat{\mu}\) and \(\hat{\sigma}\) are estimated from the Fisher matrix, as follows:

\[ \begin{pmatrix} \widehat{Var}(\hat{\mu}) & \widehat{Cov}(\hat{\mu},\hat{\sigma}) \\ \widehat{Cov}(\hat{\mu},\hat{\sigma}) & \widehat{Var}(\hat{\sigma}) \end{pmatrix} = \begin{pmatrix} -\frac{\partial^2 \Lambda}{\partial \mu^2} & -\frac{\partial^2 \Lambda}{\partial \mu\,\partial \sigma} \\ -\frac{\partial^2 \Lambda}{\partial \mu\,\partial \sigma} & -\frac{\partial^2 \Lambda}{\partial \sigma^2} \end{pmatrix}^{-1} \]

\(\Lambda\) is the log-likelihood function of the normal distribution, described in Parameter Estimation and Appendix D.
Bounds on Reliability
The reliability of the normal distribution is:

\[ \hat{R}(t) = 1 - \Phi\left(\frac{t - \hat{\mu}}{\hat{\sigma}}\right) \]

where \(\Phi\) is the standard normal cdf. Letting \(\hat{z} = \frac{t - \hat{\mu}}{\hat{\sigma}}\), the two-sided bounds on \(z\) are first estimated from:

\[ z_U = \hat{z} + K_\alpha\sqrt{\widehat{Var}(\hat{z})}, \qquad z_L = \hat{z} - K_\alpha\sqrt{\widehat{Var}(\hat{z})} \]

or, in terms of reliability:

\[ R_U = 1 - \Phi\left(z_L\right), \qquad R_L = 1 - \Phi\left(z_U\right) \]
Bounds on Time
The bounds around time for a given normal percentile (unreliability) are estimated by first solving the reliability equation with respect to time, as follows:

\[ \hat{t}\left(\hat{\mu},\hat{\sigma}\right) = \hat{\mu} + z\,\hat{\sigma} \]

where:

\[ z = \Phi^{-1}\left(1 - R\right) \]

and the bounds on time are then:

\[ t_U = \hat{t} + K_\alpha\sqrt{\widehat{Var}(\hat{t})}, \qquad t_L = \hat{t} - K_\alpha\sqrt{\widehat{Var}(\hat{t})} \]
Bounds on Parameters
As covered in Confidence Bounds, the likelihood confidence bounds are calculated by finding values for \(\mu\) and \(\sigma\) that satisfy:

\[ -2\ln\left(\frac{L(\mu,\sigma)}{L(\hat{\mu},\hat{\sigma})}\right) = \chi^2_{\alpha;1} \]

For complete data, the likelihood formula for the normal distribution is given by:

\[ L(\mu,\sigma) = \prod_{i=1}^{N}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{t_i - \mu}{\sigma}\right)^2} \]

where the \(t_i\) values represent the original time-to-failure data. For a given value of \(\alpha\), values for \(\mu\) and \(\sigma\) can be found which represent the maximum and minimum values that satisfy the above likelihood ratio equation. These represent the confidence bounds for the parameters at a confidence level \(\delta\), where \(\alpha = \delta\) for two-sided bounds and \(\alpha = 2\delta - 1\) for one-sided.
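Numerically, the procedure amounts to a root search along the likelihood surface. The sketch below (a generic illustration using the 14-point data set above, not the example that follows) finds the two \(\mu\) values at which the log-likelihood drops by \(\chi^2_{0.8;1}/2\) from its maximum, holding \(\sigma\) at its MLE:

    import numpy as np
    from scipy.stats import norm, chi2
    from scipy.optimize import brentq

    t = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100], float)
    mu_hat, sigma_hat = t.mean(), t.std(ddof=0)   # complete-data MLEs

    def loglike(mu, sigma):
        return np.sum(norm.logpdf(t, mu, sigma))

    target = loglike(mu_hat, sigma_hat) - chi2.ppf(0.80, 1) / 2.0

    lo = brentq(lambda m: loglike(m, sigma_hat) - target, mu_hat - 100, mu_hat)
    hi = brentq(lambda m: loglike(m, sigma_hat) - target, mu_hat, mu_hat + 100)
    print(lo, hi)   # one slice of the 80% likelihood ratio contour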
where \(t_i\) are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:
Since our specified confidence level, \(\delta\), is 80%, we can calculate the value of the chi-squared statistic, \(\chi^2_{0.8;1} = 1.642\). We can now substitute this information into the equation:
It now remains to find the values of \(\mu\) and \(\sigma\) which satisfy this equation. This is an iterative process that requires setting the value of \(\mu\) and finding the appropriate values of \(\sigma\), and vice versa.
The following table gives the values of \(\sigma\) based on given values of \(\mu\).
(Note that this plot is generated with degrees of freedom \(k = 1\), as we are only determining bounds on one parameter. The contour plots generated in Weibull++ are done with degrees of freedom \(k = 2\), for use in comparing both parameters simultaneously.) As can be determined from the table, the lowest calculated value for \(\sigma\) is 7.849, while the highest is 17.909. These represent the two-sided 80% confidence limits on this parameter. Since solutions for the equation do not exist for values of \(\mu\) below 22 or above 35.5, these can be considered the two-sided 80% confidence limits for \(\mu\). In order to obtain more accurate values for the confidence limits on \(\mu\), we can perform the same procedure as before, but finding the two values of \(\mu\) that correspond with a given value of \(\sigma\). Using this method, we find that the two-sided 80% confidence limits on \(\mu\) are 21.807 and 35.793, which are close to the initial estimates of 22 and 35.5.
where \(\Phi^{-1}\) is the inverse standard normal. This equation can now be substituted into the likelihood ratio equation to produce an equation in terms of \(t\) and \(R\):
The unknown parameter depends on what type of bounds are being determined. If one is trying to determine the bounds on time for a given reliability, then \(R\) is a known constant and \(t\) is the unknown parameter. Conversely, if one is trying to determine the bounds on reliability for a given time, then \(t\) is a known constant and \(R\) is the unknown parameter. The likelihood ratio equation can be used to solve for the values of interest.
As can be determined from the table, the lowest calculated value for \(t\) is 25.046, while the highest is 39.250. These represent the 80% confidence limits on the time at which reliability is equal to 40%.
As can be determined from the table, the lowest calculated value for \(R\) is 24.776%, while the highest is 68.000%. These represent the 80% two-sided confidence limits on the reliability at the time of interest.
Bounds on Parameters
From Confidence Bounds, we know that the marginal posterior distribution of \(\mu\) can be written as:
where:
The same method can be applied for one-sided lower bounds and two-sided bounds on time.
The same method can be used to calculate the one-sided lower bounds and the two-sided bounds on reliability.
The following figures show the probability plot with the 90% two-sided confidence bounds and the pdf plot.
Both the reliability and MTTF can be easily obtained from the QCP. The QCP, with results, for both cases is shown
in the next two figures.
To obtain tabulated values for the failure rate, use the Analysis Workbook or General Spreadsheet features that are
included in Weibull++. (For more information on these features, please refer to the Weibull++ User's Guide. For a
step-by-step example on creating Weibull++ reports, please see the Quick Start Guide [1]). The following worksheet
shows the mission times and the corresponding failure rates.
Data point index   State (F or S)   State End Time
1                  F                2
2                  S                3
3                  F                5
4                  S                7
5                  F                11
6                  S                13
7                  S                17
8                  S                19
9                  F                23
10                 F                29
11                 S                31
12                 F                37
13                 S                41
14                 F                43
15                 S                47
16                 S                53
17                 F                59
18                 S                61
19                 S                67
Using the normal distribution and the maximum likelihood (MLE) parameter estimation method, the computed
parameters are:
If we analyze the data set with the rank regression on x (RRX) method, the computed parameters are:
Data point index   Last Inspection   State End Time
1                  30                32
2                  32                35
3                  35                37
4                  37                40
5                  42                42
6                  45                45
7                  50                50
8                  55                55
This is a sequence of interval times-to-failure data. Using the normal distribution and the maximum likelihood
(MLE) parameter estimation method, the computed parameters are:
If we analyze the data set with the rank regression on y (RRY) parameter estimation method, the computed
parameters are:
The following plot shows the results if the data were analyzed using the rank regression on X (RRX) method.
Grouped Data Times-to-Failure with Suspensions and Intervals (Interval, Left and Right Censored)
Data point index   Number in State   Last Inspection   State (S or F)   State End Time
1                  1                 10                F                10
2                  1                 20                S                20
3                  2                 0                 F                30
4                  2                 40                F                40
5                  1                 50                F                50
6                  1                 60                S                60
7                  1                 70                F                70
8                  2                 20                F                80
9                  1                 10                F                85
10                 1                 100               F                100
Using the normal distribution and the maximum likelihood (MLE) parameter estimation method, the computed
parameters are:
If we analyze the data set with the rank regression on x (RRX) method, the computed parameters are:
Data point index   State (F or S)   Time-to-failure
1                  F                2
2                  F                5
3                  F                11
4                  F                23
5                  F                29
6                  F                37
7                  F                43
8                  F                59
Using the normal distribution and the maximum likelihood (MLE) parameter estimation method, the computed
parameters are:
If we analyze the data set with the rank regression on x (RRX) method, the computed parameters are:
References
[1] http://www.reliasoft.tv/weibull/startguide/weibull_qsg_2.html
Chapter 10: The Lognormal Distribution
The lognormal pdf, in terms of \(t' = \ln t\), is given by:

\[ f(t') = \frac{1}{\sigma'\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{t' - \mu'}{\sigma'}\right)^2} \]

where:

\(t'\) = \(\ln t\), where the \(t\) values are the times-to-failure
\(\mu'\) = mean of the natural logarithms of the times-to-failure
\(\sigma'\) = standard deviation of the natural logarithms of the times-to-failure

Substitution yields the pdf in terms of \(t\):

\[ f(t) = \frac{1}{t\,\sigma'\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{\ln t - \mu'}{\sigma'}\right)^2}, \quad t > 0 \]
The mean of the natural logarithms of the times-to-failure, \(\mu'\), in terms of \(\bar{T}\) and \(\sigma_T\) is given by:

\[ \mu' = \ln\left(\bar{T}\right) - \frac{1}{2}\ln\left(\frac{\sigma_T^2}{\bar{T}^2} + 1\right) \]
The Median

The median of the lognormal distribution, \(\tilde{T}\), is discussed in Kececioglu [19]:

\[ \tilde{T} = e^{\mu'} \]

The Mode

The mode of the lognormal distribution, \(\hat{T}\), is discussed in Kececioglu [19]:

\[ \hat{T} = e^{\mu' - \sigma'^2} \]
The standard deviation of the natural logarithms of the times-to-failure, \(\sigma'\), in terms of \(\bar{T}\) and \(\sigma_T\) is given by:

\[ \sigma' = \sqrt{\ln\left(\frac{\sigma_T^2}{\bar{T}^2} + 1\right)} \]
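These relations are easy to verify numerically. The sketch below (arbitrary parameter values) checks them against SciPy's lognorm, which parameterizes the distribution with s = \(\sigma'\) and scale = \(e^{\mu'}\):

    import numpy as np
    from scipy.stats import lognorm

    mu_p, sig_p = 4.0, 0.5                       # illustrative values only
    dist = lognorm(s=sig_p, scale=np.exp(mu_p))

    mean, var = dist.mean(), dist.var()          # T-bar and sigma_T squared
    print(np.log(mean) - 0.5 * np.log(var / mean**2 + 1.0))  # recovers mu' = 4.0
    print(np.sqrt(np.log(var / mean**2 + 1.0)))              # recovers sigma' = 0.5
    print(dist.median(), np.exp(mu_p))           # median = exp(mu')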
As with the normal distribution, there is no closed-form solution for the lognormal reliability function. Solutions can
be obtained via the use of standard normal tables. Since the application automatically solves for the reliability we
will not discuss manual solution methods. For interested readers, full explanations can be found in the references.
Once again, the use of standard normal tables is necessary to solve this equation, as no closed-form solution exists.
As with the reliability equations, standard normal tables will be required to solve for this function.
Probability Plotting
As described before, probability plotting involves plotting the failure times and associated unreliability estimates on
specially constructed probability plotting paper. The form of this paper is based on a linearization of the cdf of the
specific distribution. For the lognormal distribution, the cumulative distribution function can be written as:

\[ F(t') = \Phi\left(\frac{t' - \mu'}{\sigma'}\right) \]

or:

\[ \Phi^{-1}\left(F(t')\right) = -\frac{\mu'}{\sigma'} + \frac{1}{\sigma'}t' \]

where:

\[ t' = \ln t \]

Now, let:

\[ y = \Phi^{-1}\left(F(t')\right) \quad \text{and} \quad a = -\frac{\mu'}{\sigma'} \]

and:

\[ b = \frac{1}{\sigma'} \]

which results in the linear equation of:

\[ y = a + bt' \]
The normal probability paper resulting from this linearized cdf function is shown next.
The process for reading the parameter estimate values from the lognormal probability plot is very similar to the
method employed for the normal distribution (see The Normal Distribution). However, since the lognormal
distribution models the natural logarithms of the times-to-failure, the values of the parameter estimates must be read
and calculated based on a logarithmic scale, as opposed to the linear time scale as it was done with the normal
distribution. This parameter scale appears at the top of the lognormal probability plot.
The process of lognormal probability plotting is illustrated in the following example.
Plotting Example
8 units are put on a life test and tested to failure. The failures occurred at 45, 140, 260, 500, 850, 1400, 3000, and
9000 hours. Estimate the parameters for the lognormal distribution using probability plotting.
Solution
In order to plot the points for the probability plot, the appropriate unreliability estimate values must be obtained.
These will be estimated through the use of median ranks, which can be obtained from statistical tables or the Quick
Statistical Reference in Weibull++. The following table shows the times-to-failure and the appropriate median rank
values for this example:
These points may now be plotted on normal probability plotting paper as shown in the next figure.
Draw the best possible line through the plot points. The time values where this line intersects the 15.85% and 50%
unreliability values should be projected up to the logarithmic scale, as shown in the following plot.
The natural logarithm of the time where the fitted line intersects \(Q(t) = 50\%\) is equivalent to \(\hat{\mu}'\). The value for \(\hat{\sigma}'\) is equal to the difference between the natural logarithms of the times where the fitted line crosses \(Q(t) = 50\%\) and \(Q(t) = 15.85\%\).
Rank Regression on Y
Performing a rank regression on Y requires that a straight line be fitted to a set of data points such that the sum of the
squares of the vertical deviations from the points to the line is minimized.
The least squares parameter estimation method, or regression analysis, was discussed in Parameter Estimation, and the equations for regression on Y derived there are again applicable here, with:

\[ y_i = \Phi^{-1}\left(F(t_i)\right) \]

and:

\[ x_i = \ln t_i \]
where the \(F(t_i)\) is estimated from the median ranks. Once \(\hat{a}\) and \(\hat{b}\) are obtained, then \(\hat{\sigma}'\) and \(\hat{\mu}'\) can easily be obtained from the above equations:

\[ \hat{\sigma}' = \frac{1}{\hat{b}}, \qquad \hat{\mu}' = -\frac{\hat{a}}{\hat{b}} \]
The Correlation Coefficient
The estimator of \(\rho\) is the sample correlation coefficient, \(\hat{\rho}\), given by:

\[ \hat{\rho} = \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}} \]
RRY Example
Lognormal Distribution RRY Example
14 units were reliability tested and the following life test data were obtained:
Data point index   Time-to-failure (hours)
1                  5
2                  10
3                  15
4                  20
5                  25
6                  30
7                  35
8                  40
9                  50
10                 60
11                 70
12                 80
13                 90
14                 100
Assuming the data follow a lognormal distribution, estimate the parameters and the correlation coefficient, , using
rank regression on Y.
Solution
Construct a table like the one shown next.
The median rank values (\(MR_i\)) can be found in rank tables or by using the Quick Statistical Reference in Weibull++.
The \(y_i\) values were obtained from the standardized normal distribution's area tables by entering the median rank \(MR_i\) for the area and reading the corresponding \(z\) value.
Given the values in the table above, calculate \(\hat{a}\) and \(\hat{b}\) using the regression-on-Y equations. Therefore:

\[ \hat{\sigma}' = \frac{1}{\hat{b}} \]

and:

\[ \hat{\mu}' = -\frac{\hat{a}}{\hat{b}} \]
The mean and the standard deviation of the lognormal distribution are obtained using equations in the Lognormal Distribution Functions section above:

\[ \bar{T} = e^{\mu' + \frac{1}{2}\sigma'^2} \]

and:

\[ \sigma_T = \bar{T}\sqrt{e^{\sigma'^2} - 1} \]
The mean can be obtained from the QCP and both the mean and the standard deviation can be obtained from the
Function Wizard.
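The lognormal RRY fit is the normal RRY fit applied to ln(t), as the sketch below shows (Benard's approximation again, so exact-rank results differ slightly):

    import numpy as np
    from scipy.stats import norm

    t = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100], float)
    n = len(t)
    mr = (np.arange(1, n + 1) - 0.3) / (n + 0.4)

    x = np.log(t)                    # the only change from the normal RRY fit
    y = norm.ppf(mr)
    b = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()

    sigma_p = 1.0 / b                # sigma'
    mu_p = -a / b                    # mu'
    print(mu_p, sigma_p)
    print(np.exp(mu_p + sigma_p**2 / 2.0))   # the lognormal mean, for a QCP check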
Rank Regression on X
Performing a rank regression on X requires that a straight line be fitted to a set of data points such that the sum of the
squares of the horizontal deviations from the points to the line is minimized.
Again, the first task is to bring our cdf function into a linear form. This step is exactly the same as in the regression on Y analysis, and all the equations apply in this case too. The deviation from the previous analysis begins at the least squares fit part, where in this case we treat \(x\) as the dependent variable and \(y\) as the independent variable. The best-fitting straight line to the data, for regression on X (see Parameter Estimation), is the straight line:

\[ \hat{x} = \hat{a} + \hat{b}y \]

where:

\[ \hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}} \]

and:

\[ \hat{a} = \bar{x} - \hat{b}\bar{y} \]

and the \(F(t_i)\) is estimated from the median ranks. Once \(\hat{a}\) and \(\hat{b}\) are obtained, solve the linear equation for the unknown \(t'\), which corresponds to:

\[ \hat{\mu}' = \hat{a} \]

and:

\[ \hat{\sigma}' = \hat{b} \]
The correlation coefficient is evaluated as before, using the equation in the previous section.
RRX Example
Lognormal Distribution RRX Example
Using the same data set from the RRY example given above, and assuming a lognormal distribution, estimate the
parameters and estimate the correlation coefficient, , using rank regression on X.
Solution
The table constructed for the RRY example also applies to this example. Using the values from this table, calculate \(\hat{a}\) and \(\hat{b}\) using the regression-on-X equations given above. Therefore:

\[ \hat{\mu}' = \hat{a} \]

and:

\[ \hat{\sigma}' = \hat{b} \]
Note that the regression on Y analysis is not necessarily the same as the regression on X. The only time when the results of the two regression types are the same (i.e., will yield the same equation for a line) is when the data lie perfectly on a line.
Using Weibull++, with the Rank Regression on X option, the results are:
MLE Example
Lognormal Distribution MLE Example
Using the same data set from the RRY and RRX examples given above and assuming a lognormal distribution,
estimate the parameters using the MLE method.
Solution
In this example we have only complete data. Thus, the partials reduce to:

\[ \frac{\partial \Lambda}{\partial \mu'} = \frac{1}{\sigma'^2}\sum_{i=1}^{N}\left(\ln t_i - \mu'\right) = 0 \]

\[ \frac{\partial \Lambda}{\partial \sigma'} = \sum_{i=1}^{N}\left[\frac{\left(\ln t_i - \mu'\right)^2}{\sigma'^3} - \frac{1}{\sigma'}\right] = 0 \]

Substituting the values of \(t_i\) and solving the above system simultaneously, we get \(\hat{\mu}' \approx 3.516\) and \(\hat{\sigma}' \approx 0.849\).
Using the equations for the mean and standard deviation in the Lognormal Distribution Functions section above, we get \(\bar{T} \approx 48.3\) hours and \(\sigma_T \approx 49.6\) hours.
Confidence Bounds
The method used by the application in estimating the different types of confidence bounds for lognormally
distributed data is presented in this section. Note that there are closed-form solutions for both the normal and
lognormal reliability that can be obtained without the use of the Fisher information matrix. However, these
closed-form solutions only apply to complete data. To achieve consistent application across all possible data types,
Weibull++ always uses the Fisher matrix in computing confidence intervals. The complete derivations were presented in detail for a general function in Confidence Bounds. For a discussion on exact confidence bounds for the normal and lognormal distributions, see The Normal Distribution chapter.
The lower and upper bounds on the log-mean, \(\hat{\mu}'\), are estimated from:

\[ \mu'_U = \hat{\mu}' + K_\alpha\sqrt{\widehat{Var}(\hat{\mu}')}, \qquad \mu'_L = \hat{\mu}' - K_\alpha\sqrt{\widehat{Var}(\hat{\mu}')} \]

For the log-standard deviation, \(\hat{\sigma}'\), which must be positive, \(\ln(\hat{\sigma}')\) is treated as normally distributed, and the bounds are estimated from:

\[ \sigma'_U = \hat{\sigma}'\cdot e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\sigma}')}}{\hat{\sigma}'}}, \qquad \sigma'_L = \frac{\hat{\sigma}'}{e^{\frac{K_\alpha\sqrt{\widehat{Var}(\hat{\sigma}')}}{\hat{\sigma}'}}} \]

If \(\delta\) is the confidence level, then \(\alpha = \frac{1-\delta}{2}\) for the two-sided bounds and \(\alpha = 1-\delta\) for the one-sided bounds.
The variances and covariances of \(\hat{\mu}'\) and \(\hat{\sigma}'\) are estimated from the local Fisher matrix, as follows:
Bounds on Time (Type 1)
The bounds around time for a given lognormal percentile, or unreliability, are estimated by first solving the reliability equation with respect to time, as follows:

\[ \hat{t}'\left(\hat{\mu}',\hat{\sigma}'\right) = \hat{\mu}' + z\,\hat{\sigma}' \]

where:

\[ z = \Phi^{-1}\left(1 - R\right) \quad \text{and} \quad \hat{t}' = \ln\hat{t} \]

The bounds on \(t'\) are estimated from:

\[ t'_U = \hat{t}' + K_\alpha\sqrt{\widehat{Var}(\hat{t}')}, \qquad t'_L = \hat{t}' - K_\alpha\sqrt{\widehat{Var}(\hat{t}')} \]

or, in terms of time:

\[ t_U = e^{t'_U}, \qquad t_L = e^{t'_L} \]
Bounds on Parameters
As covered in Parameter Estimation, the likelihood confidence bounds are calculated by finding values for \(\mu'\) and \(\sigma'\) that satisfy:

\[ -2\ln\left(\frac{L(\mu',\sigma')}{L(\hat{\mu}',\hat{\sigma}')}\right) = \chi^2_{\alpha;1} \]

For complete data, the likelihood formula for the lognormal distribution is given by:

\[ L(\mu',\sigma') = \prod_{i=1}^{N}\frac{1}{t_i\,\sigma'\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{\ln t_i - \mu'}{\sigma'}\right)^2} \]

where the \(t_i\) values represent the original time-to-failure data. For a given value of \(\alpha\), values for \(\mu'\) and \(\sigma'\) can be found which represent the maximum and minimum values that satisfy the likelihood ratio equation. These represent the confidence bounds for the parameters at a confidence level \(\delta\), where \(\alpha = \delta\) for two-sided bounds and \(\alpha = 2\delta - 1\) for one-sided.
where \(t_i\) are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:
Since our specified confidence level, \(\delta\), is 75%, we can calculate the value of the chi-squared statistic, \(\chi^2_{0.75;1} = 1.323\). We can now substitute this information into the equation:
It now remains to find the values of \(\mu'\) and \(\sigma'\) which satisfy this equation. This is an iterative process that requires setting the value of \(\sigma'\) and finding the appropriate values of \(\mu'\), and vice versa.
The following table gives the values of \(\mu'\) based on given values of \(\sigma'\).
(Note that this plot is generated with degrees of freedom \(k = 1\), as we are only determining bounds on one parameter. The contour plots generated in Weibull++ are done with degrees of freedom \(k = 2\), for use in comparing both parameters simultaneously.) As can be determined from the table, the lowest calculated value for \(\mu'\) is 4.1145, while the highest is 4.4708. These represent the two-sided 75% confidence limits on this parameter. Since solutions for the equation do not exist for values of \(\sigma'\) below 0.24 or above 0.48, these can be considered the two-sided 75% confidence limits for \(\sigma'\). In order to obtain more accurate values for the confidence limits on \(\sigma'\), we can perform the same procedure as before, but finding the two values of \(\sigma'\) that correspond with a given value of \(\mu'\). Using this method, we find that the 75% confidence limits on \(\sigma'\) are 0.23405 and 0.48936, which are close to the initial estimates of 0.24 and 0.48.
where \(\Phi^{-1}\) is the inverse standard normal. This equation can now be substituted into the likelihood function to produce a likelihood equation in terms of \(t\) and \(R\):
The unknown variable depends on what type of bounds are being determined. If one is trying to determine the bounds on time for a given reliability, then \(R\) is a known constant and \(t\) is the unknown variable. Conversely, if one is trying to determine the bounds on reliability for a given time, then \(t\) is a known constant and \(R\) is the unknown variable. Either way, the above equation can be used to solve the likelihood ratio equation for the values of interest.
As can be determined from the table, the lowest calculated value for \(t\) is 43.634, while the highest is 66.085. These represent the two-sided 75% confidence limits on the time at which reliability is equal to 80%.
As can be determined from the table, the lowest calculated value for \(R\) is 43.444%, while the highest is 81.508%. These represent the two-sided 75% confidence limits on the reliability at the time of interest.
Bounds on Parameters
From Parameter Estimation, we know that the marginal distribution of the parameter \(\mu'\) is:
where:
\(\varphi(\sigma') = \frac{1}{\sigma'}\) is the non-informative prior of \(\sigma'\);
\(\varphi(\mu')\) is a uniform distribution from \(-\infty\) to \(+\infty\), the non-informative prior of \(\mu'\). With the above prior distributions, the posterior can be rewritten as:
The above equation is solved w.r.t. \(\mu'\). The same method is used to calculate the one-sided lower bounds and two-sided bounds on reliability.
Solution
The data points are entered into a times-to-failure data sheet. The lognormal distribution is selected under Distributions. The Bayesian confidence bounds method only applies to the MLE analysis method; therefore, Maximum Likelihood (MLE) is selected under Analysis Method, and Use Bayesian is selected under the Confidence Bounds Method in the Analysis tab.
The two-sided 90% Bayesian confidence bounds on the lognormal parameter \(\mu'\) are obtained using the QCP and clicking on the Calculate Bounds button in the Parameter Bounds tab, as follows:
Data point index   State (F or S)   Time-to-failure
1                  F                2
2                  F                5
3                  F                11
4                  F                23
5                  F                29
6                  F                37
7                  F                43
8                  F                59
Solution
Using Weibull++, the computed parameters for maximum likelihood are:
Solution
Published results (using probability plotting):
The small differences are due to the precision errors when fitting a line manually, whereas in Weibull++ the line was
fitted mathematically.
This same data set can be entered into Weibull++ by creating a data sheet capable of handling non-grouped
time-to-failure data. Since the results shown above are unbiased, the Use Unbiased Std on Normal Data option in the
User Setup must be selected in order to duplicate these results. Weibull++ computed parameters for maximum
likelihood are:
Data Point Index   Number in State   State F or S   Time-to-Failure
1    1   F   22.5
2    1   F   37.5
3    1   F   46
4    1   F   48.5
5    1   F   51.5
6    1   F   53
7    1   F   54.5
8    1   F   57.5
9    1   F   66.5
10   1   F   68
11   1   F   69.5
12   1   F   76.5
13   1   F   77
14   1   F   78.5
15   1   F   80
16   1   F   81.5
17   1   F   82
18   1   F   83
19   1   F   84
20   1   F   91.5
21   1   F   93.5
22   1   F   102.5
23   1   F   107
24   1   F   108.5
25   1   F   112.5
26   1   F   113.5
27   1   F   116
28   1   F   117
29   1   F   118.5
30   1   F   119
31   1   F   120
32   1   F   122.5
33   1   F   123
34   1   F   127.5
35   1   F   131
36   1   F   132.5
37   1   F   134
38   59   S   135
Solution
The distribution used in the publication was the base-10 lognormal. Published results (using MLE):
To replicate the published results (since Weibull++ uses a lognormal to the base $e$), take the base-10 logarithm of
the data and estimate the parameters using the normal distribution and MLE.
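As a sketch of this conversion (the times below are placeholders, not the published data set), one can take base-10 logarithms and fit a normal distribution by MLE:

    import numpy as np
    from scipy.stats import norm

    # Placeholder times-to-failure; substitute the published data.
    t = np.array([22.5, 37.5, 46.0, 48.5, 51.5, 53.0])

    y = np.log10(t)                  # base-10 logarithms of the data
    mean10, std10 = norm.fit(y)      # normal MLE on the log10 data
    print(mean10, std10)

    # To compare with a base-e lognormal fit, scale by ln(10):
    mean_e, std_e = mean10 * np.log(10.0), std10 * np.log(10.0)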
• Weibull++ computed parameters for maximum likelihood are:
Data Point Index   Last Inspected   State End Time
1   30   32
2   32   35
3   35   37
4   37   40
5   42   42
6   45   45
7   50   50
8   55   55
Solution
This is a sequence of interval times-to-failure where the intervals vary substantially in length. Using Weibull++, the
maximum likelihood parameters are calculated to be:
Chapter 11
The Mixed Weibull Distribution
Statistical Background
Consider a life test of $N$ identical components. The components were placed in the test at age $T = 0$ and were tested to
failure, with their times-to-failure recorded. Further assume that the test covered the entire lifespan of the units, and
different failure modes were observed over each region of life, namely early life (early failure mode), chance life
(chance failure mode), and wear-out life (wear-out failure mode). Also, as items failed during the test, they were
removed from the test, inspected and segregated into lots according to their failure mode. At the conclusion of the
test, there will be $n$ subpopulations of failed components. If the events of the test are now
reconstructed, it may be theorized that at age $T = 0$ there were actually $n$ separate subpopulations in the test, each
with a different times-to-failure distribution and failure mode, even though at $T = 0$ the subpopulations were not
physically distinguishable. The mixed Weibull methodology accomplishes this segregation based on the results of
the life test.
Then:
The total number surviving by age $T$ in the mixed population is the sum of the number surviving in all
subpopulations, or:
or:
This expression can also be derived by applying Bayes's theorem, as discussed in Kececioglu [20], which says that
the reliability of a component drawn at random from a mixed population composed of $n$ types of failure
subpopulations is its reliability, $R_1(T)$, given that the component is from subpopulation 1, or $N_1/N$, plus its reliability,
$R_2(T)$, given that the component is from subpopulation 2, or $N_2/N$, plus its reliability, $R_3(T)$, given that the
component is from subpopulation 3, or $N_3/N$, and so on, plus its reliability, $R_n(T)$, given that the component is from
subpopulation $n$, or $N_n/N$, and:
Other functions of reliability engineering interest are found by applying the fundamentals to the above reliability
equation. For example, the probability density function can be found from:
The conditional reliability for a new mission of duration $t$, starting this mission at age $T$, or after having already
operated a total of $T$ hours, is given by:
and:
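As a minimal sketch of these relationships, assuming illustrative portions and Weibull parameters for two subpopulations (not values from any example in this chapter):

    import numpy as np

    # Illustrative portions (summing to 1) and Weibull parameters.
    p    = np.array([0.4, 0.6])
    beta = np.array([0.8, 3.0])
    eta  = np.array([200.0, 1500.0])

    def R_mixed(t):
        # R(t) = sum_i p_i * exp(-(t / eta_i)^beta_i)
        return float(np.sum(p * np.exp(-(t / eta) ** beta)))

    def R_conditional(t, T):
        # Reliability for a new mission of duration t, starting at age T.
        return R_mixed(T + t) / R_mixed(T)

    print(R_mixed(100.0), R_conditional(100.0, 500.0))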
Regression Solution
Weibull++ utilizes a modified Levenberg-Marquardt algorithm (non-linear regression) when performing regression
analysis on a mixed Weibull distribution. The procedure is rather involved and is beyond the scope of this reference.
It is sufficient to say that the algorithm fits a curved line of the form:
where:
MLE
The same space of parameters, namely the subpopulation portions and the Weibull parameters of each subpopulation, is also used under the MLE method,
using the likelihood function as given in Appendix C of this reference. Weibull++ uses the EM algorithm, short for
Expectation-Maximization algorithm, for the MLE analysis. Details on the numerical procedure are beyond the
scope of this reference.
About the Calculated Parameters
Weibull++ uses the numbers 1, 2, 3 and 4 (or first, second, third and fourth subpopulation) to identify each
subpopulation. These are just designations for each subpopulation, and they are ordered based on the value of the
scale parameter, $\eta$. Since the equation used is additive, or:
the order of the subpopulations which are given the designation 1, 2, 3, or 4 is of no consequence. For consistency,
the application will always return the order of the results based on the magnitude of the scale parameter.
where the variance terms are obtained from the variance/covariance matrix. When using the Fisher matrix bounds method,
problems can occur at the transition points of the distribution, and in particular on the Type 1 confidence bounds
(bounds on time). The problems (i.e., the departure from the expected monotonic behavior) occur when the transition
region between two subpopulations becomes a "saddle" (i.e., the probability line is almost parallel to the time axis on
a probability plot). In this case, the bounds on time approach infinity. This behavior is more frequently encountered
with smaller sample sizes. The physical interpretation is that there is insufficient data to support any inferences
in this region.
This is graphically illustrated in the following figure. In this plot it can be seen that there are no data points between
the last point of the first subpopulation and the first point of the second subpopulation, thus the uncertainty is high,
as described by the mathematical model.
Beta binomial bounds can be used instead in these cases, especially when estimations are to be obtained close to
these regions.
Example
We will illustrate the mixed Weibull analysis using a Monte Carlo generated set of data. To repeat this example,
generate data from a 2-parameter Weibull distribution using the Weibull++ Monte Carlo utility. The following
figures illustrate the required steps, inputs and results.
In the Monte Carlo window, enter the values and select the options shown below for subpopulation 1.
Switch to subpopulation 2 and make the selection shown below. Click Generate.
After the data set has been generated, choose the 2 Subpop-Mixed Weibull distribution. Click Calculate.
The results for subpopulation 1 are shown next. (Note that your results could be different due to the randomness of
the simulation.)
The results for subpopulation 2 are shown next. (Note that your results could be different due to the randomness of
the simulation.)
The Weibull probability plot for this data is shown next. (Note that your results could be different due to the
randomness of the simulation.)
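If Weibull++ is not at hand, a data set of the same structure can be drawn in a few lines; the subpopulation sizes and parameters below are stand-ins for whatever is entered in the Monte Carlo window:

    import numpy as np

    rng = np.random.default_rng(42)

    # Stand-in sizes and parameters for the two subpopulations.
    n1, beta1, eta1 = 100, 1.2, 100.0
    n2, beta2, eta2 = 100, 6.0, 800.0

    # numpy draws from a unit-scale Weibull; multiply by eta to rescale.
    sub1 = eta1 * rng.weibull(beta1, size=n1)
    sub2 = eta2 * rng.weibull(beta2, size=n2)
    data = np.sort(np.concatenate([sub1, sub2]))   # pooled mixture sample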
Chapter 12
The Generalized Gamma Distribution
where $\theta$ is a scale parameter, $\beta$ and $k$ are shape parameters and $\Gamma(x)$ is the gamma function of $x$,
which is defined by:
With this version of the distribution, however, convergence problems arise that severely limit its usefulness. Even
with data sets containing 200 or more data points, the MLE methods may fail to converge. Further adding to the
confusion is the fact that distributions with widely different values of $\theta$, $\beta$, and $k$ may appear almost identical, as
discussed in Lawless [21]. In order to overcome these difficulties, Weibull++ uses a reparameterization with
parameters $\mu$, $\sigma$, and $\lambda$, as shown in [21], where:
where
While this makes the distribution converge much more easily in computations, it does not facilitate manual
manipulation of the equation. By allowing $\lambda$ to become negative, the pdf of the reparameterized distribution is given
by:
where:
where $\Gamma(x)$ is the gamma function of $x$. Note that in Weibull++ the probability plot of the generalized gamma is
created on lognormal probability paper. This means that the fitted line will not be straight unless $\lambda = 0$.
Owing to the complexity of the equations involved, the failure rate function will not be displayed here, but it
can be obtained merely by dividing the pdf function by the reliability
function.
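This division can be checked numerically. scipy ships a generalized gamma (in a parameterization that differs from the $\mu$, $\sigma$, $\lambda$ one above, so the parameter values here are only illustrative), and the failure rate is simply the pdf over the survival function:

    import numpy as np
    from scipy.stats import gengamma

    # Illustrative parameters in scipy's (a, c, scale) parameterization.
    dist = gengamma(a=2.0, c=1.5, scale=100.0)

    t = np.linspace(10.0, 500.0, 5)
    hazard = dist.pdf(t) / dist.sf(t)   # failure rate = pdf / reliability
    print(hazard)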
• In this case, the generalized gamma distribution has the same behavior as the Weibull for $\lambda = 1$ ($\beta = 1/\sigma$
and $\eta = e^{\mu}$, respectively).
• The exponential distribution is a special case when $\lambda = 1$ and $\sigma = 1$.
• The lognormal distribution is a special case when $\lambda = 0$.
• The gamma distribution is a special case when $\lambda = \sigma$.
By allowing $\lambda$ to take negative values, the generalized gamma distribution can be further extended to include
additional distributions as special cases. For example, the Fréchet distribution of maxima (also known as a reciprocal
Weibull) is a special case when $\lambda = -1$.
Confidence Bounds
The only method available in Weibull++ for confidence bounds for the generalized gamma distribution is the Fisher
matrix, which is described next.
For the parameter $\hat{\mu}$, which is treated as normally distributed, the bounds are estimated from:
If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds, and $\alpha = 1 - \delta$ for the one-sided bounds.
The variances and covariances of the parameter estimates are obtained as follows:
Bounds on Reliability
The upper and lower bounds on reliability are given by:
where:
Bounds on Time
The bounds around time for a given percentile, or unreliability, are estimated by first solving the reliability equation
with respect to time. Since $T$ is a positive variable, the transformed variable $\ln(T)$ is treated as normally
distributed and the bounds are estimated from:
Example
The following data set represents revolutions-to-failure (in millions) for 23 ball bearings in a fatigue test, as
discussed in Lawless [21].
When the generalized gamma distribution is fitted to this data using MLE, the following values for parameters are
obtained:
Note that for this data, the generalized gamma offers a compromise between the Weibull and the
lognormal distributions. The estimated value of $\lambda$ indicates that the lognormal distribution is better supported by the
data. A better assessment, however, can be made by looking at the confidence bounds on $\lambda$. For example, the 90%
two-sided confidence bounds are:
We can then conclude that both distributions (i.e., Weibull and lognormal) are well supported by the data, with the
lognormal being the better supported of the two. In Weibull++ the generalized gamma probability is plotted on
gamma probability paper, as shown next.
It is also important to note that, as in the case of the mixed Weibull distribution, the choice of regression analysis
(i.e., RRX or RRY) is of no consequence in the generalized gamma model because it uses non-linear regression.
Chapter 13
The Gamma Distribution
where:
and:
where $z = \ln(t) - \mu$, $e^{\mu}$ is the scale parameter, and $k$ is the shape parameter.
For $k > 1$:
• As $t \to \infty$, $f(t) \to 0$.
For $k = 1$:
• Gamma becomes the exponential distribution.
• As $t \to 0$, $f(t) \to \frac{1}{e^{\mu}}$.
• As $t \to \infty$, $f(t) \to 0$.
• The pdf decreases monotonically and is convex.
• $\lambda(t) \equiv \frac{1}{e^{\mu}}$; the failure rate is constant.
• The mode does not exist.
For $0 < k < 1$:
• As $t \to 0$, $f(t) \to \infty$.
• As $t \to \infty$, $f(t) \to 0$.
• As $t \to \infty$, $\lambda(t) \to \frac{1}{e^{\mu}}$.
• The pdf decreases monotonically and is convex.
• As $\mu$ increases, the pdf gets stretched out to the right and its height decreases, while maintaining its shape.
• As $\mu$ decreases, the pdf shifts towards the left and its height increases.
• The mode does not exist.
Confidence Bounds
The only method available in Weibull++ for confidence bounds for the gamma distribution is the Fisher matrix,
which is described next. The complete derivations were presented in detail (for a general function) in the Confidence
Bounds chapter.
Since the standard deviation, $\hat{\sigma}$, must be positive, $\ln(\hat{\sigma})$ is treated as normally distributed and the bounds are
estimated from:
If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds and $\alpha = 1 - \delta$ for the one-sided bounds.
The variances and covariances of the parameter estimates are obtained from the Fisher matrix, as follows:
where $\Lambda$ is the log-likelihood function of the gamma distribution, described in Parameter Estimation and Appendix D.
Bounds on Reliability
The reliability of the gamma distribution is:
where:
where:
Bounds on Time
The bounds around time for a given gamma percentile (unreliability) are estimated by first solving the reliability
equation with respect to time, as follows:
where:
or:
General Example
24 units were reliability tested, and the following life test data were obtained:
Fitting the gamma distribution to this data, using maximum likelihood as the analysis method, gives the following
parameters:
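A comparable fit can be sketched with scipy. The life test times below are placeholders for the 24 real data points, and the mapping of scipy's scale to the $\mu$ reported by Weibull++ is an assumption based on the $e^{\mu}$ scale convention used earlier in this chapter:

    import numpy as np
    from scipy.stats import gamma

    # Placeholder life test data (hours); substitute the real 24 points.
    t = np.array([61.0, 87.0, 101.0, 126.0, 147.0, 183.0, 219.0, 271.0])

    # MLE with the location fixed at zero, leaving the shape k and the scale.
    k_hat, _, scale_hat = gamma.fit(t, floc=0)
    mu_hat = np.log(scale_hat)   # assuming mu = ln(scale), i.e., e^mu = scale
    print(k_hat, mu_hat)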
Chapter 14
The Logistic Distribution
where:
or:
where:
• As $\sigma$ decreases, the pdf gets pushed toward the mean, or it becomes narrower and taller.
• As $\sigma$ increases, the pdf spreads out away from the mean, or it becomes broader and shallower.
• The scale parameter can assume values of $0 < \sigma < \infty$.
• The logistic pdf starts at $t = -\infty$ with an $f(t) = 0$. As $t$ increases, $f(t)$ also increases, goes through its
point of inflection and reaches its maximum value at $t = \mu$. Thereafter, $f(t)$ decreases, goes through its
point of inflection and assumes a value of $f(t) = 0$ at $t = +\infty$.
• For $t = \pm\infty$, the pdf equals 0. The maximum value of the pdf occurs at $t = \mu$ and equals $\frac{1}{4\sigma}$.
• The point of inflection of the pdf plot is the point where the second derivative of the pdf equals zero. The
inflection point occurs at $t = \mu \pm \sigma \ln(2 + \sqrt{3})$, or approximately $\mu \pm 1.317\sigma$.
• If the location parameter $\mu$ decreases, the reliability plot is shifted to the left. If $\mu$ increases, the reliability plot
is shifted to the right.
• If $t = \mu$, then $R(t) = 0.5$; $t = \mu$ is the inflection point. If $t < \mu$, then $R(t)$ is concave (concave down); if
$t > \mu$, then $R(t)$ is convex (concave up). Correspondingly, for $t < \mu$ the unreliability $F(t)$ is convex (concave up), and for $t > \mu$ it is concave
(concave down).
• The main difference between the normal distribution and logistic distribution lies in the tails and in the
behavior of the failure rate function. The logistic distribution has slightly longer tails compared to the normal
distribution. Also, in the upper tail of the logistic distribution, the failure rate function levels out for large $t$,
approaching $1/\sigma$.
• If the location parameter $\mu$ decreases, the failure rate plot is shifted to the left. Vice versa, if $\mu$ increases, the
failure rate plot is shifted to the right.
• $\lambda(t)$ always increases: as $t \to -\infty$, $\lambda(t) \to 0$, and as $t \to +\infty$, $\lambda(t) \to 1/\sigma$. It is always the case that $0 < \lambda(t) < 1/\sigma$.
• If $\sigma$ increases, then $\lambda(t)$ increases more slowly and smoothly. The segment of time where $\lambda(t)$
increases lengthens, too, whereas the region where $\lambda(t)$ is close to 0 or $1/\sigma$ gets narrower. Conversely, if $\sigma$ decreases, then
$\lambda(t)$ increases more quickly and sharply. The segment of time where $\lambda(t)$ increases shortens, too, whereas
the region where $\lambda(t)$ is close to 0 or $1/\sigma$ gets broader.
Weibull++ Notes on Negative Time Values
One of the disadvantages of using the logistic distribution for reliability calculations is the fact that the logistic
distribution starts at negative infinity. This can result in negative values for some of the results. Negative values for
time are not accepted in most of the components of Weibull++, nor are they implemented. Certain components of the
application reserve negative values for suspensions, or will not return negative results. For example, the Quick
Calculation Pad will return a null value (zero) if the result is negative. Only the Free-Form (Probit) data sheet can
accept negative values for the random variable (x-axis values).
Then:
Now let:
and:
The logistic probability paper resulting from this linearized cdf function is shown next.
Since the logistic distribution is symmetrical, the area under the pdf curve from $-\infty$ to $\mu$ is 0.5, as is the area from
$\mu$ to $+\infty$. Consequently, the value of $\mu$ is said to be the point where $R(t) = F(t) = 50\%$. This means that the
estimate of $\mu$ can be read from the point where the plotted line crosses the 50% unreliability line.
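A quick sketch of this probability plotting estimate, assuming hypothetical complete data and Benard's approximation for the median ranks:

    import numpy as np

    # Hypothetical complete failure times.
    t = np.sort(np.array([12.0, 15.0, 19.0, 22.0, 25.0, 28.0]))
    n = len(t)
    F = (np.arange(1, n + 1) - 0.3) / (n + 0.4)   # Benard's median ranks

    # Linearized logistic cdf: ln(F / (1 - F)) = (t - mu) / sigma
    y = np.log(F / (1.0 - F))
    slope, intercept = np.polyfit(t, y, 1)
    sigma_hat = 1.0 / slope
    mu_hat = -intercept * sigma_hat   # line crosses 50% unreliability at mu
    print(mu_hat, sigma_hat)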
Confidence Bounds
In this section, we present the methods used in the application to estimate the different types of confidence bounds
for logistically distributed data. The complete derivations were presented in detail (for a general function) in
Confidence Bounds.
The lower and upper bounds on the scale parameter, $\hat{\sigma}$, are estimated from:
If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds, and $\alpha = 1 - \delta$ for the one-sided bounds.
The variances and covariances of $\hat{\mu}$ and $\hat{\sigma}$ are estimated from the Fisher matrix, as follows:
$\Lambda$ is the log-likelihood function of the logistic distribution, described in Parameter Estimation and Appendix D.
Bounds on Reliability
The reliability of the logistic distribution is:
where:
where:
or:
Bounds on Time
The bounds around time for a given logistic percentile (unreliability) are estimated by first solving the reliability
equation with respect to time as follows:
where:
or:
General Example
The lifetime of a mechanical valve is known to follow a logistic distribution. 10 units were tested for 28 months and
the following months-to-failure data were collected.
• Determine the valve's design life if specifications call for a reliability goal of 0.90.
• The valve is to be used in a pumping device that requires 1 month of continuous operation. What is the
probability of the pump failing due to the valve?
Enter the data set in a Weibull++ standard folio, as follows:
The valve's design life, along with 90% two-sided confidence bounds, can be obtained using the QCP as follows:
The probability, along with 90% two-sided confidence bounds, that the pump fails due to a valve failure during the
first month is obtained as follows:
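Both QCP queries reduce to closed-form expressions in $\mu$ and $\sigma$. A sketch with illustrative parameter values (the estimates from the folio screenshots are not reproduced here):

    import numpy as np

    mu, sigma = 22.0, 3.0   # illustrative logistic parameters (months)

    def R(t):
        return 1.0 / (1.0 + np.exp((t - mu) / sigma))

    # Design life: solve R(t) = 0.90 for t.
    goal = 0.90
    design_life = mu + sigma * np.log((1.0 - goal) / goal)

    # Probability the pump fails due to the valve in a 1-month mission.
    prob_failure = 1.0 - R(1.0)
    print(design_life, prob_failure)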
Chapter 15
The Loglogistic Distribution
where:
and:
where $z = \frac{t' - \mu}{\sigma}$, $t' = \ln(t)$, $\mu$ is the scale parameter and $\sigma$ is the shape parameter.
where:
Distribution Characteristics
For $\sigma > 1$:
• $f(t)$ decreases monotonically and is convex. The mode and mean do not exist.
For $\sigma = 1$:
• As $t \to 0$, $f(t) \to \frac{1}{e^{\mu}}$.
For $\sigma < 1$:
• The shape of the loglogistic distribution is very similar to that of the lognormal distribution and the Weibull
distribution.
• The pdf starts at zero, increases to its mode, and decreases thereafter.
• As $\mu$ increases, while $\sigma$ is kept the same, the pdf gets stretched out to the right and its height decreases, while
maintaining its shape.
• As $\mu$ decreases, while $\sigma$ is kept the same, the pdf gets pushed in towards the left and its height increases.
• $\lambda(t)$ increases till its maximum point and decreases thereafter. $\lambda(t)$ is concave at first, then becomes convex.
Confidence Bounds
The method used by the application in estimating the different types of confidence bounds for loglogistically
distributed data is presented in this section. The complete derivations were presented in detail for a general function
in Parameter Estimation.
For the parameter $\hat{\mu}$, which is treated as normally distributed, the bounds are estimated from:
If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds, and $\alpha = 1 - \delta$ for the one-sided bounds.
The variances and covariances of $\hat{\mu}$ and $\hat{\sigma}$ are estimated as follows:
Bounds on Reliability
The reliability of the loglogistic distribution is:
where:
where:
or:
Bounds on Time
The bounds around time for a given loglogistic percentile, or unreliability, are estimated by first solving the
reliability equation with respect to time, as follows:
where:
or:
Let:
then:
where:
or:
General Examples
Determine the loglogistic parameter estimates for the data given in the following table.
Set up the folio for times-to-failure data that includes interval and left censored data, then enter the data. The
computed parameters for maximum likelihood are calculated to be:
Chapter 16
The Gumbel/SEV Distribution
where:
and:
Probability Paper
The form of the Gumbel probability paper is based on a linearization of the cdf. From the unreliability equation, we
know:
Then:
Now let:
and:
The Gumbel probability paper resulting from this linearized cdf function is shown next.
Confidence Bounds
This section presents the method used by the application to estimate the different types of confidence bounds for data
that follow the Gumbel distribution. The complete derivations were presented in detail (for a general function) in
Confidence Bounds. Only Fisher Matrix confidence bounds are available for the Gumbel distribution.
Since the standard deviation, $\hat{\sigma}$, must be positive, $\ln(\hat{\sigma})$ is treated as normally distributed, and the bounds are
estimated from:
If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds, and $\alpha = 1 - \delta$ for the one-sided bounds.
The variances and covariances of $\hat{\mu}$ and $\hat{\sigma}$ are estimated from the Fisher matrix as follows:
$\Lambda$ is the log-likelihood function of the Gumbel distribution, described in Parameter Estimation and Appendix D.
Bounds on Reliability
The reliability of the Gumbel distribution is given by:
where:
where:
or:
Bounds on Time
The bounds around time for a given Gumbel percentile (unreliability) are estimated by first solving the reliability
equation with respect to time, as follows:
where:
or:
Example
Verify using Monte Carlo simulation that if $T$ follows a Weibull distribution with parameters $\beta$ and $\eta$, then $\ln(T)$ follows
a Gumbel distribution with $\mu = \ln(\eta)$ and $\sigma = 1/\beta$.
Let us assume that $T$ follows a Weibull distribution with given $\beta$ and $\eta$. The Monte Carlo simulation
tool in Weibull++ can be used to generate a set of random numbers that follow a Weibull distribution with the
specified parameters. The following picture shows the Main tab of the Monte Carlo Data Generation utility.
tool in Weibull++ can be used to generate a set of random numbers that follow a Weibull distribution with the
specified parameters. The following picture shows the Main tab of the Monte Carlo Data Generation utility.
On the Settings tab, set the number of points to 100 and click Generate. This creates a new data sheet in the folio
that contains the random time values $T$.
Insert a new data sheet in the folio and enter the corresponding $\ln(T)$ values of the time values generated by the
Monte Carlo simulation. Delete any negative values, if there are any, because Weibull++ expects the time values to
be positive. After obtaining the $\ln(T)$ values, analyze the data sheet using the Gumbel distribution and the MLE
parameter estimation method. The estimated parameters are (your results may vary due to the random numbers
generated by simulation):
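The same verification can be scripted; scipy's gumbel_l is the smallest-extreme-value form that $\ln(T)$ should follow. The $\beta$ and $\eta$ values below are arbitrary choices for the check:

    import numpy as np
    from scipy.stats import gumbel_l

    rng = np.random.default_rng(7)
    beta, eta = 2.0, 100.0                 # arbitrary Weibull parameters

    T = eta * rng.weibull(beta, size=100)  # Weibull sample
    mu_hat, sigma_hat = gumbel_l.fit(np.log(T))

    print(mu_hat, np.log(eta))             # should be close to ln(eta)
    print(sigma_hat, 1.0 / beta)           # should be close to 1/beta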
Chapter 17
Non-Parametric Life Data Analysis
Note that this is similar to Benard's approximation of the median ranks, as discussed in the Parameter
Estimation chapter. The following non-parametric analysis methods are essentially variations of this concept.
Kaplan-Meier Estimator
The Kaplan-Meier estimator, also known as the product limit estimator, can be used to calculate values for
non-parametric reliability for data sets with multiple failures and suspensions. The equation of the estimator is given
by:
where:
where:
Note that the reliability estimate is only calculated for times at which one or more failures occurred. For the sake of
calculating the reliability estimate at time values that have both failures and suspensions, it is assumed that the suspensions occur
slightly after the failures, so that the suspended units are considered to be operating and included in the number of units at risk at that time.
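A minimal sketch of the product limit estimator under that convention (suspensions tied with failures stay in the risk set for that failure time):

    import numpy as np

    def kaplan_meier(times, events):
        # events: 1 for failure, 0 for suspension. At tied ages, suspensions
        # are treated as occurring just after failures, so they remain in
        # the risk set at that failure time.
        order = np.argsort(times, kind="stable")
        times = np.asarray(times, dtype=float)[order]
        events = np.asarray(events)[order]
        n_at_risk, R, out, i = len(times), 1.0, [], 0
        while i < len(times):
            t, d, s = times[i], 0, 0
            while i < len(times) and times[i] == t:
                d += events[i]           # failures at this time
                s += 1 - events[i]       # suspensions at this time
                i += 1
            if d > 0:
                R *= (n_at_risk - d) / n_at_risk
                out.append((t, R))       # estimate only at failure times
            n_at_risk -= d + s
        return out

    print(kaplan_meier([3, 4, 5, 6, 6, 7], [1, 0, 1, 1, 0, 1]))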
Kaplan-Meier Example
A group of 20 units are put on a life test with the following results.
Use the Kaplan-Meier estimator to determine the reliability estimates for each failure time.
Solution
Using the data and the reliability equation of the Kaplan-Meier estimator, the following table can be constructed:
As can be determined from the preceding table, the reliability estimates for the failure times are:
Actuarial-Simple Method
The actuarial-simple method is an easy-to-use form of non-parametric data analysis that can be used for multiple
censored data that are arranged in intervals. This method is based on calculating the number of failures in a time
interval versus the number of units operating in that time period. The equation for the reliability estimator for
the standard actuarial method is given by:
where:
where:
Actuarial-Simple Example
A group of 55 units are put on a life test during which the units are evaluated every 50 hours. The results are:
Solution
The reliability estimates can be obtained by expanding the data table to include the calculations used in
the actuarial-simple method:
As can be determined from the preceding table, the reliability estimates for the failure times are:
Actuarial-Standard Method
The actuarial-standard model is a variation of the actuarial-simple model. In the actuarial-simple method, the
suspensions in a time period or interval are assumed to occur at the end of that interval, after the failures have
occurred. The actuarial-standard model assumes that the suspensions occur in the middle of the interval, which has
the effect of reducing the number of available units in the interval by half of the suspensions in that interval or:
With this adjustment, the calculations are carried out just as they were for the actuarial-simple model or:
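Both actuarial recursions fit in a few lines. The interval counts below are illustrative, not the 55-unit table from the example:

    # Illustrative per-interval (failures r_i, suspensions s_i) counts.
    intervals = [(2, 0), (3, 1), (2, 2), (1, 3)]

    n = 55.0                 # units entering the first interval
    R_simple = R_standard = 1.0
    for r, s in intervals:
        R_simple *= (n - r) / n                  # actuarial-simple
        n_adj = n - s / 2.0                      # remove half the suspensions
        R_standard *= (n_adj - r) / n_adj        # actuarial-standard
        n -= r + s                               # survivors carry over
        print(R_simple, R_standard)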
Actuarial-Standard Example
Use the data set from the Actuarial-Simple example and analyze it using the actuarial-standard method.
Solution
The solution to this example is similar to that in the Actuarial-Simple example, with the exception of the inclusion of
the adjusted units term that is used in the equation for the actuarial-standard method. Applying this equation to the data, we
can generate the following table:
As can be determined from the preceding table, the reliability estimates for the failure times are:
where:
where:
Once the variance has been calculated, the standard error can be determined by taking the square root of the
variance:
where:
and $\alpha$ is the desired confidence level for the 1-sided confidence bounds.
Chapter 18
Competing Failure Modes Analysis
where $n$ is the total number of failure modes considered. This is the product rule for the reliability of series systems
with statistically independent components, which states that the reliability of a series system is equal to the product
of the reliability values of the components comprising the system. Note that the above equation is the reliability
function based on any assumed life distribution. In Weibull++ this life distribution can be either the 2-parameter
Weibull, lognormal, normal or the 1-parameter exponential.
CFM Example
The following example demonstrates how you can use the reliability equation to determine the overall reliability of a
component. (This example has been abstracted from Example 15.6 from the Meeker and Escobar textbook Statistical
Methods for Reliability Data [27].)
An electronic component has two competing failure modes. One failure mode is due to random voltage spikes, which
cause failure by overloading the system. The other failure mode is due to wearout failures, which usually happen
only after the system has run for many cycles. The objective is to determine the overall reliability for the component
at 100,000 cycles.
30 units are tested, and the failure times are recorded in the following table. The failures that are due to the random
voltage spikes are denoted by a V. The failures that are due to wearout failures are denoted by a W.
The following analysis shows the data set for the wearout failure mode. Using the same analysis settings (i.e.,
Weibull distribution and MLE analysis method), the parameters for this mode are estimated.
The reliability of this failure mode at 100,000 cycles is then calculated.
Using the Reliability Equation to obtain the overall component reliability at 100,000 cycles, we get:
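With the two mode reliabilities in hand (the values below are placeholders for the estimates shown in the screenshots), the product rule gives the overall component reliability:

    # Placeholder mode reliabilities at 100,000 cycles (V = voltage spikes,
    # W = wearout); substitute the values estimated for each mode.
    R_V, R_W = 0.97, 0.89
    R_total = R_V * R_W          # series product rule for competing modes
    print(R_total)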
Variance/Covariance Matrix
The variances and covariances of the parameters are estimated from the inverse local Fisher matrix, as follows:
where $\Lambda$ is the log-likelihood function of the failure distribution, described in Parameter Estimation.
Bounds on Reliability
The competing failure modes reliability function is given by:
where:
• $R_i$ is the reliability of the $i$th mode.
• $n$ is the number of failure modes.
The upper and lower bounds on reliability are estimated using the logit transformation:
where $\hat{R}(t)$ is calculated using the reliability equation for competing failure modes, and $\hat{z}$ is defined by:
(If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds, and $\alpha = 1 - \delta$ for the one-sided bounds.)
The variance of $\hat{z}$ is estimated by:
Thus:
where:
Bounds on Time
The bounds on time are estimated by solving the reliability equation with respect to time. From the reliability equation
for competing failure modes, we have that:
where:
• is the inverse function of the reliability equation for competing failure modes.
and:
Then the upper and lower bounds on time are found by using the equations:
and:
is calculated using the inverse standard normal distribution and is computed as:
The k-out-of-n configuration is a special case of parallel redundancy. This type of configuration requires that at least
$k$ failure modes do not happen out of the total $n$ parallel failure modes for the product to succeed. The simplest case
of a k-out-of-n configuration is when the failure modes are independent and identical and have the same failure
distribution and uncertainties about the parameters (in other words, they are derived from the same test data). In this
case, the reliability of the product with such a configuration can be evaluated using the binomial distribution, or:
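For the identical, independent case, the binomial evaluation is direct:

    from math import comb

    def k_out_of_n_reliability(k, n, R):
        # Probability that at least k of the n identical, independent
        # failure modes do not happen, each with reliability R.
        return sum(comb(n, j) * R**j * (1.0 - R)**(n - j)
                   for j in range(k, n + 1))

    print(k_out_of_n_reliability(2, 3, 0.9))   # 2-out-of-3 at R = 0.9 -> 0.972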
In the case where the k-out-of-n failure modes are not identical, other approaches for calculating the reliability must
be used (e.g., the event space method). Discussion of these is beyond the scope of this reference. Interested readers
can consult the System Analysis Reference book.
Complex Systems
In many cases, it is not easy to recognize which components are in series and which are in parallel in a complex
system.
The previous configuration cannot be broken down into a group of series and parallel configurations. This is
primarily due to the fact that failure mode C has two paths leading away from it, whereas B and D have only one.
Several methods exist for obtaining the reliability of a complex configuration including the decomposition method,
the event space method and the path-tracing method. Discussion of these is beyond the scope of this reference.
Interested readers can consult the System Analysis Reference book.
Solution
The reliability block diagram (RBD) approach can be used to analyze the reliability of the product. But before
creating a diagram, the data sets of the failure modes need to be segregated so that each mode can be represented by
a single block in the diagram. Recall that when you analyze a particular mode, the failure times for all other
competing modes are considered to be suspensions. This captures the fact that those units operated for a period of
time without experiencing the failure mode of interest before they were removed from observation when they failed
due to another mode. We can easily perform this step via Weibull++'s Batch Auto Run utility. To do this, enter the
data from the table into a single data sheet. Choose the 2P-Weibull distribution and the MLE analysis method, and
then click the Batch Auto Run icon on the control panel. When prompted to select the subset IDs, select them all.
Click the Processing Preferences tab. In the Extraction Options area, select the second option, as shown next.
This will extract the data sets that are required for the analysis. Select the check box in the Calculation Options area
and click OK. The data sets are extracted into separate data sheets in the folio and automatically calculated.
Next, create a diagram by choosing Insert > Tools > Diagram. Add blocks by right-clicking the diagram and
choosing Add Block on the shortcut menu. When prompted to select the data sheet of the failure mode that the block
will represent, select the data sheet for mode A. Use the same approach to add the blocks that will represent failure
modes B, C, D and E. Add a connector by right-clicking the diagram sheet and choosing Connect Blocks, and then
connect the blocks in an appropriate configuration to describe the relationships between the failure modes. To insert
a node, which acts as a switch that the diagram paths move through, right-click the diagram and choose Add Node.
Specify the number of required paths in the node by double-clicking the node and entering the appropriate number
(use 2 in both nodes).
The following figure shows the completed diagram.
Click Analyze to analyze the diagram, and then use the Quick Calculation Pad (QCP) to estimate the reliability. The
estimated R(100 hours) and the 90% two-sided confidence bounds are:
Chapter 19
Warranty Data Analysis
At the end of the analysis period, all of the units that were shipped and have not failed in the time since shipment are
considered to be suspensions. This process is repeated for each shipment and the results tabulated for each particular
failure and suspension time prior to reliability analysis. This process may sound confusing, but it is actually just a
matter of careful bookkeeping. The following example illustrates this process.
Example
Nevada Chart Format Calculations Example
A company keeps track of its shipments and warranty returns on a month-by-month basis. The following table
records the shipments in June, July and August, and the warranty returns through September:
We will examine the data month by month. In June 100 units were sold, and in July 3 of these units were returned.
This gives 3 failures at one month for the June shipment. Likewise, 3 more
failures occurred in August (failures at two months) and 5 occurred in September (failures at three months) for this shipment.
Consequently, at the end of our three-month analysis period, there were a total of 11 failures for the 100 units
shipped in June. This means that 89 units are presumably still operating, and can be considered suspensions at three
months. For the shipment of 140 in July, 2 were returned the following month (failures at one month),
and 4 more were returned the month after that (failures at two months). After two months, there are 134
($140 - 2 - 4 = 134$) units from the July shipment still operating, which are suspensions at two months. For the final shipment of
150 in August, 4 fail in September (failures at one month), with the remaining 146 units being suspensions at one month.
It is now a simple matter to add up the number of failures for 1, 2, and 3 months, then add the suspensions to get our
reliability data set:
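The bookkeeping itself is mechanical, as this sketch of the conversion for the chart above shows:

    # Shipments and month-by-month returns from the Nevada chart above.
    shipments = {"Jun": 100, "Jul": 140, "Aug": 150}
    returns = {"Jun": [3, 3, 5], "Jul": [2, 4], "Aug": [4]}

    failures, suspensions = {}, {}
    for month, shipped in shipments.items():
        rets = returns[month]
        for age, count in enumerate(rets, start=1):
            failures[age] = failures.get(age, 0) + count
        # Unreturned units are suspensions at the last observed age.
        suspensions[len(rets)] = suspensions.get(len(rets), 0) + shipped - sum(rets)

    print(failures)      # {1: 9, 2: 7, 3: 5}
    print(suspensions)   # {3: 89, 2: 134, 1: 146}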
More Nevada chart format warranty analysis examples are available! See also:
Warranty Analysis Example [1] or Watch the video... [2]
Time-to-Failure Format
This format is similar to the standard folio data entry format (all number of units, failure times and suspension times
are entered by the user). The difference is that when the data is used within the context of warranty analysis, the
ability to generate forecasts is available to the user.
Example
Times-to-Failure Format Warranty Analysis
Assume that we have the following information for a given product.
Times-to-Failure Data
Number in State State F or S State End Time (Hr)
2 F 100
3 F 125
5 F 175
1500 S 200
Future Sales
Quantity In-Service Time (Hr)
500 200
400 300
100 500
Use the time-to-failure warranty analysis folio to analyze the data and generate a forecast for future returns.
Solution
Create a warranty analysis folio and select the times-to-failure format. Enter the data from the tables in the Data and
Future Sales sheets, and then analyze the data using the 2P-Weibull distribution and RRX analysis method. The
parameters are estimated to be beta = 3.199832 and eta = 814.293442.
Click the Forecast icon on the control panel. In the Forecast Setup window, set the forecast to start on the 100th
hour and set the number of forecast periods to 5. Set the increment (length of each period) to 100, as shown next.
Click OK. A Forecast sheet will be created, with the following predicted future returns.
We will use the first row to explain how the forecast for each cell is calculated. For example, there are 1,500 units
with a current age of 200 hours. The probability of failure in the next 100 hours can be calculated in the QCP, as
follows.
Therefore, the predicted number of failures for the first 100 hours is:
This is identical to the result given in the Forecast sheet (shown in the 3rd cell in the first row) of the analysis. The
bounds and the values in other cells can be calculated similarly.
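Using the parameters estimated above, the first cell reduces to the conditional probability of failure times the number of units at risk:

    import numpy as np

    beta, eta = 3.199832, 814.293442        # estimates from this example

    def R(t):
        return np.exp(-(t / eta) ** beta)

    # 1,500 units aged 200 hours: conditional probability of failing in
    # the next 100 hours, times the units at risk.
    q = 1.0 - R(300.0) / R(200.0)
    print(1500 * q)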
All the plots that are available for the standard folio are also available in the warranty analysis, such as the
Probability plot, Reliability plot, etc. One additional plot in warranty analysis is the Expected Failures plot, which
shows the expected number of failures over time. The following figure shows the Expected Failures plot of the
example, with confidence bounds.
Example
Dates of Failure Warranty Analysis
Assume that a company has the following information for a product.
Sales
Quantity In-Service Date In-Service
6316 1/1/2010
8447 2/1/2010
5892 3/1/2010
596 4/1/2010
996 5/1/2010
8977 6/1/2010
2578 7/1/2010
8318 8/1/2010
2667 9/1/2010
7452 10/1/2010
1533 11/1/2010
9393 12/1/2010
1966 1/1/2011
8960 2/1/2011
6341 3/1/2011
4005 4/1/2011
3784 5/1/2011
5426 6/1/2011
4958 7/1/2011
6981 8/1/2011
Returns
Quantity Returned Date of Return Date In-Service
2 10/29/2010 10/1/2010
1 11/13/2010 10/1/2010
2 3/15/2011 10/1/2010
5 4/10/2011 10/1/2010
1 11/13/2010 11/1/2010
2 2/19/2011 11/1/2010
1 3/11/2011 11/1/2010
2 5/18/2011 11/1/2010
1 1/9/2011 12/1/2010
2 2/13/2011 12/1/2010
1 3/2/2011 12/1/2010
1 6/7/2011 12/1/2010
1 4/28/2011 1/1/2011
2 6/15/2011 1/1/2011
3 7/15/2011 1/1/2011
1 8/10/2011 2/1/2011
1 8/12/2011 2/1/2011
1 8/14/2011 2/1/2011
Future Sales
Quantity In-Service   Date In-Service
5000   9/1/2011
5000   10/1/2011
5000   11/1/2011
5000   12/1/2011
5000   1/1/2012
Use the given information to estimate the failure distribution of the product and forecast warranty returns.
Solution
Create a warranty analysis folio using the dates of failure format. Enter the data from the tables in the Sales, Returns
and Future Sales sheets. On the control panel, click the Auto-Set button to automatically set the end date to the last
day the warranty data were collected (September 14, 2011). Analyze the data using the 2P-Weibull distribution and
RRX analysis method. The parameters are estimated to be beta = 1.315379 and eta = 102,381.486165.
The warranty folio automatically converts the warranty data into a format that can be used in a Weibull++ standard
folio. To see this result, click anywhere within the Analysis Summary area of the control panel to open a report, as
shown next (showing only the first 35 rows of data). In this example, rows 23 to 60 show the time-to-failure data that
resulted from the conversion.
To generate a forecast, click the Forecast icon on the control panel. In the Forecast Setup window, set the forecast to
start on September 2011 and set the number of forecast periods to 6. Set the increment (length of each period) to 1
Month, as shown next.
Click OK. A Forecast sheet will be created, with the predicted future returns. Note that the first forecast will start on
September 15, 2011 because the end of observation period was set to September 14, 2011.
Click the Plot icon and choose the Expected Failures plot. The plot displays the predicted number of returns for
each month, as shown next.
Usage Format
Often, the driving factor for reliability is usage rather than time. For example, in the automotive industry, the failure
behavior of the majority of products is mileage-dependent rather than time-dependent. The usage format allows
the user to convert shipping and warranty return data into standard reliability data of failures and suspensions
when the return information is based on usage rather than return dates or periods. Similar to the dates of failure
format, each return (failure) is identified by the return quantity and the date the unit was put in service; this in-service
date associates the returned unit with the lot it belonged to when it started operation. However, the return data is in
terms of usage and not date of return. Therefore, the usage of the units needs to be specified as a constant usage per
unit time or as a distribution. This allows for determining the expected usage of the surviving units.
Suppose that you have been collecting sales (units in service) and returns data. For the returns data, you can
determine the number of failures and their usage by reading the odometer value, for example. Determining the
number of surviving units (suspensions) and their ages is a straightforward step. By taking the difference between
the analysis date and the date when a unit was put in service, you can determine the age of the surviving units.
What is unknown, however, is the exact usage accumulated by each surviving unit. The key part of the usage-based
warranty analysis is the determination of the usage of the surviving units based on their age. Therefore, the analyst
needs to have an idea about the usage of the product. This can be obtained, for example, from customer surveys or
by designing the products to collect usage data. For example, in automotive applications, engineers often use 12,000
miles/year as the average usage. Based on this average, the usage of an item that has been in the field for 6 months
and has not yet failed would be 6,000 miles. So to obtain the usage of a suspension based on an average usage, one
could take the time of each suspension and multiply it by this average usage. In this situation, the analysis becomes
straightforward. With the usage values and the quantities of the returned units, a failure distribution can be
constructed and subsequent warranty analysis becomes possible.
Alternatively, and more realistically, instead of using an average usage, an actual distribution that reflects the
variation in usage and customer behavior can be used. This distribution describes the usage of a unit over a certain
time period (e.g., 1 year, 1 month, etc). This probabilistic model can be used to estimate the usage for all surviving
components in service and the percentage of users running the product at different usage rates. In the automotive
example, for instance, such a distribution can be used to calculate the percentage of customers that drive 0-200
miles/month, 200-400 miles/month, etc. We can take these percentages and multiply them by the number of
suspensions to find the number of items that have been accumulating usage values in these ranges.
To proceed with applying a usage distribution, the usage distribution is divided into increments based on a specified
interval width, denoted as $Z$. The usage distribution, $Q$, is divided into intervals of $0 + Z$, $Z + Z$, $2Z + Z$,
etc., or $x_i = x_{i-1} + Z$, as shown in the next figure.
The interval width should be selected such that it creates segments that are large enough to contain adequate
numbers of suspensions within the intervals. The percentage of suspensions that belong to each usage interval is
calculated as follows:
where:
where:
Note: $Z$ is in the same time units as the period in which the usage distribution is defined.
For each usage interval $x_i$, the usage is calculated as:
After this step, the usage of each suspension group is estimated. This data can be combined with the failures data set,
and a failure distribution can be fitted to the combined data.
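A sketch of the interval bookkeeping, assuming (purely as an illustration) a lognormal monthly usage distribution and taking each interval's usage at its midpoint:

    import numpy as np
    from scipy.stats import lognorm

    # Illustrative monthly usage distribution and interval width Z (miles).
    usage = lognorm(s=0.5, scale=1000.0)
    Z = 1000.0

    n_suspensions = 8     # suspensions in one shipment group
    age_months = 12.0     # time in service for that group

    edges = np.arange(0.0, 6000.0 + Z, Z)
    frac = np.diff(usage.cdf(edges))          # fraction per usage interval
    counts = n_suspensions * frac             # suspensions per interval
    mileage = (edges[:-1] + Z / 2.0) * age_months   # midpoint usage x age
    for c, m in zip(counts, mileage):
        print(round(c, 3), m)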
Example
Warranty Analysis Usage Format Example
Suppose that an automotive manufacturer collects the warranty returns and sales data given in the following tables.
Convert this information to life data and analyze it using the lognormal distribution.
Sales
Quantity In-Service   Month In-Service
9    Dec-09
13   Jan-10
15   Feb-10
20   Mar-10
15   Apr-10
25   May-10
19   Jun-10
16   Jul-10
20   Aug-10
19   Sep-10
25   Oct-10
30   Nov-10
Returns
Quantity Returned   Usage at Return (miles)   Month In-Service
1   9072   Dec-09
1   9743   Jan-10
1   6857   Feb-10
1   7651   Mar-10
1   5083   May-10
1   5990   May-10
1   7432   May-10
1   8739   May-10
1   3158   Jun-10
1   1136   Jul-10
1   4646   Aug-10
1   3965   Sep-10
1   3117   Oct-10
1   3250   Nov-10
Solution
Create a warranty analysis folio and select the usage format. Enter the data from the tables in the Sales, Returns and
Future Sales sheets. The warranty data were collected until 12/1/2010; therefore, on the control panel, set the End
of Observation Period to that date. Set the failure distribution to Lognormal, as shown next.
In this example, the manufacturer has been documenting the mileage accumulation per year for this type of product
across the customer base in comparable regions for many years. The yearly usage has been determined to follow a
lognormal distribution with known parameters. The Interval Width is defined to be 1,000 miles. Enter the
information about the usage distribution on the Suspensions page of the control panel, as shown next.
Click Calculate to analyze the data set. The parameters are estimated to be:
The reliability plot (with mileage being the random variable driving reliability), along with the 90% confidence
bounds on reliability, is shown next.
In this example, the life data set contains 14 failures and 212 suspensions spread according to the defined usage
distribution. You can display this data in a standard folio by choosing Warranty > Transfer Life Data > Transfer
Life Data to New Folio. The failures and suspensions data set, as presented in the standard folio, is shown next
(showing only the first 30 rows of data).
To illustrate the calculations behind the results of this example, consider the 9 units that went in service on
December 2009. 1 unit failed from that group; therefore, 8 suspensions have survived from December 2009 until the
beginning of December 2010, a total of 12 months. The calculations are summarized as follows.
The two columns on the right constitute the calculated suspension data (number of suspensions and their usage) for
the group. The calculation is then repeated for each of the remaining groups in the data set. These data are then
combined with the data about the failures to form the life data set that is used to estimate the failure distribution
model.
Warranty Prediction
Once a life data analysis has been performed on warranty data, this information can be used to predict how many
warranty returns there will be in subsequent time periods. This methodology uses the concept of conditional
reliability (see Basic Statistical Background) to calculate the probability of failure for the remaining units for each
shipment time period. This conditional probability of failure is then multiplied by the number of units at risk from
that particular shipment period that are still in the field (i.e., the suspensions) in order to predict the number of
failures or warranty returns expected for this time period. The next example illustrates this.
Example
Using the data in the following table, predict the number of warranty returns for October for each of the three
shipment periods. Use the following Weibull parameters, beta = 2.4928 and eta = 6.6951.
Solution
Use the Weibull parameter estimates to determine the conditional probability of failure for each shipment time
period, and then multiply that probability with the number of units that are at risk for that period as follows. The
equation for the conditional probability of failure is given by:
For the June shipment, there are 89 units that have successfully operated until the end of September (an exposure
time of three months). The probability of one of these units failing in the next month is then given by:
Once the probability of failure for an additional month of operation is determined, the expected number of failed
units during the next month, from the June shipment, is the product of this probability and the number of units at risk,
or:
This is then repeated for the July shipment, where there were 134 units operating at the end of September, with an
exposure time of two months. The probability of failure in the next month is:
For the August shipment, there were 146 units operating at the end of September, with an exposure time of one
month. The probability of failure in the next month is:
Thus, the total expected returns from all shipments for the next month is the sum of the above, or 29 units. This
method can easily be repeated for different future sales periods, utilizing projected shipments. If the user lists the
number of units that are expected to be sold or shipped during future periods, then these units are added to the number
of units at risk whenever they are introduced into the field. The Generate Forecast functionality in the Weibull++
warranty analysis folio can automate this process for you.
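A sketch reproducing the arithmetic of this section, with the stated parameters and risk sets:

    import numpy as np

    beta, eta = 2.4928, 6.6951      # months, as given above

    def R(t):
        return np.exp(-(t / eta) ** beta)

    # (units at risk, months already survived) for June, July and August.
    at_risk = [(89, 3), (134, 2), (146, 1)]
    total = 0.0
    for n, T in at_risk:
        q = 1.0 - R(T + 1.0) / R(T)   # conditional probability for one month
        total += n * q
    print(round(total))                # about 29 expected returns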
Example
Warranty Analysis Non-Homogeneous Data Example
A company keeps track of its production and returns. The company uses the dates of failure format to record the
data. For the product in question, three versions (A, B and C) have been produced and put in service. The in-service
data is as follows (using the Month/Day/Year date format):
The return data are as follows. Note that in order to identify which lot each unit comes from, and to be able to
compute its time-in-service, each return (failure) includes a return date, the date of when it was put in service and the
model ID.
Assuming that the given information is current as of 5/1/2006, analyze the data using the lognormal distribution and
MLE analysis method for all models (Model A, Model B, Model C), and provide a return forecast for the next ten
months.
Solution
Create a warranty analysis folio and select the dates of failure format. Enter the data from the tables in the Sales,
Returns and Future Sales sheets. On the control panel, select the Use Subsets check box, as shown next. This
allows the software to separately analyze each subset of data. Use the drop-down list to switch between subset IDs
and alter the analysis settings (use the lognormal distribution and MLE analysis method for all models).
In the End of Observation Period field, enter 5/1/2006, and then calculate the parameters. The results are:
Note that in this example, the same distribution and analysis method were assumed for each of the product models. If
desired, different distribution types, analysis methods, confidence bounds methods, etc., can be assumed for each
ID.
To obtain the expected failures for the next 10 months, click the Generate Forecast icon. In the Forecast Setup
window, set the forecast to start on May 2, 2006 and set the number of forecast periods to 10. Set the increment
(length of each period) to 1 Month, as shown next.
Click OK. A Forecast sheet will be created, with the predicted future returns. The following figure shows part of the
Forecast sheet.
To view a summary of the analysis, click the Show Analysis Summary (...) button. The following figure shows the
summary of the forecasted returns.
Click the Plot icon and choose the Expected Failures plot. The plot displays the predicted number of returns for
each month, as shown next.
where $\hat{F}_{ij}$ is the estimated number of failures, based on the estimated distribution parameters, for the sales period $i$
and the return period $j$, calculated using the equation for the conditional probability, and $F_{ij}$ is the actual
number of failures for the sales period $i$ and the return period $j$.
Since we are assuming that the model is accurate, the prediction error should follow a normal distribution with a mean value of zero
and a standard deviation $\sigma$, where:
The estimated standard deviation of the prediction errors can then be calculated by:
• For an entire sales period $i$, abnormality is detected if the sum of the squared standardized errors exceeds the corresponding chi-squared value, where the degrees of freedom equal the total number of return periods in that sales period.
Example
Example Using SPC for Warranty Analysis Data
Using the data from the following table, the expected returns for each sales period can be obtained using conditional
reliability concepts, as given in the conditional probability equation.
For example, for the month of September, the expected return number is given by:
The actual number of returns during this period is five; thus, the prediction error for this period is:
This can then be repeated for each cell, yielding the following table of prediction errors:
The $z^2$ values for each cell are given in the following table.
If the critical value is set at $\alpha = 0.01$ and the caution value is set at $\alpha = 0.1$, then the critical and caution
values will be:
If we consider the sales periods as the basis for outlier detection, then after comparing the above table to the sum of
$z^2$ values for each sales period, we find that none of the sales values exceed the critical and caution limits.
For example, the total $z^2$ value for the July sales month is 0.6085. Its degrees of freedom is 2, so the corresponding
caution and critical values are 4.6052 and 9.2103, respectively. Both values are larger than 0.6085, so the return
numbers of the July sales period do not deviate (based on the chosen significance) from the model's predictions.
If we consider the return periods as the basis for outlier detection, then after comparing the above table to the sum of
$z^2$ values for each return period, we find that none of the return values exceed the critical and caution limits.
For example, the total $z^2$ value for the August return period is 3.7157. Its degrees of freedom is 3, so the
corresponding caution and critical values are 6.2514 and 11.3449, respectively. Both values are larger than 3.7157, so
the return numbers for the August return period do not deviate from the model's predictions.
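The caution and critical thresholds quoted above are ordinary chi-squared percentiles, easily reproduced:

    from scipy.stats import chi2

    # Caution at alpha = 0.1 and critical at alpha = 0.01.
    for dof in (2, 3):
        caution = chi2.ppf(1.0 - 0.1, dof)
        critical = chi2.ppf(1.0 - 0.01, dof)
        print(dof, round(caution, 4), round(critical, 4))
    # dof 2 -> 4.6052 and 9.2103; dof 3 -> 6.2514 and 11.3449

    # A period is flagged if its summed z^2 value exceeds these limits,
    # e.g., July's total of 0.6085 with 2 degrees of freedom is well below both.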
This analysis can be automatically performed in Weibull++ by entering the alpha values in the Statistical Process
Control page of the control panel and selecting which period to color code, as shown next.
To view the table of chi-squared values (the $z$ or $z^2$ values), click the Show Results (...) button.
Weibull++ automatically color codes SPC results for easy visualization in the returns data sheet. By default, the
green color means that the return number is normal; the yellow color indicates that the return number is larger than
the caution threshold but smaller than the critical value; the red color means that the return is abnormal, meaning that
the return number is either too big or too small compared to the predicted value.
In this example, all the cells are coded in green for both analyses (i.e., by sales periods or by return periods),
indicating that all returns fall within the caution and critical limits (i.e., nothing abnormal). Another way to visualize
this is by using a Chi-Squared plot for the sales period and return period, as shown next.
Example
Using Subset IDs with Statistical Process Control
A manufacturer wants to monitor and analyze the warranty returns for a particular product. They collected the
following sales and return data.
Solution
Analyze the data using the two-parameter Weibull distribution and the MLE analysis method. The parameters are
estimated to be:
To analyze the warranty returns, select the check box in the Statistical Process Control page of the control panel
and set the alpha values to 0.01 for the Critical Value and 0.1 for the Caution Value. Select to color code the results
By sales period. The following figure shows the analysis settings and results of the analysis.
As you can see, the November 04 and March 05 sales periods are colored in yellow, indicating that they are outlier
sales periods, while the rest are green. One suspected reason for the variation may be the material used in production
during these periods. Further analysis confirmed that for these periods, the material was acquired from a different
supplier. This implies that the units are not homogenous, and that there are different sub-populations present in the
field population.
Categorize each shipment (using the Subset ID column) based on its material supplier, as shown next. On the
control panel, select the Use Subsets check box. Perform the analysis again using the two-parameter Weibull
distribution and the MLE analysis method for both sub-populations.
This analysis uncovered different sub-populations in the data set. Note that if the analysis were performed on the
failure and suspension times in a regular standard folio using the mixed Weibull distribution, one would not be able
to detect which units fall into which sub-population.
References
[1] http://www.reliasoft.com/Weibull/examples/rc5/index.htm
[2] http://www.reliasoft.tv/weibull/appexamples/weibull_app_ex_5.html
Chapter 20
Recurrent Event Data Analysis
The non-parametric model for a population of units is described as the population of cumulative history functions
(curves). It is the population of all staircase functions of every unit in the population. At age t, the units have a
distribution of their cumulative number of events. That is, a fraction of the population has accumulated 0
recurrences, another fraction has accumulated 1 recurrence, another fraction has accumulated 2 recurrences, etc. This
distribution differs at different ages $t$, and has a mean $M(t)$ called the mean cumulative function (MCF). The
$M(t)$ is the point-wise average of all population cumulative history functions (see figure below).
For the case of uncensored data, the mean cumulative function values M*(t_i) at different recurrence ages t_i are
estimated by calculating the average of the cumulative number of recurrences of events for each unit in the
population at t_i. When the histories are censored, the following steps are applied.
1st Step - Order all ages:
Order all recurrence and censoring ages from smallest to largest. If a recurrence age for a unit is the same as its
censoring (suspension) age, then the recurrence age goes first. If multiple units have a common recurrence or
censoring age, then these units could be put in a certain order or be sorted randomly.
2nd Step - Calculate the number, r_i, of units that passed through age t_i:
N is the total number of units, and r_1 = N at the first observed age, which could be a recurrence or a suspension.
3rd Step - Calculate the MCF estimate, M*(t):
For each sample recurrence age t_i, calculate the mean cumulative function estimate as follows
where r_i is defined in the previous step and the calculation is restricted to the set of units that have not been
suspended by t_i; the event indicator for each unit is defined as follows:
In case there are multiple events at the same time t_i, the MCF is calculated sequentially for each event. For each
event, only one indicator can take the value of 1. Once all the events at t_i are calculated, the final calculated MCF
and its variance are the values for time t_i. This is illustrated in the following example.
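The following Python sketch carries out these steps for the five-unit data set tabulated below (F denotes a recurrence, S a suspension). Each recurrence adds 1/r_i to the running MCF estimate, and each suspension removes a unit from the risk set:

events = [
    (1, 5, "F"), (2, 6, "F"), (1, 10, "F"), (3, 12, "F"), (2, 13, "F"),
    (4, 13, "F"), (1, 15, "F"), (4, 15, "F"), (5, 16, "F"), (2, 17, "F"),
    (1, 17, "S"), (2, 19, "S"), (3, 20, "F"), (5, 22, "F"), (4, 24, "S"),
    (3, 25, "F"), (5, 25, "F"), (3, 26, "S"), (5, 28, "S"),
]

r = 5        # number of units at risk at the first observed age
mcf = 0.0    # running MCF estimate, M*(t)
for unit, age, state in events:
    if state == "F":
        mcf += 1.0 / r                       # a recurrence adds 1/r_i
        print(f"t = {age:2d}  r_i = {r}  M*(t) = {mcf:.4f}")
    else:
        r -= 1                               # a suspension shrinks the risk set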
Using the MCF variance equation, the following table of variance values can be obtained:
ID Months State r_i
1 5 F 5
2 6 F 5
1 10 F 5
3 12 F 5
2 13 F 5
4 13 F 5
1 15 F 5
4 15 F 5
5 16 F 5
2 17 F 5
1 17 S 4
2 19 S 3
3 20 F 3
5 22 F 3
4 24 S 2
3 25 F 2
5 25 F 2
3 26 S 1
5 28 S 0
Using the equation for the MCF bounds at a 95% confidence level, the confidence bounds can be obtained as
follows:
The analysis presented in this example can be performed automatically in Weibull++'s non-parametric RDA folio, as
shown next.
Note: In the folio above, the F refers to failures and the S refers to suspensions (or censoring ages). The results, with
the calculated MCF values and upper and lower 95% confidence limits, are shown next along with the graphical plot.
Let x_i represent the time between failures (x_i = t_i − t_{i−1}). Assume that after each event, actions are
taken to improve the system performance. Let q be the action effectiveness factor. There are two GRP models:
Type I: v_i = v_{i−1} + q·x_i
Type II: v_i = q·(v_{i−1} + x_i)
where v_i is the virtual age of the system right after the i-th repair. The Type I model assumes that the i-th repair
cannot remove the damage incurred before the i-th repair. It can only reduce the additional age x_i to q·x_i. The
Type II model assumes that at the i-th repair, the virtual age has been accumulated to v_{i−1} + x_i. The i-th repair
will remove the cumulative damage from both current and previous failures by reducing the virtual age to
q·(v_{i−1} + x_i).
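A minimal Python sketch of the two virtual age recursions written out above (the failure times and the q value are illustrative only):

def virtual_ages(failure_times, q, model="I"):
    # Type I:  v_i = v_{i-1} + q*x_i      (repair reduces only the added age)
    # Type II: v_i = q*(v_{i-1} + x_i)    (repair reduces the accumulated age)
    v, t_prev, ages = 0.0, 0.0, []
    for t in failure_times:
        x = t - t_prev                    # x_i: time between successive failures
        v = v + q * x if model == "I" else q * (v + x)
        ages.append(v)
        t_prev = t
    return ages

print(virtual_ages([10.0, 25.0, 33.0], q=0.3, model="I"))    # [3.0, 7.5, 9.9]
print(virtual_ages([10.0, 25.0, 33.0], q=0.3, model="II"))   # [3.0, 5.4, 4.02]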
The power law function is used to model the rate of recurrence, which is: λ(t) = λ·β·t^(β−1).
The MLE method is used to estimate the model parameters. The log likelihood function is discussed in Mettas and
Zhao [28]:
where n is the total number of events during the entire observation period and T is the stop time of the observation;
T = t_n if the observation stops right after the last event.
Confidence Bounds
In general, in order to obtain the virtual age, the exact occurrence time of each event (failure) should be available
(see equations for Type I and Type II models). However, the times are unknown until the corresponding events
occur. For this reason, there are no closed-form expressions for total failure number and failure intensity, which are
functions of failure times and virtual age. Therefore, in Weibull++, a Monte Carlo simulation is used to predict
values of virtual time, failure number, MTBF and failure rate. The approximate confidence bounds obtained from
simulation are provided. The uncertainty of model parameters is also considered in the bounds.
The first term accounts for the uncertainty of the parameter estimation. The second term considers the uncertainty
caused by the renewal process even when the model parameters are fixed. However, unless q = 1, the variance
cannot be calculated in closed form because the number of failures cannot be expressed as a closed-form function
of λ, β and q. In order to consider the uncertainty of the parameter estimation, the first term is approximated by:
By conducting this approximation, the uncertainty of λ and β is considered. The value of the first term and the
value of the second term in the equation for the variance of the number of failures are obtained through the Monte
Carlo simulation, using the ML estimators as the parameter values. The same simulation is used to estimate the
cumulative number of failures N(t).
Once the variance and the expected value of N(t) have been obtained, the bounds can be calculated by assuming
that N(t) is lognormally distributed as:
The upper and lower bounds for a given confidence level can be calculated by:
In Weibull++, the final upper bound is set to the smaller of the upper bounds obtained from the lognormal and
normal distribution approximations, and the final lower bound is set to the larger of the corresponding lower
bounds. This combined method prevents out-of-range bounds when the values are small.
where v(T) is the virtual age at time T: beyond the last observed event time it is obtained from simulation;
otherwise it follows from the model equations for Type I and Type II.
The variance of instantaneous failure intensity can be calculated by:
The expected value and variance of the virtual age are obtained from the Monte Carlo simulation, using the ML
parameter estimates. Because of the simulation accuracy and the convergence problem in the calculation, the
estimated variance of the instantaneous failure intensity can be negative at some time points. When this happens,
the bounds of the instantaneous failure intensity are not provided.
Once the variance and the expected value of the instantaneous failure intensity are obtained, the bounds can be
calculated by assuming that the intensity is lognormally distributed as:
The upper and lower bounds for a given confidence level can be calculated by:
In Weibull++, the upper bound is set to the smaller of the two upper bounds obtained from the above lognormal and
normal distribution approximations, and the lower bound is set to the larger of the two corresponding lower bounds.
This combination method prevents out-of-range bounds when the values are small.
For a given time t, the expected value of the cumulative MTBF is:
The upper and lower bounds can be easily obtained from the corresponding bounds of N(t):
v(t) is the virtual age corresponding to time t. The expected value and the variance are obtained from the Monte
Carlo simulation. The variance of the conditional reliability is:
Because of the simulation accuracy and the convergence problem in the calculation, the estimated variance of the
conditional reliability can be negative at some time points. When this happens, the bounds are not provided.
The bounds are based on:
The smaller of the two upper bounds will be the final upper bound and the larger of the two lower bounds will be the
final lower bound.
1. Estimate the GRP model parameters using the Type I virtual age option.
2. Plot the failure number and instantaneous failure intensity vs. time with 90% two-sided confidence bounds.
3. Plot the conditional reliability vs. time with 90% two-sided confidence bounds. The mission start time is 40 and
mission time is varying.
4. Using the QCP, calculate the expected failure number and expected instantaneous failure intensity by time 1800.
Solution
Enter the data into a parametric RDA folio in Weibull++. On the control panel, select the 3 parameters option and
the Type I setting. Keep the default simulation settings. Click Calculate.
4. Using the QCP, the failure number and instantaneous failure intensity are:
Chapter 21
Degradation Data Analysis
• Lloyd-Lipow: y = a − b/t
where y represents the performance, t represents time, and a and b are model parameters to be solved for.
Once the model parameters are estimated for each sample i, a time x_i can be extrapolated, which corresponds to
the defined level of failure y. The computed x_i values can now be used as our times-to-failure for subsequent life
data analysis. As with any sort of extrapolation, one must be careful not to extrapolate too far beyond the actual
range of data in order to avoid large inaccuracies (modeling errors).
Example
Crack Propagation Example (Point Estimation)
Five turbine blades are tested for crack propagation. The test units are cyclically stressed and inspected every
100,000 cycles for crack length. Failure is defined as a crack of length 30mm or greater. The following table shows
the test results for the five units at each cycle:
The first step is to solve the underlying exponential equation, y = b·e^(a·t), for a and b for each of the test units.
Using regression analysis, the values for each of the test units are:
Substituting the values into the underlying exponential model, solve for t, or:
t = ln(y/b)/a
Using the values of a and b, with y = 30 mm, the resulting time at which the crack length reaches 30 mm can then
be found for each sample:
These times-to-failure can now be analyzed using traditional life data analysis to obtain metrics such as the
probability of failure, B10 life, mean life, etc. This analysis can be automatically performed in the Weibull++
degradation analysis folio.
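A minimal Python sketch of this point-estimation procedure for a single unit, assuming the exponential degradation model y = b·e^(a·t): regress ln(y) on t to estimate a and b, then solve for the time at which y reaches the 30 mm critical crack length. The cycle and crack-length values are hypothetical, not the data from the table above:

import numpy as np

cycles = np.array([1e5, 2e5, 3e5, 4e5, 5e5])          # inspection intervals
length = np.array([10.0, 12.5, 15.5, 19.0, 24.0])     # hypothetical crack lengths, mm

a, ln_b = np.polyfit(cycles, np.log(length), 1)       # fit ln(y) = a*t + ln(b)
b = np.exp(ln_b)
t_fail = np.log(30.0 / b) / a                         # invert y = b*exp(a*t) at y = 30
print(f"a = {a:.3e}, b = {b:.2f}, extrapolated cycles-to-failure = {t_fail:,.0f}")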
The variance and covariance of the model parameters are calculated using Least Squares Estimation. The details of
the calculation are not given here.
The 2-sided upper and lower bounds for the predicted failure time, at a given confidence level, are:
Example
Crack Propagation Example (Extrapolated Intervals)
Using the same data set from the previous example, predict the interval failure times for the turbine blades.
Solution
In the Weibull++ degradation analysis folio, select the Use extrapolated intervals check box, as shown next.
Use the exponential degradation model for the degradation analysis, and the Weibull distribution parameters with
MLE for the life data analysis. The following report shows the estimated degradation model parameters.
• Linear: y = a·t + b
• Exponential: y = b·e^(a·t)
• Power: y = b·t^a
• Logarithm: y = a·ln(t) + b
• Lloyd-Lipow: y = a − b/t
The distribution and degradation model parameters are then calculated using Maximum Likelihood Estimation
(MLE).
For example, if a normal distribution is used to represent the degradation measurement and a linear degradation
model is assumed, then the standard deviation, σ, will be assumed constant with time, and the mean, μ, will be:
μ(t) = a·t + b
Given the CDF, the parameters a, b and σ are estimated using Maximum Likelihood Estimation.
Assuming that the requirement is that the measurement needs to be greater than a critical degradation value, D, for
the product to fail, the probability of failure at time t will be:
It should be noted that the failure threshold could be specified as a degradation measurement less than (for the case
of decreasing degradation) or greater than (for the case of increasing degradation) a critical degradation value.
The relationship between the distribution of degradation measurement and the distribution of failure time is
illustrated in the following plot.
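A minimal Python sketch of this probability calculation for the normal measurement and linear mean case, with hypothetical parameter values for a decreasing degradation trend and a critical value D = 150:

from scipy.stats import norm

a, b, sigma = -25.0, 300.0, 10.0    # hypothetical fitted values: mu(t) = a*t + b
D = 150.0                           # critical degradation value

def prob_failure(t):
    # For decreasing degradation, failure means the measurement falls to D or below
    return norm.cdf((D - (a * t + b)) / sigma)

for t in (4, 5, 6):
    print(f"t = {t}: F(t) = {prob_failure(t):.4f}, R(t) = {1 - prob_failure(t):.4f}")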
Example
A company has been collecting degradation data over a period of 4 years with the purpose of calculating reliability
after 5 years. With the degradation measurement decreasing with time, failure is defined as a measurement of 150 or
below. In order to obtain such measurements, the unit has to be destroyed and therefore removed from the
population.
The following table gives the degradation measurements over 4 years.
Do the following:
1. Estimate parameters using the Linear degradation model and the 2-Parameter Weibull distribution
2. Plot the degradation curve vs. time
3. Calculate the reliability at 5 years.
Solution
Enter the data into a destructive degradation folio in Weibull++. Select Linear under the Degradation Model and
2P-Weibull under the Measurement Distribution. Set the critical degradation to 150. Click Calculate.
3. Using the QCP, the reliability at 5 years (in other words, the probability that the degradation measurement will
still be greater than 150 at 5 years) is:
Chapter 22
Reliability Test Design
Cumulative Binomial
This methodology requires the use of the cumulative binomial distribution in addition to the assumed distribution of
the product's lifetimes. Not only does the life distribution of the product need to be assumed beforehand, but a
reasonable assumption of the distribution's shape parameter must be provided as well. Additional information that
must be supplied includes: a) the reliability to be demonstrated, b) the confidence level at which the demonstration
takes place, c) the acceptable number of failures and d) either the number of available units or the amount of
available test time. The output of this analysis can be the amount of time required to test the available units or the
required number of units that need to be tested during the available test time. Usually the engineer designing the test
will have to study the financial trade-offs between the number of units and the amount of test time needed to
demonstrate the desired goal. In cases like this, it is useful to have a "carpet plot" that shows the possibilities of how
a certain specification can be met.
where:
• t_DEMO is the time at which the demonstrated reliability is specified.
• β is the shape parameter.
• η is the scale parameter.
Since the required inputs to the process include R_DEMO, t_DEMO and β, the value of the scale parameter η can
be backed out of the reliability equation of the assumed distribution, and will be used in the calculation of another
reliability value, R_TEST, which is the reliability that is going to be incorporated into the actual test calculation.
How this calculation is performed depends on whether one is attempting to solve for the number of units to be tested
in an available amount of time, or attempting to determine how long to test an available number of test units.
Determining Units for Available Test Time
If one knows that the test is to last a certain amount of time, t_TEST, the number of units that must be tested to
demonstrate the specification must be determined. The first step in accomplishing this involves calculating the
R_TEST value.
This should be a simple procedure since:
R_TEST = R(t = t_TEST)
and t_TEST, β and η are already known, so it is just a matter of plugging these values into the appropriate
reliability equation.
We now incorporate a form of the cumulative binomial distribution in order to solve for the required number of
units. This form of the cumulative binomial appears as:
1 − CL = Σ_{i=0}^{f} C(n, i)·(1 − R_TEST)^i·(R_TEST)^(n−i)
where C(n, i) is the binomial coefficient and:
• CL = the required confidence level
• f = the allowable number of failures
• n = the total number of units on test
• R_TEST = the reliability on test
Since CL and f are required inputs to the process and R_TEST has already been calculated, it merely remains to
solve the cumulative binomial equation for n, the number of units that need to be tested.
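A minimal Python sketch of this step: search for the smallest n that satisfies the cumulative binomial equation. The CL, f and R_TEST values in the usage line are illustrative:

from math import comb

def required_units(CL, f, R_test):
    # Smallest n with sum_{i=0..f} C(n,i)*(1-R_test)^i*R_test^(n-i) <= 1 - CL
    n = f + 1
    while sum(comb(n, i) * (1 - R_test) ** i * R_test ** (n - i)
              for i in range(f + 1)) > 1 - CL:
        n += 1
    return n

print(required_units(CL=0.95, f=0, R_test=0.9656))   # e.g., 86 units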
Determining Test Time for Available Units
The way that one determines the test time for the available number of test units is quite similar to the process
described previously. In this case, one knows beforehand the number of units, n, the number of allowable failures,
f, and the confidence level, CL. With this information, the next step involves solving the binomial equation for
R_TEST. With this value known, one can use the appropriate reliability equation to back out the value of t_TEST,
since R_DEMO, t_DEMO, β and η have already been calculated or specified.
Example
In this example, we will use the parametric binomial method to design a test to demonstrate a reliability of 90% at
a given time, t_DEMO, with a 95% confidence if no failures occur during the test (f = 0). We will assume a Weibull
distribution with a known shape parameter, β.
Determining Units for Available Test Time
In the above scenario, we know that we have the testing facilities available for a fixed number of hours, t_TEST.
We must now determine the number of units to test for this amount of time with no failures in order to have
demonstrated our reliability goal. The first step is to determine the Weibull scale parameter, η. The Weibull
reliability equation is:
R = e^(−(t/η)^β)
Since we know the values of t_DEMO, R_DEMO and β, we can substitute these in the equation and solve for η:
The last step is to substitute the appropriate values into the cumulative binomial equation, which for the Weibull
distribution appears as:
1 − CL = Σ_{i=0}^{f} C(n, i)·(1 − e^(−(t_TEST/η)^β))^i·(e^(−(t_TEST/η)^β))^(n−i)
The values of CL, t_TEST, β, f and η have already been calculated or specified, so it merely remains to solve the
equation for n. The resulting fractional value must be rounded up to the next integer to give the required number of
units. This example solved in Weibull++ is shown next.
Example
In this example, we will use the parametric binomial method to design a test that will demonstrate a specified
MTTF with a 95% confidence if no failures occur during the test (f = 0). We will assume a Weibull distribution
with a known shape parameter, β. We want to determine the number of units to test for an available test time,
t_TEST, to demonstrate this goal.
The first step in this case involves determining the value of the scale parameter η from the MTTF equation. The
equation for the MTTF for the Weibull distribution is:
MTTF = η·Γ(1 + 1/β)
where Γ(x) is the gamma function of x. Since the MTTF and β have been specified, it is a relatively simple matter
to calculate η. From this point on, the procedure is the same as in the reliability demonstration example. Next, the
value of R_TEST is calculated as:
R_TEST = e^(−(t_TEST/η)^β)
The last step is to substitute the appropriate values into the cumulative binomial equation. The values of CL,
t_TEST, β, f and η have already been calculated or specified, so it merely remains to solve the binomial equation
for n. The resulting fractional value must be rounded up to the next integer to give the required number of units.
This example solved in Weibull++ is shown next.
The procedure for determining the required test time proceeds in the same manner: determine R_TEST from the
cumulative binomial equation, and then follow the previously described methodology to determine t_TEST from
the Weibull reliability equation.
Non-Parametric Binomial
The binomial equation can also be used for non-parametric demonstration test design. There is no time value
associated with this methodology, so one must assume that the demonstrated reliability value is associated with the
amount of time for which the units were tested.
In other words, in cases where the available test time is equal to the demonstration time, the following
non-parametric binomial equation is widely used in practice:
1 − CL = Σ_{i=0}^{f} C(n, i)·(1 − R)^i·R^(n−i)
where CL is the confidence level, f is the number of failures, n is the sample size and R is the demonstrated
reliability. Given any three of them, the remaining one can be solved for.
Non-parametric demonstration test design is also often used for one-shot devices where the reliability is not related
to time. In this case, R(t) can simply be written as R.
Example
A reliability engineer wants to design a zero-failure demonstration test in order to demonstrate a reliability of 80% at
a 90% confidence level. Use the non-parametric binomial method to determine the required sample size.
Solution
By substituting f = 0 (since it is a zero-failure test), the non-parametric binomial equation becomes:
1 − CL = R^n
So now the required sample size can be easily solved for any required reliability and confidence level. The result of
this test design was obtained using Weibull++ and is:
The result shows that 11 samples are needed. Note that the time value shown in the above figure is only indicative
and not part of the test design (the "Test time per unit" that was input will be the same as the "Demonstrated at
time" value for the results). If those 11 samples are run for the required demonstration time and no failures are
observed, then a reliability of 80% with a 90% confidence level has been demonstrated. If the reliability of the
system is less than or equal to 80%, the chance of passing this test is 1 − CL = 0.1, which is the Type II error.
Therefore, the non-parametric binomial equation determines the sample size by controlling for the Type II error.
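Because the zero-failure case collapses to 1 − CL = R^n, the required sample size is a one-line calculation; a minimal Python sketch:

from math import ceil, log

CL, R = 0.90, 0.80
n = ceil(log(1 - CL) / log(R))   # smallest n with R^n <= 1 - CL
print(n)                         # 11 samples, matching the result above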
If 11 samples are used and one failure is observed by the end of the test, then the demonstrated reliability will be less
than required. The demonstrated reliability is 68.98% as shown below.
where n is the number of units on test and t_TEST is the test time. The chi-squared equation for total test time is:
T = MTTF·χ²(CL; 2f + 2)/2
where:
• χ²(CL; 2f + 2) = the chi-squared value for confidence level CL with 2f + 2 degrees of freedom
• f = the allowable number of failures
For the exponential distribution, the MTTF can be backed out of the reliability requirement, MTTF =
−t_DEMO/ln(R_DEMO), and substituted into the chi-squared equation for developing a test that demonstrates
reliability at a given time, rather than the MTTF directly:
Example
In this example, we will use the exponential chi-squared method to design a test that will demonstrate a reliability
of 85% at t_DEMO = 500 hours with a 90% confidence (or α = 0.1) if no more than 2 failures occur during the test
(f = 2). The chi-squared value can be determined from tables or the Quick Statistical Reference (QSR) tool in
Weibull++. In this example, the value is calculated as:
χ²(0.9; 6) = 10.6446
Substituting this value, together with MTTF = −500/ln(0.85) ≈ 3,076.6 hours, gives a total test time of about
16,374 hours. This means that 16,374 hours of total test time needs to be accumulated with no more than two
failures in order to demonstrate the specified reliability.
This example solved in Weibull++ is shown next.
Given the test time, one can now solve for the number of units using the chi-squared equation. Similarly, if the
number of units is given, one can determine the test time from the chi-squared equation for exponential test design.
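A minimal Python sketch of the exponential chi-squared design from the example above: the MTTF is backed out of the requirement R(500) = 0.85 and substituted into T = MTTF·χ²(CL; 2f + 2)/2:

from math import log
from scipy.stats import chi2

CL, f, t_demo, R_demo = 0.90, 2, 500.0, 0.85
mttf = -t_demo / log(R_demo)                  # exponential: R(t) = exp(-t/MTTF)
T = mttf * chi2.ppf(CL, 2 * f + 2) / 2.0
print(f"MTTF = {mttf:.1f} hr, total test time = {T:,.0f} hr")   # about 16,374 hr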
where Beta(·) is the incomplete beta function, and α₀ and β₀ are the parameters of the prior beta distribution of the
reliability. If α₀ and β₀ are known, then any one of the quantities of interest (CL, R, n and r) can be calculated
from the remaining three. The next two examples demonstrate how to calculate α₀ and β₀ depending on the type of
prior information available.
These approximate values of the expected value and variance of the prior system reliability can then be used to
estimate the values of α₀ and β₀, assuming that the prior reliability is a beta-distributed random variable. The
values of α₀ and β₀ are calculated as:
α₀ = E(R)·[E(R)·(1 − E(R))/Var(R) − 1]
β₀ = (1 − E(R))·[E(R)·(1 − E(R))/Var(R) − 1]
With α₀ and β₀ known, the above beta distribution equation can now be used to calculate a quantity of interest.
Example
You can use the non-parametric Bayesian method to design a test using prior knowledge about a system's reliability.
For example, suppose you wanted to know the reliability of a system and you had the following prior knowledge of
the system:
• Lowest possible reliability: a = 0.8
• Most likely reliability: b = 0.85
• Highest possible reliability: c = 0.97
This information can be used to approximate the expected value and the variance of the prior system reliability.
These approximations of the expected value and variance of the prior system reliability can then be used to
estimate the values of α₀ and β₀ used in the beta distribution for the system reliability, as given next:
With α₀ and β₀ known, any single value of the four quantities system reliability R, confidence level CL, number of
units n, or number of failures r can be calculated from the other three using the beta distribution function:
Finally, from this posterior distribution, the system reliability R at a confidence level of CL=0.9 is solved as:
Similarly, from this posterior distribution, the corresponding confidence level for reliability R = 0.85 is:
For each subsystem i, from the beta distribution, we can calculate the expected value and the variance of the
subsystem's reliability R_i, as discussed in Guo [38]:
Assuming that all the subsystems are in a series reliability-wise configuration, the expected value and variance of
the system's reliability R_s can then be calculated as per Guo [38]:
E(R_s) = Π E(R_i)
Var(R_s) = Π [Var(R_i) + E(R_i)²] − [Π E(R_i)]²
With the above prior information on the expected value and variance of the system reliability, all the calculations
can now be performed as before.
Example
You can use the non-parametric Bayesian method to design a test for a system using information from tests on its
subsystems. For example, suppose a system of interest is composed of three subsystems A, B and C -- with prior
information from tests of these subsystems given in the table below.
Subsystem Number of Units Tested Number of Failures
A 20 0
B 30 1
C 100 4
This data can be used to calculate the expected value and variance of the reliability for each subsystem.
Subsystem Expected Value of Reliability Variance
A 0.952380952 0.002061
B 0.935483871 0.001886
C 0.950495050 0.000461
These values can then be used to find the prior system reliability and its variance:
From the above two values, the parameters of the prior distribution of the system reliability can be calculated by:
With this prior distribution, we now can design a system reliability demonstration test by calculating system
reliability R, confidence level CL, number of units n or number of failures r, as needed.
Solve for Sample Size n
Given the above subsystem test information, in order to demonstrate the system reliability of 0.9 at a confidence
level of 0.8, how many samples are needed in the test? Assume the allowed number of failures is 1.
Using Weibull++, the results are given in the figure below. The result shows that at least 49 test units are needed.
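A minimal Python sketch of this subsystem-based design, assuming the standard beta-binomial update in which the posterior is Beta(α₀ + n − r, β₀ + r) and CL = 1 − F_Beta(R_target). It reproduces the subsystem moments above and searches for the required sample size:

from scipy.stats import beta as beta_dist

subsystems = [(20, 0), (30, 1), (100, 4)]     # (units tested, failures)

E_sys, E2_sys = 1.0, 1.0
for n_i, r_i in subsystems:
    e = (n_i - r_i) / (n_i + 1.0)                                     # E(R_i)
    v = (n_i - r_i) * (r_i + 1.0) / ((n_i + 1.0) ** 2 * (n_i + 2.0))  # Var(R_i)
    E_sys *= e                    # series system: E(R_s) is the product of E(R_i)
    E2_sys *= v + e * e           # and E(R_s^2) is the product of E(R_i^2)
V_sys = E2_sys - E_sys ** 2

k = E_sys * (1 - E_sys) / V_sys - 1           # moment matching for the beta prior
alpha0, beta0 = E_sys * k, (1 - E_sys) * k

R_target, CL_target, r = 0.9, 0.8, 1
n = r
while beta_dist.sf(R_target, alpha0 + n - r, beta0 + r) < CL_target:
    n += 1
print(n)                                      # about 49 test units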
where:
• CL = the required confidence level
• r = the number of failures
• n = the total number of units on test
• R = the reliability on test
If CL, r and n are given, the R value can be solved from the above equation. When CL = 0.5, the solved R (or Q,
the probability of failure, whose value is 1 − R) is the so-called median rank for the corresponding failure. (For
more information on median ranks, please see Parameter Estimation.)
For example, given n = 4, r = 2 and CL = 0.5, the calculated Q is 0.385728. This means, at the time when the second
failure occurs, the estimated system probability of failure is 0.385728. The median rank can be calculated in
Weibull++ using the Quick Statistical Reference, as shown below:
Similarly, if we set r = 3 for the above example, we can get the probability of failure at the time when the third
failure occurs. Using the estimated median rank for each failure and the assumed underlying failure distribution, we
can calculate the expected time for each failure. Assuming the failure distribution is Weibull, we know that:
Q = 1 − e^(−(t/η)^β)
where:
• β is the shape parameter
• η is the scale parameter
Using the above equation, for a given Q, we can get the corresponding time t. The above calculation gives the
median of each failure time for CL = 0.5. If we set CL at different values, the confidence bounds of each failure time
can be obtained. For the above example, if we set CL=0.9, from the calculated Q we can get the upper bound of the
time for each failure. The calculated Q is given in the next figure:
If we set CL=0.1, from the calculated Q we can get the lower bound of the time for each failure. The calculated Q is
given in the figure below:
Example
In this example you will use the Expected Failure Times plot to estimate the duration of a planned reliability test. 4
units were allocated for the test, and the test engineers want to know how long the test will last if all the units are
tested to failure. Based on previous experiments, they assume the underlying failure distribution is a Weibull
distribution with β = 2 and η = 500.
Solution
Using Weibull++'s Expected Failure Times plot, the expected failure times with 80% 2-sided confidence bounds are
given below.
From the above results, we can see the upper bound of the last failure is about 955 hours. Therefore, the test
probably will last for around 955 hours.
As we know, with 4 samples, the median rank for the second failure is 0.385728. Using this value and the assumed
Weibull distribution, the median value of the failure time of the second failure is calculated as:
t = η·(−ln(1 − Q))^(1/β) = 500·(−ln(1 − 0.385728))^(1/2) ≈ 349 hours
Its bounds and the other failure times can be calculated in a similar way.
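A minimal Python sketch of this logic: the rank of the j-th failure out of n at confidence level CL is the CL percentile of a Beta(j, n − j + 1) distribution, and the Weibull quantile converts the rank to a time. It reproduces the 0.385728 median rank, the roughly 349-hour median second failure and the roughly 955-hour upper bound on the last failure:

from math import log
from scipy.stats import beta as beta_dist

def failure_time(j, n, CL, beta_w, eta):
    q = beta_dist.ppf(CL, j, n - j + 1)             # rank (unreliability) at CL
    t = eta * (-log(1.0 - q)) ** (1.0 / beta_w)     # invert Q = 1 - exp(-(t/eta)^beta)
    return q, t

print(failure_time(2, 4, 0.5, 2, 500))   # (~0.385728, ~349 hr): median, 2nd of 4
print(failure_time(4, 4, 0.9, 2, 500))   # (~0.974, ~955 hr): upper bound, last failure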
Example
In this example, you will use the Difference Detection Matrix to choose the suitable sample size and duration for a
reliability test. Assume that there are two design options for a new product. The engineers need to design a test that
compares the reliability performance of these two options. The reliability for both designs is assumed to follow a
Weibull distribution. For Design 1, the shape parameter is β = 3; for Design 2, the shape parameter is β = 2. Their
B10 lives may range from 500 to 3,000 hours.
Solution
For the initial setup, set the sample size for each design to 20, and use two test durations of 3,000 and 5,000 hours.
The following picture shows the complete control panel setup and the results of the analysis.
The columns in the matrix show the range of the assumed B10 life for design 1, while the rows show the range for
design 2. A value of 0 means the difference cannot be detected through the test, 1 means the difference can be
detected if the test duration is 5,000 hours, and 2 means the difference can be detected if the test duration is 3,000
hours. For example, the value is 2 for cell (1000, 2000). This means that if the B10 life for Design 1 is 1,000 hours
and the B10 life for Design 2 is 2,000 hours, the difference can be detected if the test duration is at least 3,000
hours.
Click inside the cell to show the estimated confidence intervals, as shown next. By testing 20 samples each for 3,000
hours, the difference of their B10 lives probably can be detected. This is because, at a confidence level of 90%, the
estimated confidence intervals on the B10 life do not overlap.
We will use Design 1 to illustrate how the interval is calculated. For cell (1000, 2000), Design 1's B10 life is 1,000
hours and the assumed β is 3. We can calculate the η of the Weibull distribution using the Quick Parameter
Estimator tool, as shown next.
The estimated η is 2117.2592 hours. We can then use these distribution parameters and the sample size of 20 to
obtain the expected failure times using Weibull++'s Expected Failure Times Plot. The following report shows the
result from that utility.
The median failure times are used to estimate the failure distribution. Note that since the test duration is set to 3,000
hours, any failures that occur after 3,000 hours are treated as suspensions. In this case, the last failure is a suspension with a
suspension time of 3,000 hours. We can enter the median failure times data set into a standard Weibull++ folio as
given in the next figure.
After analyzing the data set with the MLE and FM analysis options, we can now calculate the B10 life and its
interval in the QCP, as shown next.
From this result, we can see that the estimated B10 life and its confidence intervals are the same as the results
displayed in the Difference Detection Matrix.
The above procedure can be repeated to get the results for the other cells and for Design 2. Therefore, by adjusting
the sample size and test duration, a suitable test time can be identified for detecting a certain amount of difference
between two designs/populations.
Simulation
Monte Carlo simulation provides another useful tool for test design. The SimuMatic utility in Weibull++ can be used
for this purpose. SimuMatic simulates the outcome of a particular test design that is intended to demonstrate a
target reliability. You can specify various factors of the design, such as the test duration (for a time-terminated test),
number of failures (for a failure-terminated test) and sample size. By running the simulations you can assess whether
the planned test design can achieve the reliability target. Depending on the results, you can modify the design by
adjusting these factors and repeating the simulation process—in effect, simulating a modified test design—until you
arrive at a modified design that is capable of demonstrating the target reliability within the available time and sample
size constraints.
Of course, all the design factors mentioned in SimuMatic also can be calculated using analytical methods as
discussed in previous sections. However, all of the analytical methods need assumptions. When sample size is small
or test duration is short, these assumptions may not be accurate enough. The simulation method usually does not
require any assumptions. For example, the confidence bounds of reliability from SimuMatic are purely based on
simulation results. In analytical methods, both Fisher bounds and likelihood ratio bounds need to use assumptions.
Another advantage of using the simulation method is that it is straightforward and results can be visually displayed
in SimuMatic.
For details, see the Weibull++ SimuMatic chapter.
Chapter 23
Stress-Strength Analysis
Stress-strength analysis has long been used in mechanical component design. The probability of failure is based on
the probability of stress exceeding strength. The following equation is used to calculate the expected probability of
failure, F:
F = P(Stress ≥ Strength) = ∫₀^∞ f_Strength(x)·R_Stress(x) dx
where f_Strength is the pdf of the strength distribution and R_Stress is the reliability (survival) function of the
stress distribution.
The equations above assume that both stress and strength are in the positive domain. For general cases, the expected
reliability can be calculated using the following equation:
where L and U are the lower and upper limits of the stress range.
When U = infinity and L = 0, this equation becomes equal to the previous equation (i.e., the equation for the
expected reliability R).
The variances of the estimated stress and strength distribution parameters can be obtained from the Fisher
Information Matrix. For details, please see Confidence Bounds.
Once the variance of the expected reliability is obtained, the two-sided confidence intervals can be calculated using:
where:
CL is the confidence level
α is 1 − CL
We know that the calculated reliability is the expected value of the probability that a strength value is larger than a
stress value. Since both strength and stress are random variables from their distributions, the reliability is also a
random variable. This can be explained using the following example. Let's first assume that stress is a fixed value of
567. The reliability then is:
R(567) = P(Strength > 567)
This is the reliability when the stress value is 567 and when the strength distribution is given. If stress is not a fixed
value (i.e., it follows a distribution instead), then it can take values other than 567. For instance, if stress takes a
value of 700, then we get another reliability value, R(700). Since stress is a random variable, for any stress value
s_i, there is a reliability value R(s_i) calculated from the strength distribution. We will end up with many R(s_i)
values. From these values, we can get the mean and variance of the reliability. In fact, its mean is the result given
by the expected reliability equation shown above.
Example 1
Assume that we are going to use stress-strength analysis to estimate the reliability of a component used in a vehicle.
The stress is the usage mileage distribution and the strength is the miles-to-failure distribution of the component. The
warranty is 1 year or 15,000 miles, whichever is earlier. The goal is to estimate the reliability of the component
within the warranty period (1 year/15,000 miles).
The following table gives the data for the mileage distribution per year (stress):
10096 12405
10469 12527
10955 12536
11183 12595
11391 12657
11486 13777
11534 13862
11919 13971
12105 14032
12141 14138
The following table gives the data for the miles-to-failure distribution (strength):
13507 16125
13793 16320
13943 16327
14017 16349
14147 16406
14351 16501
14376 16611
14595 16625
14746 16670
14810 16749
14940 16793
14951 16862
15104 16930
15218 16948
15303 17024
15311 17041
15480 17263
15496 17347
15522 17430
15547 17805
15570 17884
15975 18549
16003 18575
16018 18813
16052 18944
Solution
First, estimate the stress and strength distributions using the given data. Enter the stress and strength data into two
separate data sheets and analyze each data sheet using the lognormal distribution and MLE analysis method. The
parameters of the stress distribution are estimated to be log-mean = 9.411844 and log-std = 0.098741.
The parameters of the strength distribution are estimated to be log-mean = 9.681503 and log-std = 0.083494.
Next, open the Stress-Strength tool and choose to compare the two data sheets. The following picture shows the pdf
curves of the two data sets:
Since the warranty is 1 year/15,000 miles, all the vehicles with mileage larger than 15,000 should not be considered
in the calculation. To do this, go to the Setup page of the control panel and select the Override auto-calculated
limits check box. Set the value of the upper limit to 15,000 as shown next.
Recalculate the results. The estimated reliability for vehicles less than 15,000 miles per year is 98.84%. The
associated confidence bounds are estimated from the variance of the distribution parameters. With larger samples for
the stress and strength data, the width of the bounds will be narrower.
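A minimal Python sketch of this calculation by direct numerical integration, assuming the result is renormalized over the truncated stress range (an assumption about how the bounded form of the equation is applied). It lands close to the reported 98.84%:

from math import log
from scipy.integrate import quad
from scipy.stats import norm

mu_x, sd_x = 9.411844, 0.098741    # stress: log-mean, log-std (miles per year)
mu_y, sd_y = 9.681503, 0.083494    # strength: log-mean, log-std (miles-to-failure)

def f_stress(x):                   # lognormal pdf of the stress distribution
    return norm.pdf((log(x) - mu_x) / sd_x) / (x * sd_x)

def R_strength(x):                 # P(strength > x)
    return norm.sf((log(x) - mu_y) / sd_y)

L, U = 1.0, 15000.0                # lower limit kept above 0 for the log
num, _ = quad(lambda x: f_stress(x) * R_strength(x), L, U)
den, _ = quad(f_stress, L, U)
print(f"R = {num / den:.4f}")      # about 0.988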
The stress distribution is usually estimated from customer usage data, such as the mileage per year of a passenger car
or the load distribution for a beam. The strength distribution, on the other hand, is affected by the material used in
the component, the geometric dimensions and the manufacturing process.
Because the stress distribution can be estimated from customer usage data, we can assume that it is known.
Therefore, for a given reliability goal, the strength distribution is the only unknown in the given equation.
The factors that affect the strength distribution can be adjusted to obtain a strength distribution that meets the
reliability goal. The following example shows how to use the Target Reliability Parameter Estimator tool in the
stress-strength folio to obtain the parameters for a strength distribution that will meet a specified reliability goal.
Example 2
Assume that the stress distribution for a component is known to be a Weibull distribution with beta = 3 and eta =
2000. For the current design, the strength distribution is also a Weibull distribution with beta = 1.5 and eta = 4000.
Evaluate the current reliability of the component. If the reliability does not meet the target reliability of 90%,
determine what parameters would be required for the strength distribution in order to meet the specified target.
Solution
The following picture shows the stress-strength tool and the calculated reliability of the current design.
The result shows that the current reliability is about 74.0543%, which is below the target value of 90%. We need to
use the Target Reliability Parameter Estimator to determine the parameters for the strength distribution that, when
compared against the stress distribution, would result in the target reliability.
The following picture shows the Target Reliability Parameter Estimator window. In the Strength Parameters area,
select eta. Set the Target Reliability to 90% and click Calculate. The calculated eta is 8192.2385 hours.
Click Update to perform the stress-strength analysis again using the altered parameters for the strength distribution.
The following plot shows that the calculated reliability is 90%. Therefore, in order to meet the reliability
requirement, the component must be redesigned such that the eta parameter of the strength distribution is at least
8192.2385 hours.
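A minimal Python sketch of what this estimation amounts to: numerically integrate the stress-strength reliability for Weibull stress and strength, then root-find the strength eta that meets the 90% target:

from math import exp
from scipy.integrate import quad
from scipy.optimize import brentq

b_s, e_s, b_g = 3.0, 2000.0, 1.5        # stress beta/eta, strength beta

def reliability(eta_g):
    # R = integral of f_stress(x) * R_strength(x) dx
    integrand = lambda x: (b_s / e_s) * (x / e_s) ** (b_s - 1) \
                          * exp(-(x / e_s) ** b_s) * exp(-(x / eta_g) ** b_g)
    return quad(integrand, 0.0, 20.0 * e_s)[0]

print(f"current R = {reliability(4000.0):.4f}")          # about 0.74
eta_req = brentq(lambda e: reliability(e) - 0.90, 4000.0, 1.0e6)
print(f"required strength eta = {eta_req:.1f} hr")       # about 8192 hr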
Comparing Life Data Sets
Simple Plotting
One popular graphical method for making this determination involves plotting the data with confidence bounds and
seeing whether the bounds overlap or separate at the point of interest. This can be easily done using the Overlay Plot
feature in Weibull++. This approach can be effective for comparisons at a given point in time or a given reliability
level, but it is difficult to assess the overall behavior of the two distributions because the confidence bounds may
overlap at some points and be far apart at others.
Contour Plots
To determine whether two data sets are significantly different and at what confidence level, one can utilize the
contour plots provided in Weibull++. By overlaying two contour plots from two different data sets at the same
confidence level, one can visually assess whether the data sets are significantly different at that confidence level if
there is no overlap on the contours. The disadvantage of this method is that the same distribution must be fitted to
both data sets.
Example: Using a Contour Plot to Compare Two Designs
The design of a product was modified to improve its reliability. The reliability engineers want to determine whether
the improvements to the design have significantly improved the product's reliability. The following data sets
represent the times-to-failure for the product. At what significance level can the engineers claim that the two designs
are different?
The data sets are entered into separate Weibull++ standard folio data sheets, and then analyzed with the
two-parameter Weibull distribution and the maximum likelihood estimation (MLE) method. The following figure
shows the contour plots of the data sets superimposed in an overlay plot. This plot is configured to show the contour
lines that represent the 90% and 95% confidence levels.
As you can see, the contours overlap at the 95% confidence level (outer rings), but there is no overlap at the 90%
confidence level (inner rings). We can then conclude that there is a statistically significant difference between the
data sets at the 90% confidence level. If we wanted to know the exact confidence level (i.e., critical confidence level)
at which the two contour plots meet, we would have to incrementally raise the confidence level from 90% until the
two contour lines meet.
Weibull++ includes a utility for automatically obtaining the critical confidence level. For two contour plots that are
superimposed in an overlay plot, the Plot Critical Level check box will be available in the Contours Setup window,
as shown next.
The plot critical level is the confidence level at which the contour plots of the two data sets meet at a single point.
This is the minimum confidence level at which the contour lines of the two different data sets overlap. At any
confidence level below this minimum confidence level, the contour lines of the two data sets will not overlap and
there will be a statistically significant difference between the two populations at that level. For the two data sets in
this example, the critical confidence level is 94.243%. This value will be displayed in the Legend area of the plot.
Note that due to the calculation resolution and plot precision, the contour lines at the calculated critical level may
appear to overlap or have a gap.
where f(t) is the pdf of the first distribution and R(t) is the reliability function of the second distribution. The
evaluation of the superior data set is based on whether this probability is smaller or greater than 0.5. If the
probability is equal to 0.5, then it is equivalent to saying that the two distributions are identical.
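A minimal Python sketch of this comparison for two Weibull data sets, computing P(X ≥ Y) = ∫ f_Y(t)·R_X(t) dt by numerical integration; the parameter values are illustrative:

from math import exp
from scipy.integrate import quad

def weibull_pdf(t, b, e):
    return (b / e) * (t / e) ** (b - 1) * exp(-(t / e) ** b)

def weibull_rel(t, b, e):
    return exp(-(t / e) ** b)

bx, ex = 2.0, 1500.0    # hypothetical design X (new design)
by, ey = 2.0, 1000.0    # hypothetical design Y (old design)

p, _ = quad(lambda t: weibull_pdf(t, by, ey) * weibull_rel(t, bx, ex), 0.0, 5.0e4)
print(f"P(X >= Y) = {p:.4f}")    # greater than 0.5 suggests X tends to outlive Y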
Sometimes we may need to compare the lives when one of the distributions is truncated. For example, if the random
variable from the first distribution is truncated to a range of [L, U], then the comparison with the truncated
distribution should be used. For details, please see Stress-Strength Analysis.
Consider two product designs where X and Y represent the life test data from two different populations. If we simply
wanted to determine which component has a higher reliability, we would simply compare the reliability estimates of
both components at a time t. But if we wanted to determine which product will have a longer life, we would want to
calculate the probability that the distribution of one product is better than the other. Using the equation given above,
the probability that X is greater than or equal to Y can be interpreted as follows:
• If P(X ≥ Y) = 0.5, then the lives of both X and Y are equal.
The following table shows the measurements. Determine the probability that D will fall out of specifications.
Solution
In a Weibull++ standard folio, enter the parts dimensions measurements of each component into separate data sheets.
Analyze each data sheet using the normal distribution and the RRX analysis method. The parameters are:
Next, perform a Monte Carlo simulation to estimate the probability that (A+B+C) will be greater than D. To do this,
choose the User Defined distribution and enter its equation as follows. (Click the Insert Data Source button to
insert the data sheets that contain the measurements for the components.)
On the Settings tab, set the number of data points to 100, as shown next.
Click Generate to create a data sheet that contains the generated data points. Rename the new data sheet to
"Simulated A+B+C."
Follow the same procedure to generate 100 data points to represent the D measurements. Rename the new data sheet
to "Simulated D."
Analyze the two data sets, "Simulated A+B+C" and "Simulated D," using the normal distribution and the RRX
analysis method.
Next, open the Life Comparison tool and choose to compare the two data sheets. The following picture shows the pdf
curves of the two data sets.
The following report shows that the probability that "Simulated A+B+C" will be greater than "Simulated D" is
16.033%. (Note that the results may vary because of the randomness in the simulation.)
Weibull++ SimuMatic
Reliability analysis using simulation, in which reliability analyses are performed a large number of times on data sets
that have been created using Monte Carlo simulation, can be a valuable tool for reliability practitioners. Such
simulation analyses can assist the analyst to a) better understand life data analysis concepts, b) experiment with the
influences of sample sizes and censoring schemes on analysis methods, c) construct simulation-based confidence
intervals, d) better understand the concepts behind confidence intervals and e) design reliability tests. This section
explores some of the results that can be obtained from simulation analyses using the Weibull++ SimuMatic tool.
The results clearly demonstrate that the median RRX estimate provides the least deviation from the truth for this
sample size and data type. However, the MLE outputs are grouped more closely together, as evidenced by the
confidence bounds. The same figures also show the simulation-based bounds, as well as the expected variation due
to sampling error.
This experiment can be repeated in SimuMatic using multiple censoring schemes (including Type I and Type II right
censoring as well as random censoring) with the included distributions. We can perform multiple experiments with
this utility to evaluate our assumptions about the appropriate parameter estimation method to use for the data set.
Again, we specify Type II censoring at 100 hr in SimuMatic, and we repeat the simulation with the same
parameters as before. The simulation results in this case yield an expected test duration of 100 hr and a
demonstrated time of 17 hr at the stated requirements. This result is also above our requirement. The next figure
graphically shows the results of this experiment. This process can then be repeated using different sample sizes and
censoring schemes until we arrive at a desirable test plan.
where a and b are parameters fitted to market share data, and R is the product reliability.
The function for unit sale price is given by:
where a and b are parameters fitted to data, and R is the product reliability.
As a function of reliability, R, sales revenue is then calculated as:
Sales Revenue (R) = Maximum Market Potential x Market Share (R) x Unit Price (R)
Once the total market value and the sales revenue are obtained, they can then be used to calculate the lost sales cost
using the formula given at the beginning of this section.
Production Cost
Production cost is a function of total market value, market share and manufacturing cost per unit. The function for
production cost per unit is given by:
where a and b are parameters fitted to data, and R is the product reliability.
The parameters a and b can be determined using simple regression tools, such as the functions available in the
Degradation Data Analysis folio in Weibull++.
Warranty Cost
Warranty cost is a function of total market value, market share, reliability and cost per failure. The function of cost
per failure is given by:
where a and b are parameters fitted to data. For a given reliability value, R, the warranty cost is given by:
Warranty Cost (R) = Maximum Market Potential x Market Share (R) x (1 - R) x Cost Per Failure (R)
Unreliability Cost
The sum of the Lost Sales Cost and Warranty Cost is called Unreliability Cost.
Total Cost
For a given reliability, R, the expected total cost is given by:
Total Cost (R) = Lost Sales Cost (R) + Warranty Cost (R) + Production Cost (R) = Unreliability Cost (R) +
Production Cost (R)
The production cost is a pre-shipping cost, whereas the warranty and lost sales costs are incurred after a product is
shipped. These pre- and post-shipping costs can be seen in the figure below.
The reliability value resulting in the lowest total cost will be the target reliability for the product.
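A minimal Python sketch of this trade-off. The model forms and parameter values below are assumptions for illustration only (the reference fits its own a and b parameters to data), and the lost sales cost is taken here as the revenue ceded relative to the maximum achievable market share:

import numpy as np

M = 1_000_000                                   # maximum market potential, units
share  = lambda R: 0.2 + 0.6 * R                # hypothetical market share model
price  = lambda R: 1.0 + 1.2 * R                # hypothetical unit sale price, $
prod   = lambda R: 0.5 + 0.02 / (1 - R)         # cost/unit rises sharply near R = 1
c_fail = 8.0                                    # hypothetical other cost per failure, $

R = np.linspace(0.85, 0.995, 400)
lost_sales = M * (share(R.max()) - share(R)) * price(R)   # revenue ceded to rivals
warranty   = M * share(R) * (1 - R) * c_fail
production = M * share(R) * prod(R)
total = lost_sales + warranty + production
print(f"target reliability ~ {R[np.argmin(total)]:.3f}")  # interior cost minimum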
Traditional ROI
First, consider that traditional Return On Investment (ROI) is a performance measure used to evaluate the efficiency
of an investment, or to compare the efficiency of a number of different investments. In general to calculate ROI, the
benefit (return) of an investment is divided by the cost of the investment; and the result is expressed as a percentage
or a ratio. The following equation illustrates this.
In this formula, Gains from Investment refers to the revenue or proceeds obtained due to the investment of interest.
Return on investment is a very popular metric because of its versatility and simplicity. That is, if an investment
does not have a positive ROI, or if there are other opportunities with a higher ROI, then the investment should not
be undertaken. Reliability ROI is computed in a similar manner, treating the expenditure on improving reliability
as the investment.
These five inputs are then repeated for three specific cases: Best Case, Most Likely and Worst Case.
Based on the above inputs, four models are then fitted as functions of reliability, R:
An additional variable needed is maximum market potential, M. It is defined by users in the following input box:
All the related costs are defined as given in the previous section and calculated as a function of reliability. The value
giving the lowest total cost is the optimal (target) reliability.
Example
Determining Reliability Based on Costs
The following table provides information for a particular product regarding market share, sales prices, cost of
production and costs due to failure.
The first row in the table indicates the probability of failure during the warranty period. For example, in the best case
scenario, the expected probability of failure will be 1% (i.e., the reliability will be 99%). Under this reliability, the
expected market share is 80%, the average unit sale price is $2.10, the average cost per unit to produce is $1.50
and the other costs per failure are $0.50.
The assumed maximum market potential is 1,000,000 units and the initial investment is $10,000.
Solution
Enter the given information in the Target Reliability tool in Weibull++ and click the Plot icon on the control panel.
Next, click the Analysis Details button on the control panel to generate a report of the analysis. The following
report shows the results.
The following figure shows the Cost vs. Reliability plot of the cost models. The green vertical line on the plot
represents the estimated reliability value that will minimize costs. In this example, this reliability value is estimated
to be 96.7% at the end of the warranty period.
The following figures show the Profit vs. Reliability plot and the R3OI vs. Reliability plot. In the R3OI plot, the
initial investment is set to $10,000.
Event Log Data Analysis
where the time between occurrences is the difference between:
• the date/time of occurrence of the event, and
• the date/time of restoration of the previous occurrence of that event.
For systems that were new when the collection of the event log data started, the times to first occurrence of every
unique event is equivalent to the date/time of the occurrence of the event minus the time the system monitoring
started. That is:
For systems that were not new when the collection of event log data started, the times to first occurrence of every
unique event are considered to be suspensions (right censored) because the system is assumed to have accumulated
more hours before the data collection period started (i.e., the time between the start date/time and the first occurrence
of an event is not the entire operating time). In this case:
When monitoring on the system is stopped or when the system is no longer being used, all events that have not
occurred by this time are considered to be suspensions.
The four equations given above are valid for cases in which a component operates through the failure of other
components. When the component does not operate through the failures, the downtime of the system due to the
other failures must be taken into account. In other words, the first four equations become:
Repair times are obtained by calculating the difference between the date/time of event occurrence and the date/time
of restoration, or:
All these equations should also take into consideration the periods when the system is not operating or not in use, as
in the case of operations that do not run on a 24/7 basis.
Note that:
• The date and time of each failure is recorded.
• The date and time of repair completion for each failure is recorded.
• The repair involves replacement of the responsible component.
• The responsible component for each failure is recorded.
For this example, we will assume that an engineer began recording these events on January 1, 1997 at 12 PM and
stopped recording on March 18, 1997 at 1 PM, at which time the analysis was performed. Information for events
prior to January 1 is unknown. The objective of the analysis is to obtain the failure and repair distributions for each
component.
Solution
We begin the analysis by looking at component A. The first time that component A is known to have failed is
recorded in row 1 of the data sheet; thus, the first age (or time-to-failure) for A is the difference between the time we
began recording the data and the time when this failure event happened. Also, the component does not age when the
system is down due to the failure of another component. Therefore, this time must be taken into account.
1. The First Time-To-Failure for Component A, TTFA[1]
The first time-to-failure of component A, TTFA[1], is the sum of the hours of operation for each day, starting on the
start date (and time) and ending with the failure date (and time). This is graphically shown next. The boxes with the
green background indicate the operating periods. Thus, TTFA[1] = 5 + 8 = 13 hours.
To illustrate this mathematically, we will use a function that, given a range of times, returns the shift hours worked
during that period. In other words, for this example, the shift hours from 1/1/97 3:00 AM to 1/1/97 6:00 PM equal
9 hours, given an 8 AM to 5 PM shift. Furthermore, we will denote the date and time a failure occurred as DTO
and the date and time a repair was completed as DTR, with a numerical subscript indicating the row that the entry
is in (e.g., DTO4 for the date and time a failure occurred in row 4).
Then the total possible hours (TPH) that component A could have operated from the time it was repaired to the time
it failed the second time is:
TPH = (DTO4 − DTR1) = 9 days × 9 hours + 7:26 hours = 88:26 hours = 88.433 hours
The time that component A was not operating (NOP) during normal hours of operation is the time that the system
was down due to the failure of component B, as presented next.
Since our analysis time ends on March 18, 1997 at 1:00 PM and component A has operated successfully from the
last time it was replaced on March 13, 1997 at 5:13 PM, the additional time of successful operation is:
TPH = S(DTR19, End Time) = 4 days × 9 hours/day + 5:00 hours = 41:00 hours
NOP = S(DTO20, DTR20) = 7:24 hours
Thus, the remaining time that component A operated without failure is:
TTS = TPH − NOP = 41:00 − 7:24 = 33:36 hours = 33.6 hours.
The next two tables show component A's failure and repair data. The entire analysis can be repeated to obtain the
failure and repair times for component B.
Use the Shift Pattern window (Event Log > Action and Settings > Set Shift Pattern) to specify the 8:00 AM to
5:00 PM shifts that occur seven days a week, as shown next.
The folio will automatically convert the equipment downtime log data to time-to-failure and time-to-repair data and
fit failure and repair distributions to the data set. To view the failure and repair results, click the Show Analysis
Summary (...) button on the control panel. The Results window shows the calculated values.
Allowing for the inclusion of an F/E identifier increases the number of items that we will be considering. In other
words, the objective now will be to determine a failure and repair distribution for component A due to failure,
component B due to failure, component A due to an event and component B due to an event. This is mathematically
equivalent to increasing the number of components from two (A and B) to four (FA, EA, FB and EB); thus, the
analysis steps are identical to the ones performed in the first example, but for four components instead of two.
In the Weibull++ 8 event log folio, to consider the F/E identifier in the analysis, the setting in the Analyze Failures and Events area of the control panel must be set to Separately. Selecting the Combined option will ignore the F and E distinction and treat all entries as Fs (i.e., perform the same analysis as in the first example).
Because component A does not operate through other failures, the analysis steps (and results) are identical to the
ones in the first example. However, the analysis for the times-to-failure for component B will be different (and is
actually less cumbersome than in the first example because we do not need to subtract the time that the system was
down due to the failure of other components).
Specifically:
TTFB[1] = S(Start Time, DTO2) = 68:30 = 68.5 hours
TTFB[2] = S(DTR2, DTO3) = 7:30 = 7.5 hours
TTFB[3] = S(DTR3, DTO6) = 41.26 hours
The TTR analysis is not affected by this setting and is identical to the result in the first example.
Analyzing the data for only level 1 will ignore all entries for level 2. However, when performing a level 2 analysis, the entries in both levels 1 and 2 are taken into account, with each combination creating a unique item. More specifically, the analysis for level 2 will
be done at the sub-component level and depend on the component that the sub-component belongs to. Note that this
is equivalent to reducing the analysis to a single level but looking at each component and sub-component
combination individually through the process. In other words, for level 2 analyses, the data set shown above can be
reduced to a single level analysis for four items: A-A1, A-A2, B-B1 and B-B2.
This component and sub-component combination will yield a unique item that must be treated separately, much as
components A and B were in the first example. The same concept applies when more levels are present. When
specifying OTF (Operate Through other Failures) characteristics at this level, each component and sub-component
combination is treated differently. In other words, it is possible that A-A1 and A-A2 have different OTF
designations, even though they belong to the same component; however, all A-A2s must have the same OTF
designation.
Appendices
Least Squares/Rank Regression Equations
Rank Regression on Y
Assume that a set of data pairs (x1, y1), (x2, y2), ..., (xN, yN) were obtained and plotted. Then, according to the least squares principle, which minimizes the vertical distance between the data points and the straight line fitted to the data, the best fitting straight line to these data is the straight line y = â + b̂x, where â and b̂ are the least squares estimates of a and b, and N is the number of data points. To obtain â and b̂, let:

$$F = \sum_{i=1}^{N}\left(a + b x_i - y_i\right)^2 \tag{1}$$

Differentiating F with respect to a and b yields:

$$\frac{\partial F}{\partial a} = 2\sum_{i=1}^{N}\left(a + b x_i - y_i\right) \tag{2}$$

and:

$$\frac{\partial F}{\partial b} = 2\sum_{i=1}^{N}\left(a + b x_i - y_i\right)x_i \tag{3}$$

Setting Eqns. (2) and (3) equal to zero and solving the resulting system simultaneously yields:

$$\hat{a} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - \hat{b}\,\bar{x}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}} \tag{4}$$
Rank Regression on X
Assume that a set of data pairs (x1, y1), (x2, y2), ..., (xN, yN) were obtained and plotted. Then, according to the least squares principle, which minimizes the horizontal distance between the data points and the straight line fitted to the data, the best fitting straight line to these data is the straight line x = â + b̂y. Again, â and b̂ are the least squares estimates of a and b, and N is the number of data points. To obtain â and b̂, let:

$$F = \sum_{i=1}^{N}\left(a + b y_i - x_i\right)^2 \tag{5}$$

Differentiating F with respect to a and b yields:

$$\frac{\partial F}{\partial a} = 2\sum_{i=1}^{N}\left(a + b y_i - x_i\right) \tag{6}$$

and:

$$\frac{\partial F}{\partial b} = 2\sum_{i=1}^{N}\left(a + b y_i - x_i\right)y_i \tag{7}$$

Setting Eqns. (6) and (7) equal to zero and solving simultaneously yields:

$$\hat{a} = \frac{\sum_{i=1}^{N} x_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} y_i}{N} = \bar{x} - \hat{b}\,\bar{y}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}} \tag{8}$$

The fitted line can then be rewritten to express y as a function of x: y = −â/b̂ + (1/b̂)x.
Example
Fit a least squares straight line using regression on X and regression on Y to the following data:
x | 1 | 2.5 | 4 | 6 | 8 | 9 | 11 | 15
y | 1.5 | 2 | 4 | 4 | 5 | 7 | 8 | 10

The sums required by Eqns. (1) through (8) are tabulated below:

N | x_i | y_i | x_i² | x_i·y_i | y_i²
1 | 1 | 1.5 | 1 | 1.5 | 2.25
2 | 2.5 | 2 | 6.25 | 5 | 4
3 | 4 | 4 | 16 | 16 | 16
4 | 6 | 4 | 36 | 24 | 16
5 | 8 | 5 | 64 | 40 | 25
6 | 9 | 7 | 81 | 63 | 49
7 | 11 | 8 | 121 | 88 | 64
8 | 15 | 10 | 225 | 150 | 100
Σ | 56.5 | 41.5 | 550.25 | 387.5 | 276.25
Substituting into the regression-on-X equations:

$$\hat{b} = \frac{387.5 - \frac{(56.5)(41.5)}{8}}{276.25 - \frac{(41.5)^2}{8}} = \frac{94.40625}{60.96875} \approx 1.548$$

and:

$$\hat{a} = \bar{x} - \hat{b}\,\bar{y} = 7.0625 - (1.548)(5.1875) \approx -0.970$$

so the regression-on-X line is x = −0.970 + 1.548y. Repeating the computation with the regression-on-Y equations gives b̂ ≈ 0.624 and â ≈ 0.778, or y = 0.778 + 0.624x.
Note that the regression on Y is not necessarily the same as the regression on X. The only time when the two
regressions are the same (i.e., will yield the same equation for a line) is when the data lie perfectly on a line.
The correlation coefficient is given by:

$$\rho = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sqrt{\left(\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right)\left(\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}\right)}}$$

which for this example gives ρ = 94.40625/√((151.221875)(60.96875)) ≈ 0.983.
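The example can be verified numerically. The following Python sketch computes both regression lines and the correlation coefficient from the tabulated sums; the variable names are illustrative.

```python
import numpy as np

# A minimal sketch verifying the rank regression example above with NumPy.
x = np.array([1, 2.5, 4, 6, 8, 9, 11, 15])
y = np.array([1.5, 2, 4, 4, 5, 7, 8, 10])
N = len(x)

Sxy = (x * y).sum() - x.sum() * y.sum() / N
Sxx = (x**2).sum() - x.sum()**2 / N
Syy = (y**2).sum() - y.sum()**2 / N

# Regression on Y (minimizes vertical distances): y = a + b*x
b_y = Sxy / Sxx
a_y = y.mean() - b_y * x.mean()

# Regression on X (minimizes horizontal distances): x = a + b*y
b_x = Sxy / Syy
a_x = x.mean() - b_x * y.mean()

rho = Sxy / np.sqrt(Sxx * Syy)  # correlation coefficient

print(f"on Y: y = {a_y:.3f} + {b_y:.3f}x")   # y = 0.778 + 0.624x
print(f"on X: x = {a_x:.3f} + {b_x:.3f}y")   # x = -0.970 + 1.548y
print(f"rho = {rho:.3f}")                    # 0.983
```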
Appendix: Maximum Likelihood Estimation Example
Consider a distribution with pdf f(t; θ1, θ2, ..., θk), where θ1, θ2, ..., θk are unknown constant parameters that need to be estimated. To estimate them, conduct an experiment and obtain N independent observations, t1, t2, ..., tN, which in life data analysis correspond to failure times. The likelihood function (for complete data) is given by:

$$L(\theta_1, \ldots, \theta_k \mid t_1, \ldots, t_N) = \prod_{i=1}^{N} f(t_i; \theta_1, \theta_2, \ldots, \theta_k)$$

The logarithmic likelihood function is Λ = ln L, and the maximum likelihood estimators of θ1, ..., θk are obtained by maximizing L or, equivalently, Λ.
Even though it is common practice to plot MLE solutions using median ranks (the points are plotted according to median ranks and the line according to the MLE solution), this is not completely accurate. As can be seen from the equations above, the MLE method is independent of any kind of ranks. For this reason, the MLE solution often appears not to track the data on the probability plot. This is perfectly acceptable, since the two methods are independent of each other, and in no way suggests that the solution is wrong.
Notes About θ̂
Note that the value of θ̂ is an estimate: if we obtain another sample from the same population and re-estimate θ, the new value will differ from the one previously calculated. In plain language, θ̂ is an estimate of the true value of θ. How close is the value of our estimate to the true value? To answer this question, one must first determine the distribution of the parameter estimate, in this case of θ̂. This methodology introduces a new term, the confidence bound, which allows us to specify a range for our estimate with a certain confidence level. The treatment of confidence bounds is integral to reliability engineering, and to all of statistics. (Confidence bounds are covered in Confidence Bounds.)
If t1, t2, ..., tN are known times-to-failure (and with no suspensions), then the likelihood function for the normal distribution is given by:

$$L = \prod_{i=1}^{N} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t_i - \mu}{\sigma}\right)^2}$$

then:

$$\Lambda = \ln L = -\frac{N}{2}\ln(2\pi) - N\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{N}\left(t_i - \mu\right)^2$$

Then taking the partial derivatives of Λ with respect to each one of the parameters and setting them equal to zero yields:

$$\frac{\partial \Lambda}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{N}\left(t_i - \mu\right) = 0$$

and:

$$\frac{\partial \Lambda}{\partial \sigma} = -\frac{N}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{N}\left(t_i - \mu\right)^2 = 0$$

Solving these simultaneously gives:

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} t_i \qquad \text{and} \qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - \hat{\mu}\right)^2$$
It should be noted that these solutions are valid only for data with no suspensions (i.e., all units are tested to failure). In the case where suspensions are present, or not all units are tested to failure, the methodology changes and the problem becomes much more complicated.
Illustrating with an Example of the Normal Distribution
If we had five units that failed at 10, 20, 30, 40 and 50 hours, the mean would be:

$$\hat{\mu} = \frac{10+20+30+40+50}{5} = 30 \text{ hours}$$

and the standard deviation estimate would be:

$$\hat{\sigma} = \sqrt{\frac{(10-30)^2 + (20-30)^2 + (30-30)^2 + (40-30)^2 + (50-30)^2}{5}} = \sqrt{200} \approx 14.142 \text{ hours}$$

A look at the likelihood function surface plot in the figure below reveals that both of these values are the maximum values of the function.
This three-dimensional plot represents the likelihood function. As can be seen from the plot, the maximum
likelihood estimates for the two parameters correspond with the peak or maximum of the likelihood function surface.
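This example is easy to reproduce numerically. The sketch below, which assumes SciPy is available, computes the closed-form estimates and confirms by direct maximization that they correspond to the peak of the likelihood surface.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

# A sketch reproducing the normal MLE example above (complete data only).
t = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Closed-form MLE solutions for complete data (note the 1/N, not 1/(N-1)).
mu_hat = t.mean()                                # 30.0
sigma_hat = np.sqrt(((t - mu_hat) ** 2).mean())  # ~14.142

# Direct numerical maximization of the log-likelihood finds the same peak
# as the surface plot discussed in the text.
def neg_log_likelihood(params):
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    return -norm.logpdf(t, loc=mu, scale=sigma).sum()

res = minimize(neg_log_likelihood, x0=[20.0, 10.0], method="Nelder-Mead")
print(mu_hat, sigma_hat)  # 30.0 14.1421...
print(res.x)              # approximately [30.0, 14.14]
```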
Appendix: Special Analysis Methods
The Mathematics
Median ranks are used to obtain an estimate of the unreliability, Q̂(Tj), for each failure at a 50% confidence level. In the case of grouped data, the ranks are estimated for each group of failures, instead of for each individual failure. For example, consider a group of 10 failures at 100 hours, 10 at 200 hours and 10 at 300 hours. Weibull++ estimates the median ranks (Z values) by solving the cumulative binomial equation with the appropriate values for the order number and the total number of test units. For the 10 failures at 100 hours, the median rank Z is estimated by using:

$$0.50 = \sum_{k=10}^{30} \binom{30}{k}\, Z^{k}\left(1-Z\right)^{30-k}$$

with an order number of 10 and N = 30 units. One Z value is obtained for the group, representing the probability of 10 failures occurring out of 30 units. For the 10 failures at 200 hours, Z is estimated by using:

$$0.50 = \sum_{k=20}^{30} \binom{30}{k}\, Z^{k}\left(1-Z\right)^{30-k}$$

with an order number of 20 (the cumulative number of failures by 200 hours) and N = 30 units.
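Median ranks of this kind can be obtained by root-finding on the cumulative binomial equation. A minimal Python sketch, with an illustrative function name:

```python
from scipy.optimize import brentq
from scipy.stats import binom

# A sketch of solving the cumulative binomial equation for a median rank:
# find Z such that P(X >= order) = 0.50 for X ~ Binomial(n_total, Z).
def median_rank(order: int, n_total: int, cl: float = 0.50) -> float:
    # P(X >= order) = 1 - CDF(order - 1)
    f = lambda z: (1.0 - binom.cdf(order - 1, n_total, z)) - cl
    return brentq(f, 1e-9, 1 - 1e-9)

# Grouped example from the text: orders 10, 20 and 30 out of 30 units.
for order in (10, 20, 30):
    print(order, round(median_rank(order, 30), 4))
```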
The method is best illustrated with an example. Consider the following data set:

Table B.1 - The Data Set
Number of Items | Type | Last Inspection | Time
1 | Exact Failure | | 10
1 | Right Censored | | 20
2 | Left Censored | 0 | 30
2 | Exact Failure | | 40
1 | Exact Failure | | 50
1 | Right Censored | | 60
1 | Left Censored | 0 | 70
2 | Interval Failure | 20 | 80
1 | Interval Failure | 10 | 85
Table B.2 - The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures
Number of Items | Type | Time
1 | Exact Failure | 10
2 | Exact Failure | 40
1 | Exact Failure | 50
Step 1
For all intervals, we obtain a weighted midpoint, based upon the current estimates of the Weibull parameters β and η.
Table B.3 - The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures, Based upon the Parameters β and η
Number of Items | Type | Last Inspection | Time | Weighted "Midpoint"
1 | Exact Failure | | 10 |
2 | Exact Failure | | 40 |
1 | Exact Failure | | 50 |
2 | Interval Failure | 20 | 80 | 42.837
1 | Interval Failure | 10 | 85 | 39.169
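One way to compute such a weighted midpoint, which yields values close to the 39.169 and 42.837 appearing in the tables that follow, is as the conditional expected failure time within the interval under the current Weibull fit; the Python sketch below assumes this form.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import weibull_min

# A sketch of a weighted interval midpoint, assuming it is the conditional
# expected failure time within [t_lo, t_hi] under the current Weibull fit.
# (This is an assumption; the exact Weibull++ weighting is not shown here.)
def weighted_midpoint(t_lo, t_hi, beta, eta):
    pdf = lambda t: weibull_min.pdf(t, beta, scale=eta)
    num, _ = quad(lambda t: t * pdf(t), t_lo, t_hi)
    den, _ = quad(pdf, t_lo, t_hi)
    return num / den

# Intervals from the example, with the first-iteration parameter estimates.
beta, eta = 1.845638, 42.576422
print(weighted_midpoint(20, 80, beta, eta))  # near the 42.8 shown in Table B.4
print(weighted_midpoint(10, 85, beta, eta))  # near the 39.2 shown in Table B.4
```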
Step 2
Now we arrange the data as in Table B.4.
Table B.4 - The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures, in Ascending Order
Number of Items | Time
1 | 10
1 | 39.169
2 | 40
2 | 42.837
1 | 50
Step 3
We now consider the left and right censored data, as in Table B.5.
Table B.5 - Computation of Increments in a Matrix Format for Computing a Revised Mean Order Number
Number of Items | Time of Failure | 2 Left Censored, t = 30 | 1 Left Censored, t = 70 | 1 Left Censored, t = 100 | 1 Right Censored, t = 20 | 1 Right Censored, t = 60
1 | 10 | · | · | · | 0 | 0
1 | 39.169 | 0 | · | · | · | 0
2 | 40 | 0 | · | · | · | 0
2 | 42.837 | 0 | · | · | · | 0
1 | 50 | 0 | · | · | · | 0

(Entries shown as · are the non-zero increment terms; the zero entries follow from the rules given next.)
where the increment contributed by a group of left censored items depends on t_{i-1}, the time-to-failure previous to the time-to-failure t_i, and on n_i, the number of units associated with that time-to-failure (or units in the group).

In general, for right censored data:
• The increment term for a right censored item at time t_j, with a time-to-failure of t_i, is zero when t_i ≤ t_j.
• When t_i > t_j, the contribution is non-zero, and is likewise expressed in terms of t_{i-1}, the time-to-failure previous to the time-to-failure t_i, and n_i, the number of units associated with that time-to-failure (or units in the group).
Step 4
Sum up the increments (horizontally in rows), as in Table B.6.
Table B.6 - Increments Solved Numerically, Along with the Sum of Each Row
Step 5
Compute new mean order numbers (MON), as shown in Table B.7, utilizing the increments obtained in Table B.6: each MON is the number of items plus the previous MON plus the current increment.

Table B.7 - Computation of the Mean Order Numbers (MON)
Number of Items | Time | Increment | MON
1 | 10 | 0.419411 | 1.419411
2 | 40 | 0.048630 | 7.651035
1 | 50 | 0.361540 | 11.173181
Step 6
Compute the median ranks based on these new MONs as shown in Table B.8.
Table B.8 - Mean Order Numbers with Their Ranks for a Sample Size of 13 Units
Time | MON | Median Rank
10 | 1.419411 | 0.0825889
40 | 7.651035 | 0.5487781
50 | 11.173181 | 0.8124983
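Because the mean order numbers are not integers, the corresponding median ranks can be computed from the inverse of the beta distribution (the rank of the i-th order statistic in a sample of n follows a Beta(i, n − i + 1) distribution). A minimal Python sketch, which should closely reproduce the values in Table B.8:

```python
from scipy.stats import beta

# A sketch of computing a median rank for a (possibly non-integer) mean
# order number (MON) in a sample of size n, using the inverse beta CDF:
# MR = BetaInv(0.50; MON, n - MON + 1).
def median_rank_from_mon(mon: float, n: int) -> float:
    return beta.ppf(0.50, mon, n - mon + 1)

for mon in (1.419411, 7.651035, 11.173181):
    print(round(median_rank_from_mon(mon, 13), 7))  # compare with Table B.8
```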
Step 7
Compute new β̂ and η̂ using standard rank regression and based upon the data as shown in Table B.9.
Time | Median Rank
10 | 0.0826889
39.169 | 0.3952894
40 | 0.5487781
42.837 | 0.7106217
50 | 0.8124983
Step 8
Return to Step 1 and repeat the process until an acceptable convergence is reached on the parameters (i.e., the parameter values stabilize).
Results
The results of the first five iterations, obtained using Weibull++ with rank regression on X, are shown in Table B.10.
Table B.10 - Parameter Estimates for the First Five Iterations
Iteration | β̂ | η̂
1 | 1.845638 | 42.576422
2 | 1.830621 | 42.039743
3 | 1.828010 | 41.830615
4 | 1.828030 | 41.749708
5 | 1.828383 | 41.717990
Appendix: Log-Likelihood Equations
The Two-Parameter Weibull Distribution
The log-likelihood function for the two-parameter Weibull distribution is:

$$\Lambda = \sum_{i=1}^{F_e} N_i \ln\!\left[\frac{\beta}{\eta}\left(\frac{T_i}{\eta}\right)^{\beta-1} e^{-\left(\frac{T_i}{\eta}\right)^{\beta}}\right] - \sum_{i=1}^{S} N_i'\left(\frac{T_i'}{\eta}\right)^{\beta} + \sum_{i=1}^{FI} N_i'' \ln\!\left[e^{-\left(\frac{T_{Li}''}{\eta}\right)^{\beta}} - e^{-\left(\frac{T_{Ri}''}{\eta}\right)^{\beta}}\right]$$

where:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• β is the Weibull shape parameter (unknown a priori, the first of two parameters to be found)
• η is the Weibull scale parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval failure data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

For the purposes of MLE, left censored data will be considered to be intervals with T_Li'' = 0.

The solution will be found by solving for a pair of parameters (β̂, η̂) so that ∂Λ/∂β = 0 and ∂Λ/∂η = 0. It should be noted that other methods can also be used, such as direct maximization of the likelihood function, without having to compute the derivatives.
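The closing remark about direct maximization is easy to illustrate. The sketch below maximizes the two-parameter Weibull log-likelihood numerically with SciPy; the grouped data are illustrative, not taken from the text, and each term mirrors one summation in the equation above.

```python
import numpy as np
from scipy.optimize import minimize

# A sketch of direct maximization of the two-parameter Weibull likelihood
# with grouped failures, suspensions and intervals (illustrative data).
failures    = [(3, 100.0), (2, 250.0)]   # (N_i, T_i)
suspensions = [(4, 300.0)]               # (N_i', T_i')
intervals   = [(2, 150.0, 400.0)]        # (N_i'', T_Li'', T_Ri'')

def neg_log_likelihood(params):
    beta, eta = params
    if beta <= 0 or eta <= 0:
        return np.inf
    L = 0.0
    for n, t in failures:       # complete data: ln f(T_i)
        L += n * (np.log(beta / eta) + (beta - 1) * np.log(t / eta)
                  - (t / eta) ** beta)
    for n, t in suspensions:    # right censored: ln R(T_i') = -(T_i'/eta)^beta
        L += -n * (t / eta) ** beta
    for n, tl, tr in intervals: # interval: ln[R(T_Li'') - R(T_Ri'')]
        L += n * np.log(np.exp(-(tl / eta) ** beta)
                        - np.exp(-(tr / eta) ** beta))
    return -L

res = minimize(neg_log_likelihood, x0=[1.0, 200.0], method="Nelder-Mead")
print(res.x)  # (beta_hat, eta_hat)
```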
The Three-Parameter Weibull Distribution
The log-likelihood function for the three-parameter Weibull distribution takes the same form as that of the two-parameter distribution, with each time replaced by (T − γ), where:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• β is the Weibull shape parameter (unknown a priori, the first of three parameters to be found)
• η is the Weibull scale parameter (unknown a priori, the second of three parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• γ is the Weibull location parameter (unknown a priori, the third of three parameters to be found)
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

It should be pointed out that the solution to the three-parameter Weibull via MLE is not always stable and can collapse if β ≈ 1. In estimating the true MLE of the three-parameter Weibull distribution, two difficulties arise. The first is a problem of non-regularity and the second is the parameter divergence problem, as discussed in Hirose [14]. Non-regularity occurs when β ≤ 2. In general, there are no MLE solutions in the region of 0 < β < 1. When 1 < β < 2, MLE solutions exist but are not asymptotically normal, as discussed in Hirose [14]. In the case of non-regularity, the solution is treated anomalously.

Weibull++ attempts to find a solution in all of the regions using a variety of methods, but the user should be forewarned that not all possible data can be addressed. Thus, some solutions using MLE for the three-parameter Weibull will fail when the algorithm has reached predefined limits or fails to converge. In these cases, the user can change to the non-true MLE approach (in the Weibull++ Application Setup), where γ̂ is estimated using non-linear regression. Once γ̂ is obtained, the MLE estimates of β and η are computed using the transformed times T_i − γ̂.
The One-Parameter Exponential Distribution
The log-likelihood function for the one-parameter exponential distribution is:

$$\Lambda = \sum_{i=1}^{F_e} N_i \ln\!\left(\lambda\, e^{-\lambda T_i}\right) - \sum_{i=1}^{S} N_i'\, \lambda T_i' + \sum_{i=1}^{FI} N_i'' \ln\!\left(e^{-\lambda T_{Li}''} - e^{-\lambda T_{Ri}''}\right)$$

where:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• λ is the failure rate parameter (unknown a priori, the only parameter to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

The solution will be found by solving for a parameter λ̂ so that ∂Λ/∂λ = 0. Note that when there are no interval data (FI = 0), there exists a closed form solution:

$$\hat{\lambda} = \frac{\sum_{i=1}^{F_e} N_i}{\sum_{i=1}^{F_e} N_i T_i + \sum_{i=1}^{S} N_i' T_i'}$$
The Two-Parameter Exponential Distribution
The log-likelihood function for the two-parameter exponential distribution takes the same form, with each time replaced by (T − γ), where:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• λ is the failure rate parameter (unknown a priori, the first of two parameters to be found)
• γ is the location parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

Examination of the partial derivatives ∂Λ/∂λ and ∂Λ/∂γ reveals an unwelcome fact, alluded to earlier in the chapter: there is essentially no realistic solution for the two-parameter exponential via MLE, because there is no non-trivial pair (λ̂, γ̂) that satisfies both ∂Λ/∂λ = 0 and ∂Λ/∂γ = 0 simultaneously. It can be shown that the best solution for γ, satisfying the constraint γ ≤ T1, is γ̂ = T1. To then solve for the two-parameter exponential distribution via MLE, one can set γ̂ equal to the first time-to-failure, and then find a λ̂ such that ∂Λ/∂λ = 0.

Using this methodology, a maximum is achieved along the λ-axis, and a local maximum along the γ-axis at γ = T1, constrained by the fact that γ ≤ T1. The 3D Plot utility in Weibull++ can be used to visualize this behavior of the log-likelihood function.
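For complete data, the constrained approach reduces to a closed form: with γ̂ = T1 fixed, ∂Λ/∂λ = 0 gives λ̂ = N / Σ(Ti − γ̂). A minimal sketch with illustrative failure times:

```python
import numpy as np

# A sketch of the constrained two-parameter exponential MLE described above:
# fix gamma at the first time-to-failure, then solve for lambda.
# With complete data, dLambda/dlambda = 0 gives a closed form:
#   lambda_hat = N / sum(T_i - gamma_hat)
t = np.array([12.0, 20.0, 34.0, 65.0, 88.0, 120.0])  # illustrative failure times

gamma_hat = t.min()                       # gamma constrained to the first failure
lam_hat = len(t) / (t - gamma_hat).sum()  # MLE of lambda given gamma_hat
print(gamma_hat, lam_hat)
```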
The Normal Distribution
The log-likelihood function for the normal distribution is:

$$\Lambda = \sum_{i=1}^{F_e} N_i \ln\!\left[\frac{1}{\sigma}\,\phi\!\left(\frac{T_i-\mu}{\sigma}\right)\right] + \sum_{i=1}^{S} N_i' \ln\!\left[1-\Phi\!\left(\frac{T_i'-\mu}{\sigma}\right)\right] + \sum_{i=1}^{FI} N_i'' \ln\!\left[\Phi\!\left(\frac{T_{Ri}''-\mu}{\sigma}\right)-\Phi\!\left(\frac{T_{Li}''-\mu}{\sigma}\right)\right]$$

where φ and Φ are the standard normal pdf and cdf, respectively, and:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• μ is the mean parameter (unknown a priori, the first of two parameters to be found)
• σ is the standard deviation parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

The solution will be found by solving for a pair of parameters (μ̂, σ̂) so that ∂Λ/∂μ = 0 and ∂Λ/∂σ = 0.
Complete Data
Note that for the normal distribution, and in the case of complete data only (as was shown in Basic Statistical Background), there exists a closed-form solution for both of the parameters, or:

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} T_i$$

and:

$$\hat{\sigma} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(T_i - \hat{\mu}\right)^2}$$
The Lognormal Distribution
The log-likelihood function for the lognormal distribution is:

$$\Lambda = \sum_{i=1}^{F_e} N_i \ln\!\left[\frac{1}{\sigma' T_i}\,\phi\!\left(\frac{\ln T_i-\mu'}{\sigma'}\right)\right] + \sum_{i=1}^{S} N_i' \ln\!\left[1-\Phi\!\left(\frac{\ln T_i'-\mu'}{\sigma'}\right)\right] + \sum_{i=1}^{FI} N_i'' \ln\!\left[\Phi\!\left(\frac{\ln T_{Ri}''-\mu'}{\sigma'}\right)-\Phi\!\left(\frac{\ln T_{Li}''-\mu'}{\sigma'}\right)\right]$$

where φ and Φ are the standard normal pdf and cdf, respectively, and:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• μ' is the mean of the natural logarithms of the times-to-failure (unknown a priori, the first of two parameters to be found)
• σ' is the standard deviation of the natural logarithms of the times-to-failure (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

The solution will be found by solving for a pair of parameters (μ̂', σ̂') so that ∂Λ/∂μ' = 0 and ∂Λ/∂σ' = 0.
The Mixed Weibull Distribution
In the log-likelihood function for the mixed Weibull distribution:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• Q is the number of subpopulations
• ρ_j is the proportionality of the j-th subpopulation (unknown a priori, the first set of three sets of parameters to be found)
• β_j is the Weibull shape parameter of the j-th subpopulation (unknown a priori, the second set of three sets of parameters to be found)
• η_j is the Weibull scale parameter of the j-th subpopulation (unknown a priori, the third set of three sets of parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of groups of interval data points
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval
The solution is found by solving for the parameters ρ̂_j, β̂_j and η̂_j so that the partial derivatives of the log-likelihood function with respect to each of them are equal to zero.
The Logistic Distribution
The log-likelihood function for the logistic distribution is:

$$\Lambda = \sum_{i=1}^{F_e} N_i \ln\!\left[\frac{e^{z_i}}{\sigma\left(1+e^{z_i}\right)^2}\right] + \sum_{i=1}^{S} N_i' \ln\!\left[\frac{1}{1+e^{z_i'}}\right] + \sum_{i=1}^{FI} N_i'' \ln\!\left[\frac{1}{1+e^{z_{Li}''}} - \frac{1}{1+e^{z_{Ri}''}}\right]$$

where z = (T − μ)/σ for the corresponding failure, suspension or interval time, and:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• μ is the logistic shape parameter (unknown a priori, the first of two parameters to be found)
• σ is the logistic scale parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval failure data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

For the purposes of MLE, left censored data will be considered to be intervals with T_Li'' = 0.

The solution of the maximum log-likelihood function is found by solving for (μ̂, σ̂) so that ∂Λ/∂μ = 0 and ∂Λ/∂σ = 0.
The Loglogistic Distribution
The log-likelihood function for the loglogistic distribution takes the same form as that of the logistic distribution, with z computed from the natural logarithms of the times, z = (ln T − μ')/σ', where:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• μ' is the loglogistic shape parameter (unknown a priori, the first of two parameters to be found)
• σ' is the loglogistic scale parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval failure data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

For the purposes of MLE, left censored data will be considered to be intervals with T_Li'' = 0.

The solution of the maximum log-likelihood function is found by solving for (μ̂', σ̂') so that ∂Λ/∂μ' = 0 and ∂Λ/∂σ' = 0.
The Gumbel Distribution
The log-likelihood function for the Gumbel distribution is:

$$\Lambda = \sum_{i=1}^{F_e} N_i \ln\!\left[\frac{1}{\sigma}\,e^{z_i - e^{z_i}}\right] - \sum_{i=1}^{S} N_i'\, e^{z_i'} + \sum_{i=1}^{FI} N_i'' \ln\!\left[e^{-e^{z_{Li}''}} - e^{-e^{z_{Ri}''}}\right]$$

where z = (T − μ)/σ for the corresponding failure, suspension or interval time, and:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• μ is the Gumbel shape parameter (unknown a priori, the first of two parameters to be found)
• σ is the Gumbel scale parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval failure data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

For the purposes of MLE, left censored data will be considered to be intervals with T_Li'' = 0.

The solution of the maximum log-likelihood function is found by solving for (μ̂, σ̂) so that ∂Λ/∂μ = 0 and ∂Λ/∂σ = 0.
The Gamma Distribution
In the log-likelihood function for the gamma distribution:
• F_e is the number of groups of times-to-failure data points
• N_i is the number of times-to-failure in the i-th time-to-failure data group
• k is the gamma shape parameter (unknown a priori, the first of two parameters to be found)
• μ is the gamma scale parameter (unknown a priori, the second of two parameters to be found)
• T_i is the time of the i-th group of time-to-failure data
• S is the number of groups of suspension data points
• N_i' is the number of suspensions in the i-th group of suspension data points
• T_i' is the time of the i-th suspension data group
• FI is the number of interval failure data groups
• N_i'' is the number of intervals in the i-th group of data intervals
• T_Li'' is the beginning of the i-th interval
• T_Ri'' is the ending of the i-th interval

For the purposes of MLE, left censored data will be considered to be intervals with T_Li'' = 0.

The solution of the maximum log-likelihood function is found by solving for (k̂, μ̂) so that ∂Λ/∂k = 0 and ∂Λ/∂μ = 0.
Appendix: Life Data Analysis References
27. Meeker, W.Q., and Escobar, L.A., Statistical Methods for Reliability Data, John Wiley & Sons, Inc., New York,
1998.
28. Mettas, A., and Zhao, Wenbiao, "Modeling and Analysis of Repairable Systems with General Repair," 2005
Proceedings Annual Reliability and Maintainability Symposium, Alexandria, Virginia, 2005.
29. Montgomery, Douglas C., Design and Analysis of Experiments, John Wiley & Sons, Inc., New York, 1991.
30. Nelson, Wayne, Applied Life Data Analysis, John Wiley & Sons, Inc., New York, 1982.
31. Nelson, Wayne, Recurrent Events Data Analysis for Product Repairs, Disease Recurrences, and Other
Applications, ASA-SIAM, 2003.
32. NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/,
September, 2005.
33. Perry, J. N., "Semiconductor Burn-in and Weibull Statistics," Semiconductor Reliability, Vol. 2, Engineering
Publishers, Elizabeth, N.J., pp. 8-90, 1962.
34. Procassini, A. A., and Romano, A., "Transistor Reliability Estimates Improve with Weibull Distribution
Function," Motorola Military Products Division, Engineering Bulletin, Vol. 9, No. 2, pp. 16-18, 1961.
35. Weibull, Wallodi, "A Statistical Representation of Fatigue Failure in Solids," Transactions of the Royal Institute
of Technology, No. 27, Stockholm, 1949.
36. Weibull, Wallodi, "A Statistical Distribution Function of Wide Applicability," Journal of Applied Mechanics,
Vol. 18, pp. 293-297, 1951.
37. Wingo, Dallas R., "Solution of the Three-Parameter Weibull Equations by Constrained Modified
Quasilinearization (Progressively Censored Samples)," IEEE Transactions on Reliability, Vol. R-22, No. 2, pp.
96-100, June 1973.
38. Guo, Huairui, Jin, Tongdan, and Mettas, Adamantios. "Design Reliability Demonstration Tests for One-Shot
Systems Under Zero Component Failure," IEEE Transactions on Reliability, Vol. 60, No. 1, pp. 286-294, March
2011.
39. Hirose, H., "Bias Correction for the Maximum Likelihood Estimation in Two-parameter Weibull Distribution,"
IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 6, No. 1, February 1999.
40. Ross, R., "Bias and Standard Deviation Due to Weibull Parameter Estimation for Small Data Sets," IEEE
Transactions on Dielectrics and Electrical Insulation, Vol. 3, No. 1, February 1996.