Robust CUSUM Control Charting PDF
INTRODUCTION
Statistical process control is a collection of techniques that help
differentiate between common cause and special cause variations in the
response of a quality characteristic of interest in a process. Of these
techniques, the control chart is the most important. It is used to monitor
the parameters of a process, such as its location and spread. Some major
classifications of control charts are variable and attribute charts,
univariate and multivariate charts, and memoryless charts (Shewhart type)
and memory charts (such as cumulative sum [CUSUM] and exponentially
weighted moving average [EWMA]). The commonly used Shewhart variable
control charts are the mean ($\bar{X}$), median, and mid-range charts for
monitoring the process location and the range (R), standard deviation (S),
and variance ($S^2$) charts for monitoring the process variability (cf.
Montgomery 2009). The main deficiency of Shewhart-type control charts is
that they are less sensitive to small and moderate shifts in the process
parameter(s).
Address correspondence to Dr. Muhammad Riaz, Assistant Professor,
Department of Mathematics and Statistics, King Fahd University of
Petroleum and Minerals, P.O. Box 1737, Dhahran 31261, Saudi Arabia.
E-mail: [email protected]

An alternative approach for detecting small shifts is to use memory
control charts. CUSUM control charts proposed by Page (1954) and EWMA
control charts proposed by Roberts (1959) are two commonly used
memory-type control charts. These charts are designed such that they use
the past information along with the current information, which makes them
very sensitive to small and moderate shifts in the process parameters.
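To make the idea of accumulating past information concrete, the following is a minimal sketch of the two-sided tabular CUSUM in its standard textbook form. It is not taken from this article: the function name `tabular_cusum`, the argument names, and the data are hypothetical, and the reference value `k` and decision interval `h` are assumed to be expressed in the same units as the observations.

```python
def tabular_cusum(xs, target, k, h):
    """Two-sided tabular CUSUM (standard textbook form, names hypothetical).

    xs     : sequence of observations or subgroup statistics
    target : in-control location
    k      : reference value (allowance), same units as xs
    h      : decision interval, same units as xs
    Returns the list of (upper, lower) sums and the 1-based index of the
    first signal, or None if the chart never signals.
    """
    c_plus, c_minus = 0.0, 0.0
    path, first_signal = [], None
    for i, x in enumerate(xs, start=1):
        # Each sum carries over its previous value, so small deviations
        # in the same direction build up over time.
        c_plus = max(0.0, c_plus + (x - target) - k)
        c_minus = max(0.0, c_minus - (x - target) - k)
        path.append((c_plus, c_minus))
        if first_signal is None and (c_plus > h or c_minus > h):
            first_signal = i
    return path, first_signal


# Hypothetical data: five values near target, then a sustained shift of about +1.
data = [0.1, -0.2, 0.0, 0.3, -0.1, 1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8]
path, signal_at = tabular_cusum(data, target=0.0, k=0.5, h=2.0)
# signal_at == 9
```

In this toy run the upper sum first exceeds `h` at the ninth observation, the fourth one after the shift begins, even though no single value is far from target.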
In order to obtain efficient control limits for phase II monitoring, it is
generally assumed that the process has a stable behavior when the data are
collected during phase I (cf. Vining 2009; Woodall and Montgomery 1999).
Most of the evaluations of existing control charts depend on the
assumptions of normality, no contaminations, no outliers, and no
measurement errors in phase I for the quality characteristic of interest.
In case of violation of these assumptions, the design structures of the
charts lose their performance ability and hence are of less practical use.
There are many practical situations where non-normality is more common
(see, for example, Janacek and Miekle 1997). One of the solutions to deal
with this is to use control charts that are robust against violations of
the basic assumptions, like normality.

To date, numerous papers on robust control charts have been published.
Langenberg and Iglewicz (1986) suggested using the trimmed mean $\bar{X}$
and R charts. Rocke (1992) proposed the plotting of $\bar{X}$ and R charts
with limits determined by the mean of the subgroup interquartile ranges
and showed that this method resulted in easier detection of outliers and
greater sensitivity to other forms of out-of-control behavior when
outliers are present. Tatum (1997) suggested an interesting method for
robust estimation of the process standard deviation for control charts.
Moustafa and Mokhtar (1999) proposed a robust control chart for location
that uses the Hodges-Lehmann and the Shamos-Bickel-Lehmann estimators as
estimates of location and scale parameters, respectively. Wu et al. (2002)
studied the median absolute deviations–based estimators and their
application to the $\bar{X}$ chart. Moustafa (2009) modified the Shewhart
chart by introducing the median as a robust estimator for location and
absolute deviations to median as a robust estimator for dispersion.
Recently, properties and effects of violations of ideal assumptions (e.g.,
normality, outlier-free environment, no special causes, etc.) on the
control charts have been studied in detail by Riaz (2008) and Schoonhoven
et al. (2011a, 2011b).

Some authors have also discussed the robustness of the CUSUM chart to
situations where the underlying assumptions are not fulfilled. Lucas and
Crosier (1982) studied the robustness of the standard CUSUM chart and
proposed four methods to reduce the effect of outliers on the average run
length (ARL) performance. McDonald (1990) proposed the use of the CUSUM
chart that is based on a nonparametric statistic. He used the idea of
ranking the observations first and then using those ranks in the CUSUM
structure. Hawkins (1993) proposed a robust CUSUM chart for individual
observations based on Winsorization. MacEachern et al. (2007) proposed a
robust CUSUM chart based on the likelihood of the variate and named their
newly proposed chart the RLCUSUM chart. Li et al. (2010) proposed a
nonparametric CUSUM chart based on the Wilcoxon rank-sum test. Reynolds
and Stoumbos (2010) considered the robustness of the CUSUM chart for
monitoring the process location and dispersion simultaneously. Midi and
Shabbak (2011) studied the robust CUSUM control charting for the
multivariate case. Lee (2011) proposed the economic design of the CUSUM
chart for non-normally serially correlated data. S. F. Yang and Cheng
(2011) proposed a new nonparametric CUSUM chart that is based on the sign
test. They studied the ARL performance of the proposed chart for
monitoring different location parameters. Similarly, much work has been
done in the direction of robust EWMA control charting; for example, see
Amin and Searcy (1991), S. F. Yang et al. (2011), and Graham et al. (2012).

Most of the CUSUM control charting techniques discussed above are based on
first transforming the observed data into a nonparametric statistic and
then applying the CUSUM chart on that transformed statistic. Unlike these
approaches, L. Yang et al. (2010) proposed the use of a robust location
estimator (i.e., the sample median) with the CUSUM control structure.
Extending their approach, in this article we present a robust CUSUM chart
that is based on five different estimators for monitoring the process
location of phase II samples. The performance of the CUSUM chart with
different robust estimators is studied in the presence of disturbances to
normality, contaminations, outliers, and special causes in the process of
interest. Before moving on toward the robust estimators, we provide the
basic structure of the CUSUM chart in the next section.

THE CLASSICAL MEAN CUSUM CONTROL CHART
The mean CUSUM control chart proposed by Page (1954) has become one of the
most popular methods
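As an illustration of the five subgroup location estimators that label the tables in this article (mean, median, mid-range, HL, TM), the sketch below computes each one for a single subgroup. Several points are assumptions, because this excerpt does not define the estimators: HL is read as the Hodges-Lehmann estimator (median of pairwise Walsh averages, pairs with i <= j), TM is read as Tukey's trimean, and the quartiles use Python's "inclusive" rule. All function names and the data are hypothetical.

```python
import statistics
from itertools import combinations_with_replacement


def midrange(xs):
    # Average of the smallest and largest observation.
    return (min(xs) + max(xs)) / 2


def hodges_lehmann(xs):
    # ASSUMPTION: HL = median of all Walsh averages (x_i + x_j) / 2, i <= j.
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(xs, 2)]
    return statistics.median(walsh)


def trimean(xs):
    # ASSUMPTION: TM = Tukey's trimean (Q1 + 2*Q2 + Q3) / 4, with quartiles
    # from the "inclusive" method; other quartile rules give other values.
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


# Hypothetical subgroup with one gross outlier.
subgroup = [1, 2, 3, 4, 100]
estimates = {
    "mean": statistics.mean(subgroup),
    "median": statistics.median(subgroup),
    "mid-range": midrange(subgroup),
    "HL": hodges_lehmann(subgroup),
    "TM": trimean(subgroup),
}
```

For this subgroup the median, HL, and TM all equal 3, while the mean (22) and the mid-range (50.5) are pulled toward the outlier.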
the proposed CUSUM charts is positively skewed as long as there is some
variation in the run lengths. By decreasing the value of $k_{\hat{\theta}}$
from 0.5 to 0.25, the standard deviation of all of the proposed charts
also decreases for small values of $\delta$. The percentiles can also be
used to compare the median run length (MRL) for CUSUM charts with
different estimators.

Lucas and Crosier (1982) proposed the use of a fast initial response (FIR)
feature with the CUSUM charts in which they recommended not to set the
initial values of CUSUM statistics equal to zero. They found that the
choice of a head start $S_0 = H_{\hat{\theta}}/2$ is optimal in the sense
that its effect on the ARL0 is very minor but it significantly decreases
the ARL1 values. The intention of FIR was to enhance the CUSUM chart
sensitivity in detecting the shifts that occur immediately after the start
of the process. As an example, we provide the ARL values of FIR CUSUM
based on different estimators for $n = 10$, $k_{\hat{\theta}} = 0.5$, and
$S_0 = H_{\hat{\theta}}/2$ at ARL0 $\cong$ 370. The results are given in
Table 5, where the uncontaminated normal distribution is taken as the
parent environment.

It is gratifying to note that the FIR feature indeed enhances the
performance of the proposed CUSUM charts. In the upcoming sections we will
not give the additional tables for the SDRL, percentile points of the run
length, and the FIR features, although they can be easily obtained along
the same lines.

TABLE 2 ARL Values for the CUSUM Chart Based on Different Estimators under
Uncontaminated Normal Distribution with $k_{\hat{\theta}} = 0.5$ and
$h_{\hat{\theta}} = 4.774$

$k_{\hat{\theta}} = 0.25$, $h_{\hat{\theta}} = 8.03$
  Mean        360.523    6.590   1.902   0.994   0.641   0.437   0.164
  Median      356.984    9.085   2.477   1.264   0.824   0.533   0.241
  Mid-range   360.781   12.457   3.195   1.611   1.021   0.564   0.459
  HL          350.709    6.980   1.991   1.032   0.669   0.469   0.142
  TM          356.536    7.674   2.143   1.099   0.706   0.501   0.130
$k_{\hat{\theta}} = 0.5$, $h_{\hat{\theta}} = 4.774$
  Mean        367.992    9.498   1.963   0.939   0.577   0.465   0.355
  Median      363.205   13.603   2.766   1.245   0.764   0.403   0.499
  Mid-range   361.144   20.159   3.764   1.623   0.975   0.479   0.446
  HL          367.477   10.194   2.141   0.988   0.626   0.450   0.395
  TM          365.022   11.056   2.278   1.043   0.647   0.421   0.439

TABLE 4 Percentile Run Length Values for the CUSUM Chart Based on
Different Estimators under Uncontaminated Normal Distribution with
$n = 10$, $k_{\hat{\theta}} = 0.5$, and $h_{\hat{\theta}} = 4.774$ at
ARL0 $\cong$ 370

                            $\delta$
Estimator   Percentile   0       0.25   0.5   0.75   1   1.5   2
Mean        P10          44      6      3     2      2   1     1
            P25          111     8      4     3      2   1     1
            P50          255     12     5     3      2   2     1
            P75          504     19     6     4      3   2     1
            P90          833.1   27     8     4      3   2     2
Median      P10          45      7      3     2      2   1     1
            P25          112     10     4     3      2   2     1
            P50          256     16     6     4      3   2     1
            P75          501     25     8     5      3   2     2
            P90          843     37     10    5      4   2     2
Mid-range   P10          46      8      4     3      2   2     1
            P25          113     12     5     3      3   2     1
            P50          255     21     7     4      3   2     2
            P75          504     35     10    5      4   2     2
            P90          828     53     13    7      4   3     2
HL          P10          44      6      3     2      2   1     1
            P25          111     9      4     3      2   1     1
            P50          260     13     5     3      2   2     1
            P75          518     20     7     4      3   2     1
            P90          847.1   29     8     5      3   2     2
TM          P10          44      6      3     2      2   1     1
            P25          109     9      4     3      2   2     1
            P50          258     14     5     3      2   2     1
            P75          517     21     7     4      3   2     2
            P90          846.1   31     9     5      3   2     2

Variance-Contaminated Normal Environment
A $(\varphi)100\%$ variance-contaminated normal distribution is one that
contains $(1 - \varphi)100\%$ observations from $N(\mu_0, \sigma_0^2)$ and
$(\varphi)100\%$ observations from $N(\mu_0, \tau\sigma_0^2)$, where
$0 < \tau < \infty$. The ARL values of the CUSUM charts using different
estimators for the variance-contaminated normal environment are given in
Table 6 with $\varphi = 0.05$ and $\tau = 9$ and in Table 7 with
$\varphi = 0.1$ and $\tau = 9$.

In Tables 6 and 7, the ARL0 values for the TM and HL CUSUM charts are
affected least by the contamination, whereas the mid-range estimator is
affected the most by this variance contamination. Similarly, in terms of
ARL1 values, TM and HL outperform all of the other estimators under
discussion.

Location-Contaminated Normal Environment
A $(\varphi)100\%$ location-contaminated normal distribution is one that
contains $(1 - \varphi)100\%$ observations from $N(\mu_0, \sigma_0^2)$ and
$(\varphi)100\%$ observations from $N(\mu_0 + \omega\sigma_0, \sigma_0^2)$,
where $-\infty < \omega < \infty$. The ARL values of the CUSUM chart using
different estimators for a location-contaminated normal environment are
given in Table 8 with $\varphi = 0.05$ and $\omega = 4$.

Table 8 shows that none of the estimators is able to adequately detect the
location contamination in the process when $n = 5$, because the ARL0 for
all the estimators is substantially lower than that for the uncontaminated
environment. Increasing the subgroup size may be a better option because
the mid-range and median CUSUM charts have a reasonable ARL0 for $n = 10$,
but the ARL1 performance of the mid-range CUSUM chart is far poorer than
that of the median CUSUM.

Special Cause Environment
Asymmetric variance disturbances are created in which each observation is
drawn from N(0,1) and has a $\varphi$ probability of having a multiple of
a $\chi^2_{(1)}$ variable added to it, with a multiplier equal to 4. ARLs
of the CUSUM chart under this environment with $\varphi = 0.01$ and
$\varphi = 0.05$ are given in Tables 9 and 10, respectively.
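The three disturbed normal environments defined in this section can be mimicked with a short simulation. The sketch below is one plausible reading of those definitions, not the authors' simulation code: the function names, argument names, and the use of Python's `random` module are assumptions. The second argument of N(., .) is treated as a variance, so the contaminated standard deviation is sqrt(tau) * sigma0.

```python
import math
import random


def variance_contaminated(n, phi, tau, mu0=0.0, sigma0=1.0, rng=random):
    """n draws: N(mu0, sigma0^2) w.p. 1 - phi, N(mu0, tau * sigma0^2) w.p. phi."""
    out = []
    for _ in range(n):
        contaminated = rng.random() < phi
        sd = sigma0 * math.sqrt(tau) if contaminated else sigma0
        out.append(rng.gauss(mu0, sd))
    return out


def location_contaminated(n, phi, omega, mu0=0.0, sigma0=1.0, rng=random):
    """n draws: N(mu0, sigma0^2) w.p. 1 - phi, N(mu0 + omega*sigma0, sigma0^2) w.p. phi."""
    out = []
    for _ in range(n):
        contaminated = rng.random() < phi
        mean = mu0 + omega * sigma0 if contaminated else mu0
        out.append(rng.gauss(mean, sigma0))
    return out


def special_cause(n, phi, multiplier=4.0, rng=random):
    """n draws from N(0, 1); w.p. phi, multiplier * chi-square(1) is added."""
    out = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        if rng.random() < phi:
            # A chi-square(1) variable is the square of a standard normal.
            x += multiplier * rng.gauss(0.0, 1.0) ** 2
        out.append(x)
    return out


# Hypothetical usage: one subgroup of size 10 from each environment.
rng = random.Random(12345)
sub_var = variance_contaminated(10, phi=0.05, tau=9.0, rng=rng)
sub_loc = location_contaminated(10, phi=0.05, omega=4.0, rng=rng)
sub_sc = special_cause(10, phi=0.05, rng=rng)
```

Passing a seeded `random.Random` instance keeps runs reproducible; a location estimate computed from each simulated subgroup could then be fed to a CUSUM to estimate run lengths.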
In the presence of special causes, the median CUSUM seems more robust,
whereas the mid-range CUSUM is affected the most. TM and HL CUSUM have
good detection ability with a reasonable ARL0. TM and HL estimators are
affected positively by the increase in subgroup size; that is, their ARL0
increases as the value of n increases and decreases as n decreases.

Non-normal Environments
To investigate the effect of using non-normal distributions, we consider
two cases: one changing the kurtosis and the other changing the symmetry
of the distribution. For the case of disturbing the kurtosis, we use
Student's t distribution with 4 degrees of freedom (T4) and the logistic
distribution (Logis(0,1)), and for the disturbance in symmetry we use the
chi-square distribution with 5 degrees of freedom ($\chi^2_{(5)}$).
Tables 11–13 contain the ARL values for the proposed CUSUM charts under
T4, Logis(0,1), and $\chi^2_{(5)}$, respectively, where the ARL0 is kept
fixed at 370.

For the case of Student's t distribution, the TM CUSUM performs the best
among all of the estimators, followed by the HL CUSUM. The median CUSUM
also has reasonable performance compared to the others, but the mid-range
CUSUM seems to have the worst performance in this case (cf. Table 11). For
the logistic distribution, the HL and TM CUSUM outperform the median and
mid-range CUSUM charts, whereas the mean CUSUM reasonably maintains its
performance (cf. Table 12). Similarly, the TM and mean CUSUM charts show
very good performance in the case of the chi-square distribution. HL also
performs well, whereas the mid-range CUSUM has the worst performance (cf.
Table 13).

We provide the ARL curves of the CUSUM charts with different estimators
under the different environments discussed above. Figures 1–5 contain the
ARL curves of the CUSUM charts based on different estimators with
$n = 10$, $k_{\hat{\theta}} = 0.5$, and $h_{\hat{\theta}} = 4.774$. From
Figures 1–5 we see that the ARLs of the mid-range CUSUM are affected the
most in case of some non-normal, contaminated, and special cause
environments. The mean CUSUM is also adversely affected under special
cause environments. The ARL0 values of the median, TM, and HL CUSUM seem
less affected by the change of the parent normal environment.

For a graphical comparison of the proposed charts with non-normal
environments, the ARL curves of all the charts under T4, Logis(0,1), and
$\chi^2_{(5)}$ distributions are given in Figures 6, 7, and 8,
respectively, with $k_{\hat{\theta}} = 0.5$ and ARL0 fixed at 370. These
figures clearly indicate that, in general, TM and HL CUSUM are performing
better than the other estimators under Student's t and logistic
distributions. All of the estimators (except the mid-range) perform
equally well in case of the chi-square parent environment.

Finally, we provide a comparison of our proposed CUSUM charts with the
nonparametric CUSUM mean chart (NPCUSUM) by S. F. Yang and Cheng (2011)
under different environments discussed previously. S. F. Yang and Cheng
(2011) calculated the ARLs of NPCUSUM using the shift parameter p1. For a
valid comparison of NPCUSUM with our proposed charts, we evaluated the ARL
values of NPCUSUM chart

TABLE 6 ARL Values for the CUSUM Chart Based on Different Estimators under
5% Variance-Contaminated Normal Distribution with $k_{\hat{\theta}} = 0.5$
and $h_{\hat{\theta}} = 4.774$

TABLE 8 ARL Values for the CUSUM Chart Based on Different Estimators under
5% Location-Contaminated Normal Distribution with $k_{\hat{\theta}} = 0.5$
and $h_{\hat{\theta}} = 4.774$

TABLE 9 ARL Values for the CUSUM Chart Based on Different Estimators under
Special Cause Normal Distribution with $\varphi = 0.01$,
$k_{\hat{\theta}} = 0.5$, and $h_{\hat{\theta}} = 4.774$

TABLE 11 ARL Values for the CUSUM Chart Based on Different Estimators
under T4 Distribution with $n = 10$ and $k_{\hat{\theta}} = 0.5$ at
ARL0 $\cong$ 370

TABLE 12 ARL Values for the CUSUM Chart Based on Different Estimators
under Standard Logistic Distribution with $n = 10$ and
$k_{\hat{\theta}} = 0.5$ at ARL0 $\cong$ 370

TABLE 13 ARL Values for the CUSUM Chart Based on Different Estimators
under Chi-square Distribution with $n = 10$ and $k_{\hat{\theta}} = 0.5$
at ARL0 $\cong$ 370

FIGURE 2 ARL curves of the median CUSUM with $n = 5$,
$k_{\hat{\theta}} = 0.5$, and $h_{\hat{\theta}} = 4.774$ under different
parent environments.

FIGURE 3 ARL curves of the midrange CUSUM with $n = 5$,
$k_{\hat{\theta}} = 0.5$, and $h_{\hat{\theta}} = 4.774$ under different
parent environments.

FIGURE 5 ARL curves of the TM CUSUM with $n = 5$,
$k_{\hat{\theta}} = 0.5$, and $h_{\hat{\theta}} = 4.774$ under different
parent environments.

FIGURE 6 ARL curves of different CUSUM charts under T4 distribution with
$n = 10$, $k_{\hat{\theta}} = 0.5$ and ARL0 $\cong$ 370.
TABLE 14 ARL Comparison for the NPCUSUM (with $n = 10$, $k = 0.5$, and
$h = 10.65$) and TM CUSUM (with $n = 10$ and $k_{\hat{\theta}} = 0.5$)
Charts under Different Environments with Prefixed ARL0 = 370
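The three non-normal parent environments used in the comparisons (T4, Logis(0,1), and chi-square with 5 degrees of freedom) can be sampled with the Python standard library alone. The sketch below is illustrative only: the function names are hypothetical, and whether the article rescales these variables to a common mean and variance before charting is not stated in this excerpt, so they are left unscaled here.

```python
import math
import random


def student_t4(rng=random):
    # t with 4 df: Z / sqrt(V / 4), where V ~ chi-square(4),
    # and chi-square(4) is Gamma(shape=2, scale=2).
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(2.0, 2.0)
    return z / math.sqrt(v / 4.0)


def standard_logistic(rng=random):
    # Inverse-CDF method for Logis(0, 1): F^{-1}(u) = log(u / (1 - u)).
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return math.log(u / (1.0 - u))


def chi_square5(rng=random):
    # chi-square(5) is Gamma(shape=5/2, scale=2).
    return rng.gammavariate(2.5, 2.0)


# Hypothetical usage: one subgroup of size 10 from each parent distribution.
rng = random.Random(2024)
sub_t4 = [student_t4(rng) for _ in range(10)]
sub_logis = [standard_logistic(rng) for _ in range(10)]
sub_chi = [chi_square5(rng) for _ in range(10)]
```

T4 and the logistic distribution are symmetric with heavier tails than the normal, whereas the chi-square distribution is right-skewed, which matches the kurtosis and symmetry disturbances described in the text.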
FIGURE 10 For 10% variance-contaminated data: (a) patients’ waiting times;
(b) output of mean CUSUM; (c) output of median CUSUM; (d) output of
mid-range CUSUM; (e) output of HL CUSUM; and (f) output of TM CUSUM.

FIGURE 11 For 5% location-contaminated data: (a) patients’ waiting times;
(b) output of mean CUSUM; (c) output of median CUSUM; (d) output of
mid-range CUSUM; (e) output of HL CUSUM; and (f) output of TM CUSUM.
FIGURE 12 For special cause environment with $\varphi = 0.05$: (a)
patients’ waiting times; (b) output of mean CUSUM; (c) output of median
CUSUM; (d) output of mid-range CUSUM; (e) output of HL CUSUM; and (f)
output of TM CUSUM.

ACKNOWLEDGMENT
The author Muhammad Riaz is indebted to King Fahd University of Petroleum
and Minerals, Dhahran, Saudi Arabia, for providing excellent research
facilities through project SB111008.