MTSD-TCC: A Robust Alternative to Tukey's Control Chart (TCC) Based on the Modified Trimmed Standard Deviation (MTSD)

This study proposes MTSD-TCC, a robust alternative to Tukey's control chart (TCC) based on the modified trimmed standard deviation (MTSD). The performance of the proposed chart and the competing TCC is measured using run length properties: the average run length (ARL), the standard deviation of the run length (SDRL), and the median run length (MDRL). Both normal and contaminated cases are covered. The proposed MTSD-TCC is found to be quite efficient at detecting process shifts. The simulation results also show that, under contaminated process setups, the MTSD-TCC offers better detection ability than the TCC for the different trimming levels considered. The MTSD-TCC is therefore recommended for process monitoring. A numerical example using real-life data illustrates the implementation of the proposed chart and supports the simulation results to some extent.


Introduction
Statistical process control (SPC), a collection of valuable techniques for process monitoring, is widely used for checking, measuring, controlling and improving the quality of production in many fields of application, including industry, manufacturing, finance, economics, epidemiology, health care and environmental sciences, among others (Sukparungsee, 2013). A control chart is extensively used to detect assignable causes of variation in process parameters. The Shewhart control chart, one of the most popular tools in SPC for monitoring industrial processes, assumes that the observations from a process are independent and identically distributed from a normal distribution (Sindhumol et al., 2016). A control chart that hinges on a distributional assumption (such as normality) is called a parametric control chart. In practice, it is difficult to meet these stringent assumptions. In such circumstances, non-parametric and robust control charts are employed; they do not rest on the underlying distribution of the variable of interest and can still perform well when the data depart from normality or contain outliers.
Outliers are one or more abnormal values that arise from a random cause. Their occurrence reduces the sensitivity of a control chart because the control limits are widened, which makes the detection of the outliers themselves more difficult. Hackl and Ledolter (1991) stressed that conventional control charts lack robustness to outliers in the sense that a single extreme observation can trigger an out-of-control signal when in fact the process center has not changed. They contended that robust procedures are valuable where outliers or contaminated data exist, such as in situations involving difficult analytical procedures. Alemi (2004) proposed the non-parametric Tukey's control chart (TCC) for monitoring individual observations; it makes no distributional assumptions about the data and can be used with all probability distributions. The TCC, whose control limits are simple to set up, uses the idea of the boxplot, which guarantees robustness to outliers. Furthermore, the TCC does not use the mean or standard deviation to construct the control limits, but uses only quartiles (Mekparyup et al., 2014).
The performance of the TCC under different distributional setups has been investigated, for example, by Borckardt et al. (2005), Borckardt et al. (2006), Torng and Lee (2008), and Sukparungsee (2012) for symmetric and skewed data. Lee (2011) suggested asymmetrical control limits to deal with skewed data, and Sukparungsee (2013) evaluated its robustness. Tercero-Gomez et al. (2012) proposed some modifications of the TCC to improve its performance. Lee and Torng (2015) and Lee et al. (2013) introduced some further adjustments to its design. Khaliq et al. (2014) used different performance measures to compare the TCC with the X/MR charts and found the TCC to be an efficient choice for skewed data.
Only a few studies have attempted to modify the TCC using robust estimators. Mekparyup et al. (2014) modified the inter-quartile range (IQR) to set symmetrical and asymmetrical control limit coefficients for the TCC and concluded that the ATCC is more efficient when the process is in control. Recently, Abu-Shawiesh et al. (2019) proposed, as an alternative procedure, a modification of the TCC based on robust alternatives to the IQR in order to reduce the probability of a type I error. These robust estimators are the median absolute deviation (MAD) and the $S_n$ and $Q_n$ scale estimators; the resulting charts are named MAD-TCC, Sn-TCC and Qn-TCC, and the Qn-TCC was observed to have the best process monitoring performance. The sensitivity of the TCC to outliers is the main concern of this study. It is well known that other scale measures perform better than the IQR because of their robustness properties. Among these robust scale measures, those based on trimming are particularly attractive. The Modified Trimmed Standard Deviation (MTSD), introduced by Sindhumol et al. (2016), is used in this study as an alternative to the IQR to construct a modified Tukey's control chart, named MTSD-TCC in this paper.
Section 2 discusses robust estimators of process variability. The trimmed mean and the modified trimmed standard deviation are reviewed in Section 3. Section 4 describes the TCC design. Section 5 proposes the design of the robust MTSD-TCC. Section 6 presents a simulation study that compares the MTSD-TCC with the TCC to evaluate the performance of the proposed method. An applied numerical example is given in Section 7. Finally, Section 8 summarizes the findings of the paper.

The Robust Estimators of Process Variability
When the underlying normality assumption is violated, robust methods offer a marked improvement over classical ones. These methods can achieve more accurate results, often with superior statistical power and improved sensitivity, and remain efficient when the normality assumption holds (Abu-Shawiesh, 2008). A robust estimator is one that is resistant to departures from normality and to the presence of outliers.
A number of robust estimators based on the median, which is highly resistant to outliers, are available in the literature. They include (i) the median absolute deviation from the sample median (MAD) and (ii) the two robust measures of scale, $S_n$ and $Q_n$, proposed by Rousseeuw and Croux (1993). These robust estimators have been shown to perform well for a wide range of probability distributions. In this paper, we study one robust scale estimator, the Modified Trimmed Standard Deviation (MTSD) proposed by Sindhumol et al. (2016). This robust estimator of process variability is used to construct the proposed robust control chart (MTSD-TCC) for better detection of changes in the process.

The Modified Trimmed Standard Deviation (MTSD)
It is a well-known fact that the tails of a distribution can dominate the value of the sample mean and the variation about the mean. To circumvent this, one can simply discard observations in the tails of the distribution, in which case the trimmed mean (Tukey, 1948) and its standard error are natural alternatives because of their computational simplicity (Dixon and Yuen, 1974). Moreover, these measures are less affected by departures from normality than the usual mean and standard deviation, because observations in the tails are removed. However, trimming reduces the dispersion, so an estimate of the population dispersion based on the standard error of the trimmed mean does not give a clear picture of the actual dispersion. As an alternative, a standard estimator of the variance of the trimmed mean is obtained through Winsorization (Wilcox, 2012). Caperaa and Rivest (1995) obtained an exact formula for the variance of the trimmed mean as a function of order statistics when the trimming percentage is small. However, this variance and its modification based on Winsorization are not very helpful and have had limited exposure in applications in the literature. Sindhumol et al. (2016) improved the variance of the trimmed mean by multiplying it by a tuning constant that reduces the loss due to trimming without much spoiling its robust qualities. Let X be the quality variable of interest and let $X_1, X_2, \ldots, X_n$ be a random sample of size n taken from the population. Let $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ denote the order statistics of a sample of size n from a population having a symmetric distribution. The r-times symmetrically trimmed sample is obtained by dropping both the r lowest and the r highest values.
Here $r = [\alpha n]$ is the greatest integer less than or equal to $\alpha n$, where $\alpha$ is the proportion of n trimmed at each end. The trimmed mean is then calculated as
$$\bar{X}_T = \frac{1}{n - 2r} \sum_{i=r+1}^{n-r} X_{(i)} \tag{1}$$
and the sample standard deviation of the observations about the trimmed mean, denoted by $S_T$, is calculated as
$$S_T = \sqrt{\frac{1}{n - 2r - 1} \sum_{i=r+1}^{n-r} \left( X_{(i)} - \bar{X}_T \right)^2} \tag{2}$$
The modified estimator of the population standard deviation (σ) based on the trimmed standard deviation, named the Modified Trimmed Standard Deviation (MTSD), as given by Sindhumol et al. (2016) is
$$MTSD = 1.4826 \, S_T \tag{3}$$
The constant multiplier 1.4826 is used to cover the loss of information due to trimming; as the percentage of trimming increases, the constant controls the loss due to trimming. According to Sindhumol et al. (2016), the key properties of the modified trimmed standard deviation are as follows:
(i) The modified estimate of dispersion using the trimmed mean and the tuning constant is relevant and meaningful. It is noteworthy that this is the same constant introduced by Hampel (1974) for the median absolute deviation from the sample median (MAD).
(ii) The loss of information due to trimming is covered by multiplying the standard deviation of the trimmed data by a tuning constant. This constant is selected in such a way that it compensates for the loss due to the maximum allowable trimming.
(iii) To fix this tuning constant, a simulation study was conducted that validated the effect of the multiplier 1.4826 on the modified trimmed standard deviation. The constants 0.7803041 (10% trimming), 1.1881829 (20% trimming), 1.4826022 (25% trimming) and 1.9069394 (30% trimming) were used as multipliers to compare its performance with classical non-robust estimators.
(iv) The modified estimate of dispersion is equivalent to the 25% trimmed mean for symmetric trimming.
(v) The relative efficiency (RE) of the estimator, based on the variance of the trimmed mean, is given by (4); its value is also influenced by the percentage of trimming while estimating σ. The efficiency also improves if the sample size (n) and the trimming percentage are increased.
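The trimming and rescaling steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of Sindhumol et al. (2016): the function name `mtsd`, the convention that `alpha` is the proportion trimmed from each tail, and the divisor (number of retained observations minus one) are assumptions of this sketch.

```python
import math


def mtsd(sample, alpha, multiplier=1.4826):
    """Return (trimmed mean, trimmed SD, MTSD) under symmetric trimming.

    alpha is assumed to be the proportion trimmed from EACH tail, so
    r = floor(alpha * n) order statistics are dropped at both ends.
    """
    ordered = sorted(sample)
    n = len(ordered)
    r = math.floor(alpha * n)
    kept = ordered[r:n - r]          # the n - 2r central order statistics
    m = len(kept)
    if m < 2:
        raise ValueError("trimming leaves fewer than two observations")
    trimmed_mean = sum(kept) / m
    # Sample SD of the retained values about the trimmed mean; the
    # divisor m - 1 is an assumption of this sketch.
    ss = sum((v - trimmed_mean) ** 2 for v in kept)
    trimmed_sd = math.sqrt(ss / (m - 1))
    return trimmed_mean, trimmed_sd, multiplier * trimmed_sd


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]   # one gross outlier
t_mean, t_sd, robust_sd = mtsd(data, alpha=0.2)
print(t_mean, t_sd, robust_sd)
```

In this toy sample the value 100 falls in the trimmed upper tail, so it affects neither the trimmed mean nor the MTSD, whereas it would inflate the ordinary sample standard deviation considerably.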

The Tukey's Control Chart (TCC)
The Tukey's control chart (TCC) is simple and easy to use. It has an effective charting structure that is robust for skewed distributions. The TCC was first proposed by Alemi (2004), who applied the principles of the boxplot. The lower control limit (LCL) and the upper control limit (UCL) for this control chart are constructed as follows:
$$LCL = Q_1 - k \cdot IQR, \qquad UCL = Q_3 + k \cdot IQR \tag{5}$$
where $Q_1 = F^{-1}(0.25)$ is the first quartile, $Q_3 = F^{-1}(0.75)$ is the third quartile and $IQR = Q_3 - Q_1$ is the interquartile range. The constant $k$ determines the width of the control limits depending on the pre-specified false alarm rate. Under the normal distribution with population mean μ and population standard deviation σ, the parameter is usually set as $k = 1.5$ (Ryan, 2000).
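As a rough illustration of how the TCC limits could be computed from a sample, the sketch below uses Python's standard library. The function name `tukey_limits` is hypothetical, and the quartile interpolation rule (`method="inclusive"`) is an assumption, since the text does not say which sample quartile convention is used.

```python
import statistics


def tukey_limits(sample, k=1.5):
    """Return (LCL, UCL) = (Q1 - k*IQR, Q3 + k*IQR) from sample quartiles."""
    # The "inclusive" interpolation rule is an assumption; other quartile
    # conventions give slightly different limits in small samples.
    q1, _median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lcl, ucl = tukey_limits([1, 2, 3, 4, 5, 6, 7, 8, 9])
# New individual observations are flagged when they fall outside the limits.
flagged = [x for x in [0, 5, 14] if x < lcl or x > ucl]
print(lcl, ucl, flagged)
```

Because only the quartiles enter the limits, a single extreme value in the reference sample moves them far less than it would move limits built from the mean and standard deviation.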

The Proposed Robust Tukey's Control Chart (MTSD-TCC)
In this section, we extend the idea of the TCC by introducing the design of the proposed robust Tukey's control chart (MTSD-TCC), which is based on the modified trimmed standard deviation (MTSD), as an alternative to the TCC. The proposed chart is computationally simple and easy to use, and is therefore an analytically more desirable method. The control limits for the proposed MTSD-TCC method are obtained as follows:
$$LCL = Q_1 - k^{*} \cdot MTSD, \qquad UCL = Q_3 + k^{*} \cdot MTSD$$
Under the normal distribution with population mean μ and population standard deviation σ, the IQR equals 1.34898σ and the MTSD equals 1.4826σ, so that the MTSD is about 1.1 times the IQR (1.4826/1.34898 ≈ 1.099); the parameter $k^{*}$ is then set so that the control limits equal those of the TCC.
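A minimal sketch of the MTSD-TCC limits is given below. It assumes that the limits take the same form as the TCC limits with the IQR replaced by the MTSD, that `alpha` is the proportion trimmed from each tail, and that the chart multiplier (called `k_star` here, a hypothetical name) is supplied by the user, because its value is not recoverable from the text. The value `k_star=1.0` in the usage lines is only a placeholder.

```python
import math
import statistics


def mtsd(sample, alpha, multiplier=1.4826):
    """Modified trimmed SD: multiplier times the SD of the trimmed sample."""
    ordered = sorted(sample)
    n = len(ordered)
    r = math.floor(alpha * n)      # alpha = proportion cut from each tail (assumed)
    kept = ordered[r:n - r]
    if len(kept) < 2:
        raise ValueError("trimming leaves fewer than two observations")
    centre = sum(kept) / len(kept)
    ss = sum((v - centre) ** 2 for v in kept)
    return multiplier * math.sqrt(ss / (len(kept) - 1))


def mtsd_tcc_limits(sample, alpha, k_star):
    """Return (LCL, UCL) = (Q1 - k_star*MTSD, Q3 + k_star*MTSD)."""
    # Quartile convention ("inclusive") is an assumption of this sketch.
    q1, _median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    spread = mtsd(sample, alpha)
    return q1 - k_star * spread, q3 + k_star * spread


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for alpha in (0.05, 0.10, 0.20, 0.30, 0.40):
    # k_star=1.0 is a placeholder, not a value taken from the paper.
    print(alpha, mtsd_tcc_limits(sample, alpha, k_star=1.0))
```

With a fixed multiplier of 1.4826, heavier trimming gives a smaller MTSD and therefore narrower limits in this sketch, which is worth keeping in mind when comparing signal counts across trimming levels.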

Performance Measures
In this section, we describe the measures used to analyze the performance of the proposed robust method (MTSD-TCC) and the competing TCC. A variety of performance measures may be used to evaluate the ability of control charts. We have used several measures based on the run length (RL) for different amounts of shift (δ) in a process. These measures fall into two classes: measures for specific shifts and measures of overall performance over a range of shifts in the process parameters. For our study, we have considered mean shifts in terms of $\delta\sigma_0$, i.e., the shifted mean $\mu_1$ is defined by $\mu_1 = \mu_0 + \delta\sigma_0$. These measures include the average run length (ARL), the median run length (MDRL), the standard deviation of the run length (SDRL) and the extra quadratic loss (EQL). The ARL, MDRL and SDRL are individual measures that assess the performance of control charts for particular shifts, whereas the EQL is an overall measure that evaluates performance over a range of shifts. A brief description of these measures is given below. For further details, see Ahmad et al. (2014a, b), Abbasi et al. (2015), Riaz et al. (2017), Abbas et al. (2017), Ahmad et al. (2018), Hussain et al. (2018), and the references therein.

Run Length Properties
A series of points plotted on a chart until an out-of-control signal is indicated is called a run. The number of points in a run is termed the run length (RL). The in-control RL relates to the stable state of a process, while the out-of-control RL relates to the shifted state of the process. We expect the in-control RL to be large and the out-of-control RL to be small (depending on the amount of shift). Some properties based on the RL are described below.

Average Run Length (ARL)
The average run length (ARL) is the most popular performance measure for control charts; it is defined as the average number of points plotted before the first out-of-control signal is received. $ARL_0$ and $ARL_1$ refer to the in-control and out-of-control process scenarios, respectively. A control chart is considered more effective than others if it offers a smaller $ARL_1$ at a fixed $ARL_0$. Let X be a quality characteristic of interest; the ARL is then the expected value of the run length, $ARL = E(RL)$. The ARL may not suffice in all cases because the RL distribution is generally skewed, which necessitates the use of other properties such as the median RL (MDRL).
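For intuition, the ARL can be computed in closed form in a simplified setting. The sketch below assumes independent observations, a standard normal in-control process and fixed (known-parameter) Tukey limits with k = 1.5; under these assumptions the run length is geometric with signal probability p, so ARL = 1/p. This is not the simulation design of the study, where limits are estimated from samples, and the function names are hypothetical.

```python
from statistics import NormalDist


def signal_probability(lcl, ucl, delta=0.0):
    """P(X < lcl or X > ucl) for X ~ N(delta, 1), a mean shift of delta sigma."""
    shifted = NormalDist(mu=delta, sigma=1.0)
    return shifted.cdf(lcl) + (1.0 - shifted.cdf(ucl))


def average_run_length(lcl, ucl, delta=0.0):
    """ARL = 1/p when the run length is geometric with signal probability p."""
    return 1.0 / signal_probability(lcl, ucl, delta)


# Known-parameter Tukey limits for a standard normal process with k = 1.5.
std = NormalDist()
q1, q3 = std.inv_cdf(0.25), std.inv_cdf(0.75)
lcl, ucl = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)

for delta in (0.0, 0.5, 1.0, 2.0):
    print(delta, round(average_run_length(lcl, ucl, delta), 2))
```

The value at delta = 0 plays the role of the in-control ARL, and the values at positive delta show how quickly the chart is expected to signal as the mean shift grows.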

Median Run Length (MDRL)
The RL distribution is generally right-skewed, so the ARL may not represent the performance of a control chart well. The median, being robust to outliers, is a more reliable performance indicator. The MDRL is the mid-point of the run length distribution (i.e., the point below which 50% of the distribution lies): $MDRL = \mathrm{median}(RL)$.

Standard Deviation Run Length (SDRL)
The ARL and MDRL provide information on the central tendency of the RL. The variation in RL values is captured by the standard deviation of the RL. The standard deviation run length (SDRL) is therefore another useful measure, used to assess the spread of the run length distribution: $SDRL = \sqrt{E(RL^2) - [E(RL)]^2}$.
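The three run length summaries can be estimated by Monte Carlo simulation. The sketch below is a simplified illustration under stated assumptions: individual observations are drawn from N(delta, 1), the limits are the fixed known-parameter Tukey limits with k = 1.5, and the number of runs, the seed and the safety cap are arbitrary choices rather than settings taken from the study.

```python
import random
import statistics
from statistics import NormalDist


def simulate_run_lengths(lcl, ucl, delta=0.0, runs=2000, seed=1, cap=100_000):
    """Simulate run lengths for X ~ N(delta, 1) against fixed limits."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(runs):
        t = 0
        while t < cap:                 # cap guards against endless runs
            t += 1
            x = rng.gauss(delta, 1.0)
            if x < lcl or x > ucl:     # first out-of-control signal
                break
        lengths.append(t)
    return lengths


def run_length_summary(lengths):
    """Return (ARL, MDRL, SDRL) as the mean, median and SD of the run lengths."""
    return (statistics.mean(lengths),
            statistics.median(lengths),
            statistics.stdev(lengths))


std = NormalDist()
q1, q3 = std.inv_cdf(0.25), std.inv_cdf(0.75)
lcl, ucl = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)

for delta in (0.0, 1.0):
    arl, mdrl, sdrl = run_length_summary(simulate_run_lengths(lcl, ucl, delta))
    print(delta, round(arl, 1), mdrl, round(sdrl, 1))
```

In runs like this the median run length is typically well below the mean, which reflects the right skew of the run length distribution noted in the text.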

Extra Quadratic Loss (EQL)
A commonly used overall performance measure is the extra quadratic loss (EQL), which is used to judge the overall performance of a control chart. The EQL is the weighted average of the ARL over a range of shifts ($\delta_{min}$ to $\delta_{max}$), with the square of the shift ($\delta^2$) as the weight. A control chart with a smaller EQL value is considered a better choice. The EQL is defined as follows:
$$EQL = \frac{1}{\delta_{max} - \delta_{min}} \int_{\delta_{min}}^{\delta_{max}} \delta^{2} \, ARL(\delta) \, d\delta$$
where $\delta_{min}$ and $\delta_{max}$ are the minimum and maximum values of δ.
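The EQL integral can be approximated numerically once an ARL function is available. The sketch below uses the trapezoidal rule; the ARL function is the closed-form value for independent N(delta, 1) observations against fixed Tukey limits, and the shift range 0 to 3 and the number of steps are assumptions made for illustration, not values from the study.

```python
from statistics import NormalDist


def arl_normal(delta, lcl, ucl):
    """ARL for independent X ~ N(delta, 1) against fixed limits."""
    shifted = NormalDist(mu=delta, sigma=1.0)
    p = shifted.cdf(lcl) + (1.0 - shifted.cdf(ucl))
    return 1.0 / p


def eql(arl_fn, delta_min, delta_max, steps=1000):
    """Trapezoidal approximation of
    (1 / (delta_max - delta_min)) * integral of delta^2 * ARL(delta)."""
    h = (delta_max - delta_min) / steps
    total = 0.0
    for i in range(steps + 1):
        d = delta_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # end points count half
        total += weight * d * d * arl_fn(d)
    return total * h / (delta_max - delta_min)


std = NormalDist()
q1, q3 = std.inv_cdf(0.25), std.inv_cdf(0.75)
lcl, ucl = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
print(eql(lambda d: arl_normal(d, lcl, ucl), 0.0, 3.0))
```

Because the weight is the squared shift, the in-control ARL at delta = 0 contributes nothing, and large shifts that are detected slowly are penalized most heavily.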

Assessment Criterion
Using the above-mentioned performance measures, we may declare a control chart superior to its competitors according to the following criterion: at a fixed in-control ARL, a control chart is considered better than the other control charts if it offers smaller values of performance measures such as the ARL, MDRL, SDRL and EQL.

Performance Evaluations and Comparisons
In this section, we assess the performance of the control charts considered in this study and offer a comparative analysis of our proposed MTSD-TCC control chart and the classical TCC control chart. Table 1 presents the ARL results of these control charts for individual observations, Table 2 presents the SDRL results, and Table 3 lists the MDRL values. Tables 4 and 5 present the ARL profiles of TCC and MTSD-TCC, respectively, for various sample sizes, along with the theoretical results. Based on the results given in Tables 1-5, the following subsections provide the findings related to the two designs, the classical TCC and the proposed MTSD-TCC.

Run Length Performance of MTSD-TCC Versus TCC
This section compares the performance of MTSD-TCC and TCC using the ARL, SDRL and MDRL results given in Tables 1-3. General comments: MTSD-TCC has better run length properties (ARL, MDRL, SDRL, EQL) than TCC, under both in-control and out-of-control situations, for all trimming levels (5%, 10%, 20%, 30% and 40%), especially for δ ≤ 1. For δ > 1, both designs offer quite similar performance, as may be seen in Tables 1-3.

Effects of Sample Size on ARL Profile of MTSD-TCC Versus TCC
Tables 4 and 5 report the $ARL_0$ and $ARL_1$ values of MTSD-TCC and TCC for sample sizes n = 30, 100 and 1000, together with the theoretical values, at a pre-specified $ARL_0 = 148$. The two control charts compare as follows:
At n = 30, the MTSD-TCC showed better $ARL_1$ values than the TCC at α = 5% and α = 20% for δ < 1.50. For α = 10%, both appeared equally efficient except at δ = 0.25. The overall performance of MTSD-TCC turned out to be better than that of TCC at n = 30.
At n = 100, the MTSD-TCC is as efficient as the TCC except at δ = 0.25, where the TCC appeared more efficient; both are equally efficient for δ > 0.25.
At n = 1000, the MTSD-TCC is as efficient as the TCC except at δ = 0.25, where the TCC was more efficient; both are equally efficient for δ > 0.25.
For the theoretical case, the MTSD-TCC is as efficient as the TCC except at δ = 0.25 for α = 5% and α = 10%, where the TCC appeared more efficient; both designs are equally efficient for δ > 0.25. The MTSD-TCC is as efficient as the TCC at α = 20%.
Overall, both designs are equally efficient for all sample sizes, but in a few cases the TCC offers a smaller $ARL_1$ (such as at δ = 0.25), while in some cases the MTSD-TCC gave a better $ARL_1$ for small sample sizes. The MTSD-TCC thus appeared as efficient as the TCC in most cases.

Contamination Effect on MTSD-TCC and TCC
This section deals with the performance of TCC and MTSD-TCC at several trimming levels (5%, 10%, 20%, 30% and 40%) when the data are contaminated. For this purpose, we generated 100 random observations from a contaminated environment, in which 80 observations come from the standard normal distribution and 20 from the standard exponential distribution. The resulting dataset is given in Table 6.
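A contaminated sample of this kind can be generated as sketched below. The function name, the seed and the shuffling of the two parts are assumptions of this illustration; the sketch will not reproduce the specific values in Table 6.

```python
import random


def contaminated_sample(n_normal=80, n_exponential=20, seed=2024):
    """Mix standard normal draws with standard exponential draws."""
    rng = random.Random(seed)
    normal_part = [rng.gauss(0.0, 1.0) for _ in range(n_normal)]
    # expovariate(1.0) draws from the exponential distribution with rate 1.
    exponential_part = [rng.expovariate(1.0) for _ in range(n_exponential)]
    data = normal_part + exponential_part
    rng.shuffle(data)   # mixing order is an assumption; the text does not specify it
    return data


data = contaminated_sample()
print(len(data), min(data), max(data))
```

The exponential component is non-negative with mean 1, so it adds right skew and occasional large positive values relative to the standard normal component.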
We have created several plots for this dataset: a data plot, a histogram, and a boxplot (cf. Figure 1). A data plot is a graphical representation of a data set; such graphs are useful for quickly gaining an understanding that may not come from lists of values. A histogram displays data using bars of different heights, where each bar groups values into a range and taller bars show that more data fall in that range; it thus shows the shape and spread of continuous sample data. A boxplot represents statistical data with a rectangle whose edges mark the first and third quartiles, usually with a line inside to indicate the median. Lines (whiskers) extend from either side of the rectangle towards the smallest and largest observations that are not flagged as outliers. This type of graph is used to display patterns in quantitative data. If the data set includes one or more outliers, they are plotted separately as points. Figure 1(c) shows that the data set contains many outliers, as indicated by the boxplot.
We have constructed several control charts: the TCC and the MTSD-TCC with 5%, 10%, 20%, 30% and 40% trimming levels (cf. Figure 2). The detection profile of all these control charts is summarized in Table 7. A control chart is a graphical statistical process control (SPC) tool used to determine whether a manufacturing or business process is in a state of control. It is mostly designed to monitor process parameters when the underlying form of the process distribution is known. The results show that the proposed MTSD-TCC offers better detection ability than the classical TCC for the different trimming levels under contaminated process setups.

An Application Using Real Data
We now illustrate the performance of our proposed MTSD-TCC control chart and the classical TCC control chart with an applied example. Rezaie et al. (2006) presented a data set on the weight (in grams) of the rubber edge, a key component that affects the sound quality of a drive unit. Initial descriptive analyses suggested that the data are normal and contain no outliers. The summary statistics for the data are given in Table 8. The control limits and the number of observations outside the control limits for both the TCC and the proposed MTSD-TCC are given in Table 9. Figure 3 shows the sampled data and the control limits for both the TCC and the proposed MTSD-TCC. Comparing the control limits in Table 9 and Figure 3, the TCC has 0 observations outside its control limits. The MTSD-TCC (5%) also has 0 observations outside its control limits, but the points are much closer to the limits than for the TCC. The MTSD-TCC has 3 observations outside the control limits at 10% trimming, 7 at 20%, 14 at 30% and 26 at 40%. Moreover, the limits of the MTSD-TCC are narrower than those of the classical TCC. It is evident from the results that the MTSD-TCC offers superior detection ability for different trimming levels as compared to the TCC. Hence the results support the simulation study.

Concluding Remarks and Recommendations
In this study, a robust control chart based on the modified trimmed standard deviation (MTSD), namely MTSD-TCC, is proposed as an alternative to Tukey's control chart (TCC). A simulation study has been conducted to compare the performance of the proposed MTSD-TCC with the TCC based on run length properties (ARL, MDRL, SDRL, EQL). The study revealed that the proposed MTSD-TCC is quite efficient compared to the TCC because it has better run length properties (under both in-control and out-of-control situations) for all trimming levels. An application example using real-life data is provided to illustrate the implementation of the proposed MTSD-TCC, and it also supported the results of the simulation. The scope of this study may be extended to other control charts for efficient monitoring of location and dispersion parameters. This study covered normal and contaminated cases; the non-normal case is a worthwhile extension that can be covered in a future study. Finally, the proposed MTSD-TCC is easy to compute, not computationally intensive, and promising; it can therefore be recommended to practitioners.