Goodness-of-Fit Test Based on Arbitrarily Selected Order Statistics

Checking whether the population distribution from which a random sample is drawn is a specified distribution is a popular topic in statistical analysis. Such a problem is usually called a goodness-of-fit test. Numerous research papers have been published in this area. The purpose of this short paper is to provide a goodness-of-fit test statistic that works for many kinds of censored data formed by order statistics. This is an extension of the work presented in Chen and Ye (2009). The method can be used for censored samples that are commonly used in reliability analysis, including left censored data, right censored data and doubly censored data.


Introduction
The goodness-of-fit test has a long history. The primary goal of a goodness-of-fit test is to check how well a specific statistical model fits a given data set. The $\chi^2$ test (Pearson (1900)), the Kolmogorov-Smirnov test (Kolmogorov (1933) and Smirnov (1939)), the Cramer-von Mises test (Cramer (1928)), and the Anderson-Darling test (Anderson and Darling (1952)) were presented in early papers and are still widely used by statistics practitioners. All of these tests are implemented in most statistical software. The Shapiro and Wilk test (Shapiro and Wilk (1965)) is another commonly used test, which works specifically for the normal distribution and the lognormal distribution. In recent years, many research papers have been published in the area of goodness-of-fit testing, and the power of goodness-of-fit tests has been compared by many authors. See, for example, Choulakian, Lockhart and Stephens (1994), and Steele and Chaseling (2006). Chen and Ye (2009) proposed a test statistic for checking whether the population distribution from which a random sample is drawn is a uniform distribution. That paper shows that the proposed test has higher power than some of the existing tests in some cases, especially when the alternative distributions are V-shaped. In this short paper, the method used in Chen and Ye (2009) is extended to censored samples. The new test statistic can be used when only some of the order statistics are available.

Uniformity Test Based on Order Statistics
The purpose of the uniformity test is to check whether or not the population distribution, from which a random sample was drawn, is the uniform distribution on the interval $[0, 1]$.
In the test statistic $G(X_1, X_2, \ldots, X_n)$ of Chen and Ye (2009), $X_{(0)}$ is defined as 0 and $X_{(n+1)}$ is defined as 1. Chen and Ye (2009) discussed the properties of this test statistic; the expected value, the variance and the shape of the distribution of the test statistic are described in that paper. When the population distribution is the same as the specified distribution, the value of $G(X_1, X_2, \ldots, X_n)$ should be close to 0. On the other hand, when the population distribution is far from the specified distribution, the value of $G(X_1, X_2, \ldots, X_n)$ should be close to 1. The quantiles of the test statistic were obtained by Monte Carlo simulation and tabulated. For the convenience of users, the quantiles for different sample sizes and different significance levels are listed in Table 1. The quantiles can be used to conduct the goodness-of-fit test discussed in this paper: the hypothesis of uniformity should be rejected at significance level $\alpha$ if the value of $G(X_1, X_2, \ldots, X_n)$ exceeds the corresponding quantile.
This paper focuses on the case in which the sample is incomplete. In practice, the available samples may be censored. For example, in reliability analysis, the statistical analysis is usually based on left censored samples, right censored samples, or doubly censored samples. To meet the needs of such applications, and of even more general situations, it is assumed in this paper that only some of the order statistics are available for testing the hypotheses mentioned above. More specifically, it is assumed that only $k$ order statistics $X_{(i_1)}, X_{(i_2)}, \ldots, X_{(i_k)}$ are available, for any $1 \le i_1 < i_2 < \cdots < i_k \le n$. The corresponding test statistic (2) is [...], where $i_0$ is defined as 0 and $i_{k+1}$ is defined as $n + 1$.
The test statistic defined in (2) possesses some properties. It can be seen that [...]. This is because [...]. It can also be seen that [...]. For the complete sample case, the critical values computed in Chen and Ye (2009) can be adopted to conduct the statistical test discussed in this section.
For the general case, in which only some of the order statistics are available, a simple computer program is needed to run a Monte Carlo simulation and find the critical values of the test statistic.
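A simulation of this kind might look like the following Python sketch. It is illustrative only. The function names are hypothetical, and `g_complete` assumes that the complete-sample statistic has the form $G = \frac{n+1}{n}\sum_{i=1}^{n+1}\left(X_{(i)} - X_{(i-1)} - \frac{1}{n+1}\right)^2$, which is the form usually attributed to Chen and Ye (2009) and which is not reproduced in the text here. The simulation routine itself does not depend on that assumption: any statistic that takes an ordered sample from the uniform distribution can be passed in.

```python
import random
from typing import Callable, List, Sequence


def g_complete(sample: Sequence[float]) -> float:
    """Assumed complete-sample statistic on [0, 1] (form attributed to Chen and Ye (2009))."""
    n = len(sample)
    # Pad the order statistics with X_(0) = 0 and X_(n+1) = 1.
    padded: List[float] = [0.0, *sorted(sample), 1.0]
    expected_spacing = 1.0 / (n + 1)
    total = sum(
        (padded[i] - padded[i - 1] - expected_spacing) ** 2
        for i in range(1, n + 2)
    )
    return (n + 1) / n * total


def simulate_critical_value(
    n: int,
    statistic: Callable[[Sequence[float]], float],
    alpha: float = 0.05,
    replications: int = 10_000,
    seed: int = 0,
) -> float:
    """Estimate the upper-alpha critical value of `statistic` under U(0, 1)."""
    rng = random.Random(seed)
    # Each replication draws n uniforms, orders them, and evaluates the statistic.
    values = sorted(
        statistic(sorted(rng.random() for _ in range(n)))
        for _ in range(replications)
    )
    # Empirical (1 - alpha) quantile of the simulated null distribution.
    position = min(replications - 1, int((1.0 - alpha) * replications))
    return values[position]


if __name__ == "__main__":
    critical = simulate_critical_value(n=10, statistic=g_complete, alpha=0.05)
    print(f"simulated critical value: {critical:.4f}")
```

For a censored sample, `statistic` would be replaced by a function that first picks the available order statistics $X_{(i_1)}, \ldots, X_{(i_k)}$ out of the simulated ordered sample and then evaluates the statistic in (2) on them. The simulated critical value depends on the number of replications and on the seed, so a larger number of replications is advisable when the values are to be tabulated.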

Test for General Distributions
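Now suppose $X_1, X_2, \ldots, X_n$ is a random sample, and the hypothesis to be tested is that the population distribution is a specified distribution. The test statistic for this case is defined in (3) [...]. Here $X_{(0)}$ is defined as $-\infty$, and $X_{(n+1)}$ is defined as $+\infty$. The critical values of the test statistic can be obtained using Monte Carlo simulation for any combination of $i_1, i_2, \ldots, i_k$ and $\alpha$.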
For example, suppose the test mentioned above is used to check whether the population distribution of a data set is a three-parameter Weibull distribution with location parameter $\mu_0$, shape parameter $\beta_0$ and scale parameter $\eta_0$. The cumulative distribution function of the three-parameter Weibull distribution is
$$F(x) = 1 - \exp\left[-\left(\frac{x - \mu_0}{\eta_0}\right)^{\beta_0}\right], \quad x > \mu_0.$$
Then the formula to calculate the value of the test statistic defined in (3) is [...]
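As an illustration of the Weibull case, the following Python sketch maps the available order statistics through the hypothesized three-parameter Weibull cumulative distribution function, so that a uniformity statistic can then be applied to the transformed values. The function names, the data and the parameter values are hypothetical, and the sketch stops before formula (3), which is not reproduced here.

```python
import math
from typing import List, Sequence


def weibull3_cdf(x: float, mu0: float, beta0: float, eta0: float) -> float:
    """CDF of the three-parameter Weibull distribution.

    mu0 is the location, beta0 the shape and eta0 the scale parameter.
    """
    if x <= mu0:
        return 0.0
    return 1.0 - math.exp(-(((x - mu0) / eta0) ** beta0))


def transform_order_statistics(
    order_stats: Sequence[float], mu0: float, beta0: float, eta0: float
) -> List[float]:
    """Map the available order statistics to [0, 1] under the hypothesized model."""
    return [weibull3_cdf(x, mu0, beta0, eta0) for x in order_stats]


# Hypothetical doubly censored sample: only four order statistics were observed.
observed = [12.1, 13.4, 15.0, 17.8]
u = transform_order_statistics(observed, mu0=10.0, beta0=1.5, eta0=5.0)
print(u)
```

Because the cumulative distribution function is nondecreasing, the transformed values keep the order of the original observations. Under the hypothesized distribution they behave like the corresponding order statistics of a uniform sample on $[0, 1]$, which is why critical values simulated from the uniform distribution can be reused.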