Comparison of Parameter Estimators for Generalized Pareto Distribution under Peak over Threshold

This study focuses on the Generalized Pareto Distribution (GPD) under the Peak Over Threshold (POT) approach. Twenty-one estimation methods for extreme value modeling were considered and their performances compared. Our goal is to identify the best method under various conditions by means of a systematic simulation study. Estimators that were not originally developed under the POT framework (NON-POT) were compared alongside those developed under it. The simulation results under varying shape parameters showed the Zhang estimator as "best" in performance for NON-POT in estimating both the shape and scale parameters for heavy-tailed cases. In the POT framework, the Zhang estimator again performed "best" in estimating the shape for very heavy tails and the scale for very short tails, regardless of the value of the scale parameter. When varying sample size, the Zhang estimator performed "best" under the NON-POT framework, while under the POT framework the Pickands estimator was "best" at estimating the shape parameter for large sample sizes and the Zhang estimator for small sample sizes.


Introduction
Parameter estimation is a fundamental technique in statistical model building. A wrong choice of estimation procedure can seriously affect the inferences drawn in a study, which makes researchers cautious about estimation. The three-parameter Generalized Pareto Distribution (GPD), with location (μ), scale (σ), and shape (ξ), is also prone to misspecification. However, the GPD is sometimes specified by only σ and ξ instead of all three parameters [1]. In the last four decades, a wide range of methods have been proposed to estimate these two parameters [2,3]. The most commonly used estimators are the Maximum Likelihood Method (MLE), the Probability Weighted Moments method (PWM), and the Method of Moments (MOM) [4]. However, these estimators all have their pitfalls.
De Zea Bermudez [4] posits that the PWM method performs well for shape parameters in 0 ≤ ξ ≤ 1 and performs very well when ξ ≤ 0.5. Although the method performs well on the interval [0, 1], its asymptotic properties cannot be derived when ξ > 0.5. Castillo [5] also showed that PWM performs well in small samples for ξ ≤ 0.5, and research by Hosking [6] indicated that the method of moments (MOM) performs poorly when the shape parameter is greater than 1, because the mean and variance do not exist for such values.
In a review of GPD estimators by Deidda [7], the MLE, which is based on maximization of the likelihood function, improves asymptotically and performs best when the sample size is large. Macklay [8] and Giles [9] further showed that the MLE, in comparison to MOM and PWM, leads to biased estimates when the sample size is small. In addition, Dupuis [10] and Ashkar [11] discussed that these estimators in some cases produce estimates that are inconsistent with the observed data. That is, there are samples for which ξ̂ is negative and Z_(n) > −σ̂/ξ̂ (where Z_(n) is the sample maximum, σ̂ is the estimated scale parameter and ξ̂ is the estimated shape parameter), violating the definition of the Generalized Pareto Distribution. They showed that when ξ = −0.5 about 30% of samples result in non-feasible parameter estimates, but the occurrence decreases with increasing n. To fix these problems, Zhang [12] and del Castillo [13] suggested likelihood-based methods of estimation that are computationally easy. Zhang [14] and Zhang [15] based their estimators on likelihood moment and Bayesian methodology. Additionally, a nonlinear least squares (NLS) approach was suggested by Song [16] to estimate both σ and ξ, which the authors claimed performs better than other estimators. However, their comparison was unfair because all the other approaches used in their work were based on the POT framework, whereas the NLS was not, although they claimed to use the POT framework [17]. Park [17] then modified this approach by providing weights for the NLS.
Hence, various new estimators have been proposed for estimating the parameters of the GPD. An extensive discussion of the proposed methods has been presented by de Zea Bermudez [4] and Macklay [8]. Although de Zea Bermudez [4] gives some guidance on the relative merits of a number of methods, no quantitative comparison of their performance by means of a simulation study was made. Ashkar [11], Deidda [7], Macklay [8] and Kang [1] have compared the performance of estimators through simulation studies, but these studies used few estimators, a limited range of sample sizes, or one-sided GPD shape parameters. Most often, the performance of GPD parameter estimators depends on both the sample size and the value of the GPD shape parameter [18,8,1]. Therefore, estimators that perform well in some situations may perform poorly in others. This study seeks to make a systematic comparison, through simulation studies, of twenty-one conventional estimators of the GPD parameters over a wide range of sample sizes, shape parameters and scale parameters, and to make practical recommendations on the most suitable estimation methods.

Methods and Materials
This section presents the theoretical review of statistical tools used in the study.

Generalized Pareto Distribution (GPD)
The GPD is one of the fundamental distributions in extreme value modeling. Let Z be a random variable with distribution function
\[ F_{\mu,\sigma,\xi}(z) = \begin{cases} 1 - \left(1 + \xi\,\dfrac{z-\mu}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\ 1 - \exp\left(-\dfrac{z-\mu}{\sigma}\right), & \xi = 0, \end{cases} \]
where μ (μ ∈ R), σ (σ > 0), and ξ (ξ ∈ R) are the location, scale, and shape parameters respectively. Based on the value of the shape parameter, the distribution can be divided into three groups: F_{μ,σ,ξ}(z) is short-tailed when ξ < 0, medium-tailed when ξ = 0, and heavy-tailed when ξ > 0.
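The distribution function and the three tail classes can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard GPD parameterization (location μ, scale σ, shape ξ); the function names `gpd_cdf` and `tail_type` are hypothetical and are not taken from the study's code.

```python
import math

def gpd_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """GPD distribution function, assuming the standard parameterization."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    y = (z - mu) / sigma
    if y <= 0:
        return 0.0                    # no mass below the location parameter
    if xi == 0:
        return 1.0 - math.exp(-y)     # exponential case (medium tail)
    t = 1.0 + xi * y
    if t <= 0:
        return 1.0                    # past the finite endpoint mu - sigma/xi (xi < 0)
    return 1.0 - t ** (-1.0 / xi)

def tail_type(xi):
    """Classify the tail by the sign of the shape parameter."""
    if xi < 0:
        return "short-tailed"
    if xi == 0:
        return "medium-tailed"
    return "heavy-tailed"

print(gpd_cdf(1.0, xi=1.0), tail_type(1.0))
print(gpd_cdf(2.0, xi=-0.5), tail_type(-0.5))
```

For ξ < 0 the support is bounded above by μ − σ/ξ, which is why the second call evaluates to 1 at z = 2 when σ = 1 and ξ = −0.5.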

Peak Over Threshold (POT)
Let Z_1 > Z_2 > ⋯ > Z_n be an ordered sequence of independent and identically distributed (iid) random variables with common continuous distribution function F, and let N_u be the number of observations over a threshold u. The excess distribution above the threshold u for a random variable Z is defined as
\[ F_u(z) = P(Z - u \le z \mid Z > u) \]
for 0 ≤ z < z_F − u, where z_F is the right endpoint of F. The variable z represents the exceedances over u. The excess distribution function can also be written as
\[ F_u(z) = \frac{F(u+z) - F(u)}{1 - F(u)}. \]
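As a worked step, the ratio form of the excess distribution can be evaluated when F itself is a GPD. The sketch below assumes μ = 0 and ξ ≠ 0 for brevity, and uses the standard GPD distribution function; it is offered as an illustration of a well-known property rather than as a result of this study.

```latex
\begin{aligned}
F_u(z) &= \frac{F(u+z)-F(u)}{1-F(u)}
        = 1-\frac{\left(1+\xi\,\frac{u+z}{\sigma}\right)^{-1/\xi}}
                 {\left(1+\xi\,\frac{u}{\sigma}\right)^{-1/\xi}} \\
       &= 1-\left(\frac{\sigma+\xi u+\xi z}{\sigma+\xi u}\right)^{-1/\xi}
        = 1-\left(1+\frac{\xi z}{\sigma+\xi u}\right)^{-1/\xi}.
\end{aligned}
```

Under these assumptions the exceedances are again GPD with the same shape ξ but with scale σ + ξu, so the scale of the excesses depends on the threshold.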

The Root Mean Square Error (RMSE)
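This error metric is the square root of the mean squared error, which measures the differences between the estimated value and the actual (true) value:
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{a}_i - a\right)^2} \]
where a is the actual value of the parameter (ξ or σ), â_i is its estimated value in the i-th replication, and N is the number of replications.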
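A minimal Python sketch of this metric follows. It assumes the estimates come from repeated simulation runs against one known true value; the helper names `mse` and `rmse` and the numbers in `xi_hats` are illustrative only and do not come from the study.

```python
import math

def mse(estimates, true_value):
    """Mean of squared deviations of the estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def rmse(estimates, true_value):
    """Square root of the mean squared error."""
    return math.sqrt(mse(estimates, true_value))

# Hypothetical shape estimates from four replications with true xi = 1
xi_hats = [0.9, 1.1, 1.2, 0.8]
print(rmse(xi_hats, 1.0))
```

Because the RMSE is on the same scale as the parameter, it is easier to read than the MSE when comparing shape and scale errors side by side.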

The Mean Square Error (MSE)
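The mean square error is a measure of the mean square deviation of the estimate from the true value:
\[ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{a}_i - a\right)^2 \]
where a is the true value of the parameter, â_i is its estimate in the i-th replication, and N is the number of replications.
Table 1 presents the parameter frameworks used by some researchers. Generally, it can be seen that most of these authors used a location parameter of "0" and scale parameters of "1" and "10".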

Simulation Procedure
This section gives the algorithm for comparing the performance of the twenty-one estimators with the POT method (POT framework) and without it (NON-POT framework). The latter is included because many authors have studied parameter estimation for the GPD without following the POT method [1]. The values of the shape parameter to be used are ξ = −3, −2, −1, −0.5, 0, 1, 2, 3, 5 and those of the scale parameter are σ = 1, 10. For the NON-POT framework, the sample sizes considered range from small to large, that is, n = 10, 20, 30, 50, 100, 200, 500, 1000.
For the POT framework with quantile q = 0.9, the sample sizes to be considered are 100, 200, 300, 500, 1000, 2000, 5000 and 10000, while with quantile q = 0.95 the sample sizes to be considered are 200, 400, 600, 1000, 2000, 4000, 10000 and 20000. These are chosen so that the number of observations above the threshold matches the sample sizes of the NON-POT framework.
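The matching of sample sizes can be checked with a short calculation: a sample of size n thresholded at the 100q-th quantile leaves about n(1 − q) exceedances. The sketch below uses the sample sizes listed in the text; the helper name `expected_exceedances` is hypothetical.

```python
non_pot_sizes = [10, 20, 30, 50, 100, 200, 500, 1000]
pot_sizes = {
    0.90: [100, 200, 300, 500, 1000, 2000, 5000, 10000],
    0.95: [200, 400, 600, 1000, 2000, 4000, 10000, 20000],
}

def expected_exceedances(n, q):
    """Number of observations above the 100q-th quantile of n observations."""
    return round(n * (1 - q))

for q, sizes in pot_sizes.items():
    counts = [expected_exceedances(n, q) for n in sizes]
    # The last value printed is True when the counts match the NON-POT sizes
    print(q, counts, counts == non_pot_sizes)
```

Both quantile levels reproduce the eight NON-POT sample sizes, so estimators in the two frameworks are fitted to the same number of observations.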

NON-POT Framework
The following algorithm is used for the simulation procedure not following the POT framework, thus using all observations.
I. Generate n observations following the GPD with known shape and scale parameters.
II. Using all the observations generated, estimate the shape and scale parameters with each of the 21 estimation techniques.
III. Repeat steps I and II 10,000 times.
IV. Compute and record the performance metric of each estimator; the minimum value indicates the better performance.
V. Repeat steps I-IV for all sample sizes.
VI. To find the best estimator for varying shape parameter at a particular value of the scale parameter, compute the average value of each performance metric for both ξ̂ and σ̂ across all sample sizes using
\[ \bar{\xi} = \frac{1}{k}\sum_{i=1}^{k}\xi_i, \qquad \bar{\sigma} = \frac{1}{k}\sum_{i=1}^{k}\sigma_i \]
where ξ_i is the value of the metric for ξ̂ for each method at the i-th sample size, σ_i is the value of the metric for σ̂ for each method at the i-th sample size, and k is the number of sample sizes.
VII. To find the best estimator for varying sample sizes, compute the average of each performance metric across all values of the shape parameter at a particular value of the scale parameter:
\[ \bar{\xi} = \frac{1}{m}\sum_{j=1}^{m}\xi_j, \qquad \bar{\sigma} = \frac{1}{m}\sum_{j=1}^{m}\sigma_j \]
where m is the total number of shape parameter values.
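The steps above can be sketched for a single estimator as follows. This is an illustration under stated assumptions, not the study's implementation: the method of moments stands in for the 21 estimators (it is valid only for ξ < 0.5), the GPD draws use inverse-transform sampling with μ = 0, the number of replications is cut from 10,000 to 2,000 to keep the run short, and all function names are hypothetical.

```python
import math
import random

def rgpd(n, sigma, xi, rng):
    """Draw n GPD(0, sigma, xi) variates by inverting the distribution function."""
    out = []
    for _ in range(n):
        u = rng.random()                      # uniform on [0, 1)
        if xi == 0:
            out.append(-sigma * math.log(1.0 - u))
        else:
            out.append(sigma * ((1.0 - u) ** (-xi) - 1.0) / xi)
    return out

def mom_estimate(sample):
    """Method-of-moments estimates (xi_hat, sigma_hat); assumes xi < 0.5."""
    n = len(sample)
    m = sum(sample) / n
    v = sum((x - m) ** 2 for x in sample) / (n - 1)
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (1.0 + ratio)

def simulate_rmse(n, sigma, xi, reps=2000, seed=1):
    """Steps I-IV for one estimator: RMSE of xi_hat and of sigma_hat."""
    rng = random.Random(seed)
    se_xi = se_sigma = 0.0
    for _ in range(reps):
        xi_hat, sigma_hat = mom_estimate(rgpd(n, sigma, xi, rng))
        se_xi += (xi_hat - xi) ** 2
        se_sigma += (sigma_hat - sigma) ** 2
    return math.sqrt(se_xi / reps), math.sqrt(se_sigma / reps)

# Step V: repeat over sample sizes; step VI: average the metric across them
sizes = [10, 20, 30, 50, 100]
results = [simulate_rmse(n, sigma=1.0, xi=-0.5) for n in sizes]
avg_rmse_xi = sum(r[0] for r in results) / len(sizes)
avg_rmse_sigma = sum(r[1] for r in results) / len(sizes)
print(avg_rmse_xi, avg_rmse_sigma)
```

In a full comparison the same loop would run once per estimator and per (ξ, σ) pair, and the estimator with the smallest averaged metric would be reported as "best".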

POT Framework
The following algorithm is used for the simulation procedure under the POT framework, thus using only observations above a certain threshold.
I. Generate n observations following the GPD.
II. Take the 100q-th quantile as the threshold value, using q = 0.9 or 0.95.
III. Using all the observations above the threshold, estimate the shape and scale parameters with each of the 21 estimation techniques.
IV. Repeat steps I, II and III 10,000 times.
V. Compute the performance metric for each of the estimators; the minimum value indicates the better performance.
VI. Repeat steps I-V for all sample sizes.
VII. To find the best estimator for varying shape parameter at a particular value of the scale parameter, compute the average value of each performance metric for both ξ̂ and σ̂ across all sample sizes using Equation (8).
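Steps I to III differ from the NON-POT procedure only in the thresholding, which can be sketched as below. The sketch assumes the threshold is taken as an order statistic of the sample (the study does not state which empirical quantile rule was used) and that the data are continuous, so there are no ties; the function name `pot_excesses` and the parameter values are hypothetical.

```python
import random

def pot_excesses(sample, q):
    """Return (threshold, excesses), using the 100q-th empirical quantile as threshold."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    ordered = sorted(sample)
    k = max(1, round(q * len(ordered)))   # observations at or below the threshold
    u = ordered[k - 1]                    # k-th smallest value serves as the threshold
    return u, [z - u for z in ordered[k:]]

rng = random.Random(7)
sigma, xi = 1.0, 0.5                      # illustrative true parameters
# Step I: inverse-transform draws from GPD(0, sigma, xi) with xi != 0
sample = [sigma * ((1.0 - rng.random()) ** (-xi) - 1.0) / xi for _ in range(1000)]
# Step II: threshold at the 90th percentile
u, excesses = pot_excesses(sample, q=0.9)
# Step III would pass `excesses` to each estimator
print(len(excesses))  # 100
```

With n = 1000 and q = 0.9 the estimators see 100 excesses, which corresponds to the NON-POT sample size of 100.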

Results and Discussions
The main focus of this study is to compare the performance of 21 estimation techniques under different conditions. The conditions considered are:
1. Use of 9 different values of the shape parameter with 2 values of the scale parameter.
2. Use of 8 different sample sizes, again with 2 values of the scale parameter.
Table 2 shows the 21 estimators used in the study. In the table, items 10 to 16 (AD, ADR, AD2R, ADL, AD2L, CM, and KS) are the Maximum Goodness of Fit (MGF) estimators used by Luceño [24]. R codes for MLEn, Zhang, NLS-S, NLS-P and WNLS-P were obtained from Kang [1], while the rest can be found in the R package POT created and maintained by Ribatet [25].

Simulated Results for NON-POT Framework
Table 3 shows the "best" and "worst" performing methods among the 21 estimators for the NON-POT framework with respect to varying shape parameters when σ = 1 and σ = 10. For short-tailed cases (i.e. ξ < 0), the CM, ADR, MLE, and Zhang estimators performed very well in concurrently estimating both the shape and scale parameters at ξ = −3, −2, −1 and −0.5 for σ = 1 and 10 respectively. However, the MLEn estimator performed very poorly in concurrently estimating both the shape and scale parameters for σ = 1 and 10.
Moreover, for the medium-tailed case, that is ξ = 0, at σ = 1 the LME performed well in estimating the shape parameter while the Zhang estimator performed well in estimating the scale. At σ = 10, the Zhang estimator was "best" for shape and the KS was "best" for scale. Yet the MLEn estimator was still "worst" in performance for simultaneously estimating both the shape and scale parameters for σ = 1 and 10. For heavy-tailed cases (i.e. ξ > 0), the Zhang estimator outperformed all other estimators for both the shape and scale parameters, except at ξ = 5 with σ = 10 where the MLEn performed "best". Nonetheless, at both σ = 1 and 10, the AD2L estimator performed "worse" than all other estimators for both the shape and scale parameters, except at ξ = 5 where the MDPD and all the MGF estimators were "worst".
Table 4 shows results for the POT framework in which the 90th quantile of the observations is used as the threshold, with σ = 1 and σ = 10 respectively. This table gives the "best" and "worst" performing estimators for the shape and scale parameters when the value of the shape parameter varies.
For ξ < 0 at σ = 1, the Pickands and the ADL estimators were "best" in performance at ξ = −3 and −2 respectively, while AD2R was "best" for ξ = −1 and −0.5. However, at σ = 10, for estimating the shape parameter, the Pickands, CM, MLE and AD2R were "best" in performance, in order of increasing ξ. For estimating the scale parameter, the Zhang method outperformed all other methods at ξ = −3 and −2, while the AD2R and the MPLE were "best" at ξ = −1 and −0.5 respectively. With regard to "worst" performance, the MLEn estimator performed poorly in estimating both the shape and scale parameters for σ = 1 and σ = 10. However, when σ = 10, in estimating the scale, the Zhang estimator performed "worst" at ξ = −1 and −0.5.
Furthermore, with ξ = 0, in estimating both the shape and scale parameters, the LME was "best" for σ = 1 while for σ = 10 the KS estimator was "best" compared to all other estimation methods. Pertaining to "worst" performance, the MLEn and the Zhang estimators were still "worst" in performance for estimating both the shape and scale respectively at σ = 1 and 10.
For ξ > 0 with σ = 1, in estimating the shape parameter, the LME and Median methods were "best" in performance when ξ = 1 and 2 respectively, whilst the Zhang estimator was "best" at ξ = 3 and 5. The MLEn method outperformed all other estimators in estimating the scale parameter. With σ = 10, in estimating the shape parameter, the PWMU and Pickands were "best" in performance when ξ = 1 and 2 respectively, whilst the Zhang estimator was "best" for ξ = 3 and 5. In estimating the scale, the LME estimator performed very well when ξ = 1, 2, 3, whereas the MLEn method performed well when ξ = 5. In terms of "worst" performance, the AD2L estimator performed "worse" than all other estimators for the shape, whilst for the scale the KS, AD2L and AD2R performed poorly. At σ = 10, the MLEn and AD2L were the "worst" in estimating the shape; for the scale, both the KS and AD2R were "worst" in performance.
Table 5 summarizes the "best" and "worst" performing estimators under the POT framework with the 95th quantile as the threshold value.
For short-tailed cases at σ = 1, the Pickands, PWMB, ADR and MDPD were "best" in performance with minimal error metric. At σ = 10, the Pickands, CM, MPLE, and ADR performed well in estimating the shape at each of its values, whereas the Zhang, Moment and MPLE were "best" in estimating the scale.
For the medium-tailed case, the LME and the KS outperformed all other estimators for both the shape and the scale at σ = 1 and σ = 10 respectively.
For heavy-tailed cases, when estimating the scale, the MLEn performed well at σ = 1 whilst for σ = 10 the LME performed well. With regard to the shape, the Zhang estimator performed better than all other estimators used in the study regardless of the value of the scale parameter.
For both short-tailed and medium-tailed cases, the "worst" performing estimators were the same as when the threshold was q = 0.9, at both σ = 1 and 10. However, when ξ > 0 at σ = 1 and 10, the AD2L and the MLEn estimators performed "worse" than all other estimators for the shape, whilst for the scale the KS and AD2R performed poorly.

Simulated Results for NON-POT Framework
In estimating the shape at both scale values of 1 and 10, the Zhang estimator outperformed all the other estimators from small to larger sample sizes (i.e. 20 ≤ n ≤ 200), but when the sample size was 500 or 1000, the Pickands estimator performed better. As for the "worst" performing methods, the AD2L estimator performed "worse" than all the other estimators for all sample sizes.
With regard to the estimation of the scale, the Zhang estimator outperformed all the other estimators at all sample sizes from small to large (i.e. 20 ≤ n ≤ 1000), whilst the MDPD and all the MGF estimators gave the same values of the error metric, indicating "worst" performance. These results are presented in Table 6.
Table 7 shows results for the POT framework in which the 90th quantile of the observations is used as the threshold, with σ = 1 and σ = 10 respectively. This table gives both the "best" and "worst" performing estimators for the shape and scale parameters when varying sample size.
It is evident from the table that, regardless of the value of σ, the Zhang method was "best" in performance for estimating the shape when the sample size was small (i.e. n = 10 to 30). When the sample size n was from 50 to 1000, the Pickands performed better than the other methods. The AD2L estimator performed "worse" than all the other estimators for all sample sizes n, except for n = 10 at σ = 10, where the MLEn performed "worse".
In estimating the scale parameter at σ = 1, the "best" performing method for varying sample sizes was the MLEn at 20 ≤ n ≤ 1000, while the MDPD and the AD2R performed poorly. For σ = 10, the LME method performed "best" for nearly all sample sizes, whilst all the MGF and MDPD estimators gave the same values of the error metric, indicating "worst" performance, for sample sizes n = 10 to 200. At sample sizes n = 500 and 1000, the AD2R gave the "worst" performance.
At a threshold value of the 95th quantile, with varying sample sizes, Table 8 gives the "best" and "worst" performing methods. It is evident from the table that the MLEn outperformed all the other estimators in estimating the scale parameter at σ = 1 with small (20, 30) and even large sample sizes (50 to 1000). For σ = 10, the LME outperformed all other methods for nearly all sample sizes n. However, the MDPD and the AD2R performed poorly regardless of the value of σ. In estimating the shape parameter at both σ = 1 and σ = 10, the Zhang estimator achieved good performance with small sample sizes (i.e. n = 10 to 30), and when sample sizes were large (i.e. n = 50 to 1000) the Pickands estimator showed very good performance. Nonetheless, at σ = 1, the AD2L estimator performed "worse" than all the other estimators for all sample sizes, except for n = 10 where the MLEn performed "worse". Moreover, at σ = 10 the MLEn was "worst" in performance for all sample sizes.

The Performance Score of Conventional Methods of the GPD
The overall performance of an estimator with respect to varying both the shape parameter and the sample size was computed using a performance score given by
\[ \text{Performance score} = \frac{r}{p} \]
where r is the number of times an estimator recorded the minimum error metric and p is the total number of values of the shape parameter or sample size.
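The score can be sketched as a simple count of wins. The example assumes the score is the fraction of scenarios in which an estimator attains the minimum error metric (any further scaling, such as a percentage, is not recoverable from the text); the function name `performance_scores`, the estimator labels and the error values are hypothetical and are not results from the study.

```python
def performance_scores(metric_table):
    """metric_table maps scenario -> {estimator: error metric}.
    Returns estimator -> fraction of scenarios in which it had the minimum."""
    p = len(metric_table)
    wins = {}
    for errors in metric_table.values():
        best = min(errors, key=errors.get)   # estimator with the smallest error
        wins[best] = wins.get(best, 0) + 1
    return {name: count / p for name, count in wins.items()}

# Hypothetical RMSE values for three shape-parameter scenarios
table = {
    "xi=1": {"est_A": 0.20, "est_B": 0.35, "est_C": 0.30},
    "xi=2": {"est_A": 0.25, "est_B": 0.22, "est_C": 0.40},
    "xi=3": {"est_A": 0.31, "est_B": 0.50, "est_C": 0.45},
}
print(performance_scores(table))
```

Estimators that never attain the minimum do not appear in the output, which mirrors reporting only estimators that recorded the minimum error metric at least once.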

The Performance Score of Conventional Methods of the GPD Varying Shape Parameter
Table 9 presents the performance scores of the estimators under varying shape parameter for both the NON-POT and the POT frameworks. All estimators reported in Table 9 recorded the minimum error metric, relative to the other estimators, at least once as the shape parameter was varied.
For the NON-POT framework, the Zhang estimator is ranked as the overall best method for estimating both the shape and scale parameters. For the POT framework, the Zhang estimator is again the overall best in estimating the shape parameter, while the MLEn is the overall best in estimating the scale parameter.

The Performance Score of Conventional Methods of the GPD Varying Sample Size
The performance scores of the estimators under varying sample size for both the NON-POT and the POT frameworks are shown in Table 10. Again, all estimators reported in Table 10 recorded the minimum error metric, relative to the other estimators, at least once as the sample size was varied. Here too, the Zhang estimator was ranked as the overall best method for estimating both the shape and scale parameters under the NON-POT framework. Under the POT framework, the Pickands estimator was ranked overall "best" in estimating the shape parameter, while the MLEn and the LME showed the highest performance scores in estimating the scale parameter.

Conclusions
Traditional methods of parameter estimation for the GPD have some identified defects. The results presented in the previous section compared the accuracy of the estimators under the expectation that estimators perform better with increasing sample size and across different shape parameters. Generally, none of the estimators consistently performed better with increasing shape parameter. Also, a large bias was recorded when estimating the scale parameter with most of the estimators.
It was further observed from the results that, with the scale parameter equal to one, the MDPD estimator does not exist at ξ = −3 for any sample size under the POT framework with q = 0.9 and 0.95. Thus, it is unable to estimate the shape parameter when ξ = −3. The MLE method with σ = 10 faces a convergence problem when ξ = 5 under the POT framework.
For NON-POT, the Zhang estimator was "best" in estimating both the shape and scale parameters for most heavy-tailed cases. For the POT framework with q = 0.9 and q = 0.95, the Zhang estimator was "best" for estimating the shape for very heavy tails (i.e. ξ = 3, 5) and the scale for very short tails (i.e. ξ = −3, −2), regardless of the value of σ. Hence, the Zhang estimator was ranked overall "best" in estimating both the shape and scale parameters for the NON-POT case, whilst for the POT framework the Zhang estimator was the overall best in estimating the shape and the MLEn the overall best in estimating the scale parameter.
Additionally, in estimating the shape parameter under the NON-POT framework, the Zhang estimator performed "best" for all values of the scale parameter with increasing sample size. Under the POT framework with q = 0.9 and q = 0.95, the Pickands estimator was "best" at estimating the shape parameter for large sample sizes, while the Zhang estimator was "best" for small sample sizes. However, in terms of overall "best", both the MLEn and the LME were ranked "best" in estimating the scale for the POT scenario.
When varying shape parameters, the MLEn was "worst" in estimating both the shape and scale parameters for most short- and medium-tailed cases, whereas for some heavy tails the AD2L and KS were "worst" in performance. When varying sample sizes, the MDPD and the MGF estimators were "worst" in performance, at times having the same estimates.
In conclusion, the performance of estimators depends on the sample size, the value of the shape parameter and the value of the scale parameter.