Efficiency of Parameter Estimator of Various Resampling Methods on WarpPLS Analysis

WarpPLS analysis has three algorithms: the outer model parameter estimation algorithm, the inner model parameter estimation algorithm, and the hypothesis testing algorithm, which offers several resampling methods, namely Stable1, Stable2, Stable3, Bootstrap, Jackknife, and Blindfolding. The purpose of this study is to apply WarpPLS analysis and compare the six resampling methods based on the relative efficiency of their parameter estimates. This study uses secondary questionnaire data with one formative variable and two reflective variables. Secondary data for the Infrastructure Service Satisfaction Index (IKLI) were obtained from the Study Report on the Regional Development Planning for Economic Growth and the Malang City Gini Index in 2018, while secondary data for the Social Capital Index (IMS) and Community Development Index (IPMas) were obtained from the Research Report on Performance Indicators Regional Human Development Index and Poverty Rate of Malang City in 2018. The results indicate that, based on the two criteria used, namely the calculation of relative efficiency and the measure of fit as an indicator of model goodness, the Jackknife resampling method is the most efficient, followed by the Stable1, Bootstrap, Stable3, Stable2, and Blindfolding methods.


Introduction
Structural Equation Modeling (SEM) is an analysis that obtains latent variable data and estimates the relationships between latent variables simultaneously [1]. Partial Least Squares (PLS) is a method that is more complicated than SEM because it can be applied to both the reflective indicator model and the formative indicator model. The WarpPLS method is a development of PLS analysis that can identify and estimate both linear and non-linear relationships between latent variables. WarpPLS analysis has three algorithms: the outer model parameter estimation algorithm, the inner model parameter estimation algorithm, and the hypothesis testing algorithm [2].
The hypothesis testing algorithm in WarpPLS uses a resampling algorithm that offers various resampling methods: Stable1, Stable2, Stable3, Bootstrap, Jackknife, and Blindfolding. Stable1, Stable2, and Stable3 are the latest resampling methods in WarpPLS analysis. These methods use a quasi-parametric approach in which the p-value is approximated by the average value.
In the Bootstrap resampling method, resampling is done with a certain sample size and repeated 100 times to achieve convergence. In the Jackknife method, resampling is done by removing one row at a time and repeating until the last row. The Blindfolding method is similar to the Jackknife method, but the first row of data is replaced by the average of each column (variable), and this continues until the last row. Of the six resampling methods, the most commonly used is the Bootstrap resampling method.
Department of Statistics, Faculty of Mathematics and Natural Sciences, Brawijaya University, Indonesia
Previous research found that the Blindfolding resampling method is more efficient than the Bootstrap resampling method [3]. Other research states that the Stable3 resampling method, which is the default setting in the WarpPLS program package, is also more efficient than the Bootstrap and Stable2 resampling methods [4]. Therefore, this study aims to find out which of the six resampling methods in the WarpPLS analysis (Stable1, Stable2, Stable3, Bootstrap, Jackknife, and Blindfolding) is the most efficient, especially for the data used, namely the Infrastructure Service Satisfaction Index, Social Capital Index, and Community Development Index of Malang City.

Materials and Methods
Structural equation modeling (SEM) is a technique used to describe simultaneous linear relationships between observed variables, and it also involves latent variables that cannot be measured directly [2]. SEM analysis initially combines a system of simultaneous equations, path analysis, or regression analysis with factor analysis. Factor analysis is used as a method for obtaining latent variable data. Parameter estimation and testing are based on the variance-covariance matrix, so the approach is often referred to as covariance-based SEM [1].
According to [1], the WarpPLS analysis is a development of the PLS analysis. The PLS model was developed as an alternative for cases where the theory underlying the model design is weak or not yet established, or where some indicators cannot be measured reflectively and are therefore formative [5]. PLS is a powerful method because it does not require many assumptions, and the sample size can be small or large. Besides being used to confirm theory (hypothesis testing), PLS can also be used to build relationships that do not yet have a theoretical basis or to test propositions.
If the structural model to be analyzed is not recursive and the latent variables have formative, reflective, or mixed indicators, one appropriate method is PLS [6]. PLS can avoid factor indeterminacy, namely the presence of more than one factor in a set of indicators of a variable. Because formative indicators do not require common factors, composite latent variables will always be obtained [1].
WarpPLS is a method and software package developed by Ned Kock [4] to analyze variance-based or PLS-based SEM models. It is used not only for non-recursive models but also for non-linear model analysis (Warp2 and Warp3).
According to [1], the structural model in WarpPLS consists of two parts:
1) The outer model is the collection of latent variable data sourced from its indicators, consisting of reflective or formative indicator models.
2) The inner model is the model of relationships between latent variables, which may be recursive or non-recursive.

WarpPLS Method
WarpPLS analysis has three algorithms: the outer model estimation algorithm, the inner model estimation algorithm, and the hypothesis testing algorithm [2]. The outer model estimation algorithm is the calculation process that produces latent variable data from data on items, indicators, or dimensions. The inner model estimation algorithm is the method and process of calculating path coefficients, that is, the coefficients of the influence of explanatory (predictor) variables on the response (dependent) variable. In the hypothesis testing algorithm, WarpPLS analysis uses a resampling algorithm [7]:
1) Stable1 is a quasi-parametric approach in which the p-value is approximated by the average value.
2) Stable2 produces consistent estimates through the Bootstrap resampling algorithm.
3) Stable3 produces consistent estimates through the Bootstrap resampling algorithm.
4) Bootstrap resamples with a certain sample size (equal to or smaller than the original sample) and repeats this 100 times (Bootstrap samples) to achieve convergence, as shown in Figure 1.
[Figure 1: original sample (n = 108) and Bootstrap samples up to B100 (n = 108)]
5) Jackknife resamples by removing one row at a time and repeating until the last row, as shown in Figure 2.
[Figure 2. Jackknife Resampling Illustration: original sample (n = 108) and Jackknife samples J1, J2, ..., J108 (each n = 107)]
6) Blindfolding is similar to Jackknife, but the first row of data is replaced by the average of each column (variable), followed by the second row and so on until the last row [8]. As with the Bootstrap method, it converges at 100 repetitions, as shown in Figure 3.
[Figure 3: original sample (n = 108)]
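The three classical resampling schemes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WarpPLS implementation: the function names are hypothetical, the data are assumed to be a list of numeric rows, the Bootstrap resample size is taken equal to the original sample size, and Blindfolding is simplified to replacing exactly one row per resample without the cap of 100 repetitions.

```python
import random


def bootstrap_samples(rows, n_resamples=100, seed=0):
    """Draw n_resamples samples of len(rows) rows each, with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    return [[rows[rng.randrange(n)] for _ in range(n)]
            for _ in range(n_resamples)]


def jackknife_samples(rows):
    """Leave out one row at a time: n samples of size n - 1."""
    return [rows[:i] + rows[i + 1:] for i in range(len(rows))]


def blindfolding_samples(rows):
    """Replace one row at a time by the column means: n samples of size n.

    Assumption: one row is replaced per resample, as in the description above.
    """
    n = len(rows)
    col_means = [sum(col) / n for col in zip(*rows)]
    return [rows[:i] + [col_means] + rows[i + 1:] for i in range(n)]
```

With an original sample of 108 rows, `jackknife_samples` returns 108 samples of 107 rows each, which matches the sizes shown in the Jackknife illustration.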

Hypothesis Testing (Resampling)
Hypothesis testing in the WarpPLS analysis is done by resampling, which makes the test distribution-free. This study uses six types of resampling: Stable1, Stable2, Stable3, Bootstrap, Jackknife, and Blindfolding.
The standard error of each parameter is calculated from the resampling results. Testing was done using a t-test with the following criterion: if the p-value ≤ 0.1 (alpha 10%), the result is significant.
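As an illustration of how a standard error and a p-value can be derived from resampled estimates, the sketch below uses the textbook Bootstrap and Jackknife standard error formulas. These are assumptions made for illustration, since the formula used by WarpPLS is not reproduced here. The function names are hypothetical, and the p-value uses a normal approximation to the t distribution, which is reasonable only for samples of roughly the size used in this study.

```python
import math
import statistics


def bootstrap_se(estimates):
    """Bootstrap standard error: standard deviation of the resampled estimates."""
    return statistics.stdev(estimates)


def jackknife_se(estimates):
    """Jackknife standard error from the n leave-one-out estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    return math.sqrt((n - 1) / n * sum((e - mean) ** 2 for e in estimates))


def p_value(estimate, se, one_tailed=True):
    """p-value of the ratio estimate / se (normal approximation, an assumption)."""
    t = abs(estimate / se)
    tail = 1.0 - statistics.NormalDist().cdf(t)
    return tail if one_tailed else 2.0 * tail
```

A path coefficient would then be declared significant at the 10% level when `p_value(estimate, se) <= 0.1`.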

Assumption on WarpPLS
Data distribution assumptions are not needed in the WarpPLS analysis, meaning that the data do not have to be normally distributed. This may be because WarpPLS, as a development of PLS, is a powerful method, and the required sample size can be large or small. The data assumptions in WarpPLS are fulfilled in the hypothesis testing process, which involves a resampling approach. With at least 100 samples, the central limit theorem applies: if a population has mean $\mu$ and variance $\sigma^2$, the sampling distribution of the sample mean approaches the normal distribution with mean $\mu$ and variance $\sigma^2/n$, and the greater the value of $n$, the closer the approximation [10].
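The central limit theorem statement can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, variance 1), the number of replicates, and the function name are assumptions chosen to show that the sample means of a clearly non-normal population have a mean close to the population mean and a variance close to the population variance divided by n.

```python
import random
import statistics


def simulate_sample_means(n, replicates=2000, seed=0):
    """Means of `replicates` samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]


means = simulate_sample_means(n=100)
mean_of_means = statistics.fmean(means)          # expected to be close to 1
variance_of_means = statistics.variance(means)   # expected to be close to 1 / 100
```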
The important assumption in WarpPLS is the linearity assumption, which determines whether a linear or non-linear algorithm is used in WarpPLS modeling. WarpPLS can be used whether or not the linearity assumption is met. The linearity test is carried out using the Ramsey Regression Specification Error Test (RESET). In its approach, Ramsey RESET uses Ordinary Least Squares (OLS), which minimizes the sum of squared errors over the observations [11].
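A minimal pure-Python sketch of the RESET idea follows. It is an illustration only, not the routine used in the study. It assumes a single predictor, augments the OLS regression with the squared and cubed fitted values, and returns the F statistic with its degrees of freedom. The helper names and the simulated data are hypothetical, and the p-value is left to an F-distribution table or a statistics package.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_fit(design, y):
    """Return fitted values and residual sum of squares of an OLS fit."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, rss


def reset_f(x, y):
    """RESET F statistic: do squared and cubed fitted values improve the fit?"""
    n = len(y)
    base = [[1.0, xi] for xi in x]
    fitted, rss_restricted = ols_fit(base, y)
    augmented = [r + [f ** 2, f ** 3] for r, f in zip(base, fitted)]
    _, rss_full = ols_fit(augmented, y)
    df_num, df_den = 2, n - 4
    f_stat = ((rss_restricted - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, (df_num, df_den)


# Hypothetical data: one linear and one curved relationship with small noise.
rng = random.Random(0)
x = [i / 10 for i in range(50)]
y_linear = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.1) for xi in x]
y_curved = [xi ** 2 + rng.gauss(0.0, 0.1) for xi in x]
f_linear, df = reset_f(x, y_linear)
f_curved, _ = reset_f(x, y_curved)
```

A large F statistic, as expected for `y_curved`, indicates that the linear specification is inadequate, which is the situation in which a warp (non-linear) algorithm would be chosen.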

The Sample Size for WarpPLS
The sample size in a study should be determined using a formula. The formula is adjusted to the sampling technique used and the availability of information. An accurate formula that suits the circumstances will minimize the total error (the combination of sampling error and non-sampling error). If the information needed to determine the sample size is not available, a table or a rule of thumb can be used [1]. Some examples of rules of thumb are:
1. Ten times the number of variables (remember that WarpPLS is part of multivariate analysis);
2. Ten times the number of formative indicators (ignoring reflective indicators);
3. Ten times the number of structural paths in the inner model.

Reflective Indicator Model
Whether a model is formative or reflective can be determined precisely from the operational definition of the variable. In general, latent variables with formative indicator models are considered to be sourced from indicators whose data are quantitative, for example, community welfare with the indicators per capita income, length of education, and life expectancy, all of which are quantitative [13].

Relative Efficiency
The efficiency of two estimators can be compared using relative efficiency. The relative efficiency of the estimator $\hat{\theta}_i$ relative to $\theta_i^*$ is defined in [13] from the variances of the two estimators, where $i$ denotes the resampling method. The interpretation of the relative efficiency is given in Table 1; for example, one of its cases states that the $\hat{\theta}_i$ estimator is as good as the $\theta_i^*$ estimator.
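The comparison can be sketched in Python as follows. The convention used here is an assumption taken from the common textbook definition, not necessarily the exact form used in [13]: the relative efficiency of the first estimator relative to the second is the variance of the second divided by the variance of the first, so a value above 1 favours the first estimator. Variances are taken as squared standard errors, and the function names are hypothetical.

```python
def relative_efficiency(se_first, se_second):
    """Var(second) / Var(first), with each variance taken as SE squared.

    Assumed convention: a value above 1 means the first estimator is
    more efficient than the second.
    """
    return (se_second ** 2) / (se_first ** 2)


def compare_estimators(se_first, se_second, tol=1e-9):
    """Classify a pair of estimators by their relative efficiency."""
    er = relative_efficiency(se_first, se_second)
    if abs(er - 1.0) <= tol:
        return "equally efficient"
    return "first more efficient" if er > 1.0 else "second more efficient"
```

For example, with hypothetical standard errors of 0.05 and 0.10, the relative efficiency is about 4, so the first estimator would be judged more efficient.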

The Measure of Fit and Significance in Several Resampling Methods
The measure of fit can be assessed for measurement models, structural models, and overall models. For the measurement model, it is intended to check whether the research instrument is valid and reliable. For the structural model, it shows how much information can be explained by the relationships between latent variables in the results of the WarpPLS analysis. For the overall model, it is a combined goodness of fit between the measurement model and the structural model. In this study, the indicators are formative and reflective.
Validity and reliability in the WarpPLS analysis are assessed as follows. For convergent validity, a loading value of 0.5 to 0.6 is considered sufficient for an indicator to be valid. For discriminant validity, a latent variable is considered valid if its average variance extracted (AVE) is greater than its correlation with all other latent variables. The research instrument is said to be reliable if the composite reliability value is greater than or equal to 0.7.
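The convergent validity and reliability checks can be illustrated with a short Python sketch. It assumes standardized loadings of reflective indicators and uses the commonly cited formulas for AVE (the mean of the squared loadings) and composite reliability. The function names and the loadings in the example are hypothetical and are not taken from this study.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


def is_reliable(loadings, threshold=0.7):
    """Reliability criterion from the text: composite reliability >= 0.7."""
    return composite_reliability(loadings) >= threshold


# Hypothetical loadings for one reflective latent variable.
example_loadings = [0.7, 0.8, 0.9]
example_ave = average_variance_extracted(example_loadings)
example_cr = composite_reliability(example_loadings)
```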
A measure of fit for the structural model in the WarpPLS analysis can be obtained from the Model Fit and Quality Indices [14]. The criteria are rules of thumb, so they should not be applied rigidly and absolutely. If one or two of the Model Fit and Quality Indices are not met, the model can still be used.
A measure of fit in the overall model can be seen through the results of hypothesis testing (equation (6)) which shows the significance of the results of the resampling method that has been used.

Infrastructure Service Satisfaction Index
The Infrastructure Service Satisfaction Index (ISSI) is a measure used to determine the level of community satisfaction with the infrastructure development conducted by the Central Government and Regional Governments. ISSI is expected to be a tool that illustrates the community's perspective objectively, comprehensively, and credibly, in terms of both physical development and benefits.

Social Capital Index (SCI)
[15] summarized the definitions of several figures by explaining that the true identity of social capital is the values and norms held as a reference for behaving and dealing with other parties, which bind the process of change and community efforts to achieve a goal. These values and elements are manifested in participatory attitudes, mutual attention, mutual giving and receiving, mutual trust, the willingness of the community to be proactive in maintaining values, forming collaborative networks, and creating new ideas, all of which are reinforced by the values and norms that support them.

Community Development Index (CDI)
The Community Development Index is a composite index that measures the nature of cooperation, tolerance, and the community's feeling of security. According to Katz in [16], development is a major social change from a particular situation to a situation that is considered more valuable. According to [17], community development is an effort to increase all resources, carried out in a planned and sustainable manner.

Research Methodology
The data used were secondary data obtained from a Likert scale questionnaire containing 3 latent variables: the Infrastructure Service Satisfaction Index, the Community Development Index, and the Social Capital Index of Malang City in 2018. The steps for assessing the parameter estimator efficiency of the various resampling methods in WarpPLS analysis are as follows:

1) Model Design
The path analysis model in SEM WarpPLS has two sets of relationships: the inner model and the outer model [1]. The inner model is the design of the relationships between latent variables, based on theory, empirical research, intuition, and reasoning. The outer model is the specification of the reflective or formative relationship between each latent variable and its indicators.

2) Constructing Path Charts
The result of the structural model (inner model) and the measurement model (outer model) will be more easily understood if they are constructed or expressed in a path diagram. The use of WarpPLS notation in the path diagram is similar to the PLS notation.


Inner model
The inner model (inner relation) is the specification of the relationships between latent variables (the structural model). Its equation can be written as in equation (5).

Outer model
The outer model is the specification of the relationship between latent variables and their indicators. There are two types of models, namely reflective and formative models, and this study used both. Reflective indicator models can be written as in equation (6), whereas the formative indicator model can be written as in equation (7):
$$X = \lambda_x x + \delta \qquad (7)$$
where:
$X$: exogenous latent variable vector ($q \times 1$)
$\lambda_x$: loading matrix of the exogenous latent variables ($q \times r$)
$x$: indicator vector for the exogenous latent variables ($r \times 1$)
$\delta$: error vector for the exogenous latent variables ($q \times 1$)
$q$: the number of exogenous latent variables
$r$: the number of indicators of the exogenous variables.

4) Parameter Estimation
Parameter estimation in WarpPLS is similar to that of PLS, using the least squares method (4). Parameter estimation is an iterative calculation process that stops when convergence has been reached. The calculation is carried out in three stages. The first stage produces stable weight estimators by calculating the outside and inside approximations of the latent variables. The outside approximation is obtained from the inner model estimator and the inside approximation from the outer model estimator. The second stage estimates the path relationships using Ordinary Least Squares (OLS). The third stage calculates the mean of each indicator using the original data and the weights from the first stage.

5) Goodness of Fit Evaluation
There are 2 evaluation models:

(a) Inner Model
Evaluation of the inner model uses the Goodness of Fit criteria, which are indices and measures of the goodness of the relationships between latent variables. Goodness of Fit in the WarpPLS analysis is given by the Model Fit and Quality Indices [1]. The criteria are used as rules of thumb, meaning that if one or two indicators are not fulfilled, the model can still be used.

(b) Outer Model
Evaluation of the outer model consists of the validity test and the instrument reliability check. According to [12], validity in WarpPLS is evaluated by a convergent validity test and a discriminant validity test based on cross-loading and AVE values, while reliability is evaluated by the composite reliability index.

Assumption Testing of Inner Model Linearity
SEM analysis using the WarpPLS approach does not have strict assumptions. The only assumption relates to the inner model and is used to select the inner model algorithm. Hence, a linearity test was conducted using the Ramsey RESET test (RRT) in R. The results of the test can be seen in Appendix 3 and are summarized in Table 2. The p-values are less than 0.05, so it can be concluded that the relationships between these variables do not meet the linearity assumption, and the warp algorithm was used.

Structural Model Evaluation (Inner Model)
The inner model is evaluated by examining the Goodness of Fit values against the rule of thumb criteria. The Goodness of Fit values of the model can be seen in Table 3. Based on Table 3, all Goodness of Fit values meet the acceptance criteria. Hence, the indices and measures of the goodness of the relationships between the latent variables are acceptable.

Measurement Model Evaluation
The results of the hypothesis testing of the outer model for each variable with reflective indicators can be seen in Table 4. The weights of the formative variable can be seen in Table 5.

Hypothesis Testing of Inner Model and Outer Model
The results of hypothesis testing on the inner model can be seen in Table 6. Table 6 shows that the direct effect of ISSI on SCI has a path coefficient of 0.429 with a p-value < 0.001, the direct effect of ISSI on CDI has a path coefficient of 0.147 with a p-value < 0.001, and the direct effect of SCI on CDI has a path coefficient of 0.680 with a p-value < 0.001. Because the p-values are < 0.05, there is a significant direct effect of ISSI on SCI, ISSI on CDI, and SCI on CDI.
The indirect effect of ISSI on CDI through SCI has a coefficient of 0.292. Because the effects of ISSI on SCI and of SCI on CDI are both significant, it can be said that there is a significant indirect effect.
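As a check on the arithmetic, the indirect effect coefficient reported above agrees with the product of the two direct path coefficients along the path ISSI to SCI to CDI. This worked calculation assumes the usual product-of-coefficients rule for an indirect path:

```latex
\[
\text{ISSI} \rightarrow \text{SCI} \rightarrow \text{CDI}: \quad
0.429 \times 0.680 = 0.29172 \approx 0.292
\]
```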
The result of the inner model hypothesis testing can be seen in Figure 4.  The model from the calculation result of the inner model is as follows:

Efficiency Test
The standard error of each resampling method is calculated as the efficiency test criterion. The variance values are obtained by squaring the standard error values. The Jackknife resampling method produces the smallest variance among the six methods. Therefore, the Jackknife resampling method is the most efficient in this study.
Based on Table 8, the efficiency tests from the 1st to 3rd ER values and from the averages show that the two resampling methods are equally efficient. However, the Stable1 method has a smaller variance than the Bootstrap resampling method. Therefore, it can be concluded that the Stable1 resampling method is more efficient than the Bootstrap resampling method.
Based on Table 9, it can be seen that the results of the efficiency test from the 1st ER value have been quite consistent between the two resampling methods. So it can be concluded that Jackknife resampling is better than Stable1 resampling.
Based on Table 10, it can be seen that the results of the efficiency tests from the 1st to 3rd ER values have been quite consistent between the two resampling methods. So it can be concluded that Stable1 resampling is better than Blindfolding resampling.
Based on Table 11, it can be seen that the results of the efficiency tests of the 2nd and 3rd ER values have been quite consistent between the two resampling methods. So it can be concluded that Jackknife resampling is better than Bootstrap resampling.
Based on Table 12, it can be seen that the results of the efficiency tests of the 2nd and 3rd ER values have been quite consistent between the two resampling methods. So it can be concluded that Bootstrap resampling is better than Blindfolding resampling.
Based on Table 13, it can be seen that the results of the efficiency tests of the 2nd and 3rd ER values have been quite consistent between the two resampling methods. So it can be concluded that Jackknife resampling is better than Blindfolding resampling.
Based on all pairwise combinations of the resampling methods, the Jackknife method is the most efficient resampling method, followed by the Stable1, Bootstrap, Stable2, Stable3, and Blindfolding methods.

Conclusions
It can be concluded that parameter estimation in the WarpPLS analysis using the Stable1, Stable2, Stable3, Bootstrap, Jackknife, and Blindfolding resampling methods produces different relative efficiency values. The Jackknife resampling method produces a lower variance than the other five resampling methods.
The measures of fit showed no differences in the evaluation of the structural model and the measurement model across the six resampling methods. The loading values, indicator weights, and hypothesis testing results of the six methods are the same. Therefore, based on the two criteria used in this study, namely the calculation of relative efficiency and the measure of fit as an indicator of model goodness, it can be concluded that the Jackknife resampling method is more efficient than the other five methods.