Modeling of Statistical Reasoning and Students' Academic Performance Relationship through Partial Least Squares-Structural Equation Model (PLS-SEM)

A strong command of statistical reasoning can enhance students' academic performance and thus support their future prospects in scientific research, especially in the pursuit of higher education. This research examines the contention that statistical reasoning contributes to students' academic success by means of a structural equation model. A total of 99 undergraduate students from the Mathematics Education Degree (BEd Maths) program at Sultan Idris Education University (UPSI), Malaysia, were selected through purposive sampling. The statistical reasoning model and its relationship with the students' academic performance were analyzed using the Partial Least Squares-Structural Equation Model (PLS-SEM) technique, since the sample was too small for Structural Equation Modelling-Analysis of Moment Structure (SEM-AMOS). The findings revealed that the empirical evidence, in line with preceding findings and the theoretical context, strengthened the established model. The analysis showed that all of the relationships within the established model were significant. Apart from the statistical reasoning and students' academic performance model, the research also confirmed all of the indicator variables described in the statistical reasoning constructs by using PLS-SEM. In conclusion, the relationship between statistical reasoning and students' academic performance was expressed not only in lower-order components in PLS-SEM, but also as a hierarchical component model in which statistical reasoning contributes to students' academic performance, and this was supported by the empirical data. Future research should therefore also examine other reasoning skills that may contribute to students' academic success.


Introduction
Statistics is a critical subject in tertiary education. A comprehensive command of statistics assists in conducting scientific research. Recent work on the various forms of statistical reasoning, such as reasoning about variance, distribution, and sampling distributions, has provided valuable insight into how students develop statistical reasoning skills [1]. According to [2], statistical reasoning and the associated principles of statistical thinking and understanding are at the core of the statistics education community's attention. Much work has concentrated on identifying successive, hierarchically organized stages of reasoning development through qualitative methods of analysis [1].
[3] stated that the teaching and learning of statistical reasoning is important because research indicates that statistical reasoning affects students' success. As noted by [4], reasoning includes cognitive processes that transform pieces of data into usable knowledge from which a conclusion may be drawn. From a theoretical standpoint, reasoning is regarded as a conceptual mechanism for deriving inferences or conclusions from facts. [5] described statistical reasoning as the way people reason with statistical concepts and make logical sense of the facts. Statistical reasoning is an important cognitive skill to learn and is linked to students' content knowledge [6]. There have been many studies on statistical learning in the contexts of teaching and learning. Nevertheless, there is a lack of research that models the relationship between statistical reasoning and students' performance structurally through the Partial Least Squares-Structural Equation Model (PLS-SEM). This research attempts to establish the relationship between statistical reasoning and students' academic performance. In addition to determining the relationship, this research develops a model of the relationship between students' statistical reasoning and academic performance through PLS-SEM.

Purpose of the Study
The research draws on the theoretical underpinnings of statistical reasoning and students' performance in order to establish the relationship between statistical reasoning and students' academic performance in the context of the structural equation model (SEM). As stated by [7], statistical reasoning and prior mathematical knowledge significantly predicted statistical performance. Likewise, [2] claimed that statistical reasoning and the associated principles of statistical thinking and understanding are the priorities of the statistics education community. In this sense, the development of students' academic performance must take statistical reasoning into consideration. The purpose of this study is therefore to develop and examine a structural model that depicts the relationship between statistical reasoning and students' academic performance.

Material and Methods
The research population comprised the Bachelor of Education in Mathematics students of Sultan Idris Education University, Malaysia. Ninety-nine students were selected at random and divided into three groups. Purposive sampling was used in this research. To appraise statistical reasoning, a dedicated test was administered to each selected group, and the test score was used to assess the students' level of statistical reasoning. Students' Cumulative Grade Point Average (CGPA), two major statistics subjects, and the students' co-curriculum activity transcript (MyCAT) were used as indicators of the students' academic performance. The responses were analyzed using the SmartPLS 3.0 software.

Statistical Reasoning and Students' Academic Performance Model
Several studies connect statistical reasoning and academic performance of students. [8] showed that statistical reasoning is a critical factor in the achievement of students in statistics. [9] revealed that excellent performance in statistics not only motivates students, but also strengthens academic achievement beyond standards.
[10] stressed that there is a strong positive link between students' statistical reasoning and evolutionary knowledge. Therefore, the model of the relationship between statistical reasoning and students' academic performance is based on [8], [9] and [10].
The statistical reasoning instrument used in this research was adapted from the Statistical Reasoning Assessment (SRA) [11]. The components of statistical reasoning assessed in the SRA include reasoning about data, presentation of data, statistical measures, uncertainty, samples and association. The latent variable Students' Academic Performance comprised students' CGPA, My Co-curriculum Activity Transcript (MyCAT), stat 1 and stat 2. The statistical reasoning and students' academic performance latent variables are described in Table 1, and the basic structural equation model is illustrated in Figure 1.

PLS-SEM Model
The research used PLS-SEM to examine the developed model. The development of the PLS-SEM model involved two distinct steps. In the first step, the outer model (measurement model) was analyzed, that is, the latent variables and the measurement items representing them. In the next step, the inner model (structural model) was evaluated.

Measurement and Structural Model of the Statistical Reasoning and Students' Academic Performance
The research model was composed of measurement models and a structural model. In this analysis, eight measurement models represented the relationships between the latent and indicator variables. The measurement models consisted of the indicator variables (SR11, SR12, SR21, SR22, SR31, SR32, SR41, SR42, SR43, SR44, SR51, SR52, SR61, SR62, SR71, SR72) that compose SR1, SR2, SR3, SR4, SR5, SR6 and SR7, and the indicator variables (SAP1, SAP2, SAP3, SAP4) that compose the SAP latent variable, as depicted in Figure 2. The path connecting SR and SAP constituted the structural model.
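The grouping of indicators into measurement models can be written down as a simple mapping. The sketch below is illustrative only: it assumes that each indicator belongs to the construct whose name it extends (for example, SR41 belongs to SR4), and the variable names `measurement_models` and `structural_paths` are hypothetical rather than taken from the SmartPLS project used in the study.

```python
# Hypothetical specification of the eight measurement models.
# Assumption: indicator names encode construct membership (SR41 -> SR4).
measurement_models = {
    "SR1": ["SR11", "SR12"],
    "SR2": ["SR21", "SR22"],
    "SR3": ["SR31", "SR32"],
    "SR4": ["SR41", "SR42", "SR43", "SR44"],
    "SR5": ["SR51", "SR52"],
    "SR6": ["SR61", "SR62"],
    "SR7": ["SR71", "SR72"],
    "SAP": ["SAP1", "SAP2", "SAP3", "SAP4"],
}

# Structural model: a single path from the higher-order SR construct to SAP.
structural_paths = [("SR", "SAP")]

# Count the indicators across all measurement models.
n_indicators = sum(len(items) for items in measurement_models.values())
print(len(measurement_models), "measurement models,", n_indicators, "indicators")
```

Writing the specification this way makes it easy to check that every indicator listed in the text is assigned to exactly one construct.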

Reliability and Validity of the Measurement Model
This section addresses the reliability and validity of the constructs and the evaluation of each measurement model. As stated by [13], four assessments were used in evaluating the measurement model: Cronbach's alpha (α), rho_A and Composite Reliability (CR) for internal consistency; outer loadings for indicator reliability; Average Variance Extracted (AVE) for convergent validity; and cross-loadings and the Fornell-Larcker criterion for discriminant validity. The internal consistency, reliability, and validity results from the PLS-SEM analysis are shown in Table 2. The Cronbach's alpha (α) and CR values for the SR1, SR2, SR3, SR4, SR5, SR6, SR7, SR and SAP constructs are greater than 0.70, as displayed in Table 2. The indicator variables are therefore adequate to measure the corresponding constructs in each model. Furthermore, the outer loadings were greater than 0.70, which indicates that each indicator variable reflects its corresponding construct reliably, as stated by [14]. In addition, the AVE indicates the convergent validity of the SR and SAP constructs. AVE values higher than 0.50 mean that convergent validity has been achieved for each construct, as noted by [14].
Table 3 shows the cross-loadings of the indicator variables for the discriminant validity test. Discriminant validity indicates the degree to which the items used to measure one construct differ from those of the other constructs. The results show that the outer loading of each indicator on its related construct is greater than all of its loadings on other constructs (i.e. its cross-loadings). This reveals that the indicator variables in SR1, SR2, SR3, SR4, SR5, SR6, SR7, and SAP differ from each other by empirical standards, and therefore the measurement model shows adequate discriminant validity.
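The thresholds above can be illustrated with the usual formulas for composite reliability and AVE computed from standardized outer loadings. This is a minimal sketch, not the SmartPLS implementation: the function names are hypothetical and the loadings are made-up values, not those reported in Table 2.

```python
import numpy as np

def composite_reliability(loadings):
    """Composite reliability from standardized outer loadings."""
    lam = np.asarray(loadings, dtype=float)
    squared_sum = lam.sum() ** 2          # (sum of loadings)^2
    error_var = (1.0 - lam ** 2).sum()    # sum of indicator error variances
    return squared_sum / (squared_sum + error_var)

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized outer loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

# Hypothetical loadings for a four-indicator construct (not the study's values).
example_loadings = [0.80, 0.85, 0.75, 0.90]
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)

print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
print("CR > 0.70:", cr > 0.70, "| AVE > 0.50:", ave > 0.50)
```

Cronbach's alpha and rho_A are omitted here because they require the raw indicator data or the estimated weights rather than the loadings alone.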
For the Fornell-Larcker criterion, the square root of the AVE of each construct exceeded the correlations in the corresponding row and column, as shown in Table 3. It can therefore be concluded that discriminant validity was attained for all of the constructs.
Table 3. Fornell-Larcker criterion and cross-loading results of the model constructs
Based on the assessment criteria suggested by [12], the research confirms that the measurement model is appropriate.
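The Fornell-Larcker comparison can be sketched as a check that the square root of each construct's AVE exceeds its largest correlation with any other construct. The function name `fornell_larcker_ok` and the AVE and correlation values below are hypothetical and serve only to illustrate the rule.

```python
import numpy as np

def fornell_larcker_ok(ave, corr):
    """Return True if sqrt(AVE) of every construct exceeds its
    absolute correlations with all other constructs."""
    sqrt_ave = np.sqrt(np.asarray(ave, dtype=float))
    corr = np.asarray(corr, dtype=float)
    # Zero out the diagonal so only inter-construct correlations remain.
    off_diag = np.abs(corr - np.diag(np.diag(corr)))
    max_corr = off_diag.max(axis=1)
    return bool(np.all(sqrt_ave > max_corr))

# Hypothetical values for three constructs (not the study's results).
example_ave = [0.68, 0.72, 0.61]
example_corr = [
    [1.00, 0.55, 0.40],
    [0.55, 1.00, 0.62],
    [0.40, 0.62, 1.00],
]
passes = fornell_larcker_ok(example_ave, example_corr)
print("Discriminant validity (Fornell-Larcker):", passes)
```

Because the correlation matrix is symmetric, taking the row maximum covers both the row and the column comparison described in the text.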

The Evaluation of Structural Model
In assessing the structural model in PLS-SEM, [12] proposed that the collinearity among constructs, the significance of the path coefficients, the coefficient of determination (R²), the effect size (f²), and the predictive relevance (Q²) should be tested. However, the collinearity among constructs and the effect size f² were not assessed, because the structural model included only one exogenous latent construct (SR).

Testing the Significance of the Path Coefficient
The significance of the path coefficient was tested using the bootstrapping method. In this analysis, 500 bootstrap samples, as suggested by [12], were drawn from the 99 cases in the sample. The results of the PLS-SEM bootstrapping analysis are depicted in Figure 3.
The evaluation of the structural model reveals that SR and SAP have a significant positive relationship (β=.930, t(97)=67.189, p<.001). Furthermore, the bootstrapping analysis shown in Figure 3 confirmed that the path relationship is significant at p<.001.
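The idea behind the bootstrapped significance test can be sketched as follows. This is a simplified illustration under stated assumptions, not the SmartPLS procedure: it treats the construct scores as fixed and resamples cases, whereas a full PLS-SEM bootstrap re-estimates the whole model on every resample. The data are simulated, and the names `bootstrap_path`, `sr_scores` and `sap_scores` are hypothetical.

```python
import numpy as np

def bootstrap_path(x, y, n_boot=500, seed=0):
    """Bootstrap a standardized single-predictor path coefficient.

    With one predictor, the standardized coefficient equals the
    Pearson correlation between the two sets of construct scores.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    beta = np.corrcoef(x, y)[0, 1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    se = boot.std(ddof=1)                 # bootstrap standard error
    return beta, se, beta / se            # coefficient, SE, t-value

# Simulated construct scores for 99 cases (not the study's data).
rng = np.random.default_rng(42)
sr_scores = rng.normal(size=99)
sap_scores = 0.9 * sr_scores + 0.4 * rng.normal(size=99)

beta, se, t_value = bootstrap_path(sr_scores, sap_scores)
print(f"beta = {beta:.3f}, SE = {se:.4f}, t = {t_value:.2f}")
```

The t-value is the original estimate divided by the standard deviation of the bootstrap estimates, which is why a strong path with little sampling variability produces a very large t-value.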

The Coefficient of Determination (R²)
The proportion of variation in SAP explained by the model was assessed through R² and its significance. The analysis indicates that the R² for SAP is significant, as seen in Figure 3 (R²=0.865, t(97)=67.189, p<.001), which implies that 86.5% of the variation in SAP is explained by SR. It can be inferred that the exogenous construct (SR) explains the variation in SAP relatively well.
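As a consistency check, and assuming standardized construct scores as is usual in PLS-SEM, a structural model with a single exogenous construct has a coefficient of determination equal to the squared path coefficient. Using the reported path coefficient:

```latex
R^2_{\mathrm{SAP}} = \beta_{\mathrm{SR} \to \mathrm{SAP}}^{2} = (0.930)^2 = 0.8649 \approx 0.865
```

This agrees with the reported value of 0.865.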

Blindfolding Technique for the Model's Predictive Relevance of Students' Academic Performance
The blindfolding technique was used to assess the predictive relevance (Q²) of the model. According to [15], Q² indicates how well the model and its parameter estimates reconstruct the indicator variables. As stated by [12], a Q² value greater than zero indicates predictive relevance. Table 4 shows the blindfolding results for the test of predictive relevance. In this analysis, the Q² values for both cross-validated redundancy and cross-validated communality are greater than zero. In conclusion, the structural model is able to predict the endogenous construct (SAP); that is, SR has predictive relevance for SAP.
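The Stone-Geisser statistic reported by blindfolding is commonly computed as one minus the ratio of the sum of squared prediction errors (SSE) to the sum of squared observations (SSO) over the omitted data points. The sketch below shows only this final step; the function name `q_squared` and the SSE and SSO values are hypothetical and are not the values in Table 4.

```python
def q_squared(sse, sso):
    """Stone-Geisser Q^2 = 1 - (total SSE / total SSO)."""
    total_sse = sum(sse)
    total_sso = sum(sso)
    if total_sso <= 0:
        raise ValueError("SSO must be positive")
    return 1.0 - total_sse / total_sso

# Hypothetical per-indicator sums for the four SAP indicators
# (illustrative values only, not the study's results).
example_sse = [40.0, 35.0, 45.0, 38.0]
example_sso = [99.0, 99.0, 99.0, 99.0]

q2 = q_squared(example_sse, example_sso)
print(f"Q2 = {q2:.3f}; predictive relevance: {q2 > 0}")
```

A value above zero means the model predicts the omitted data points better than simply using the indicator means.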

Conclusions
The research developed and evaluated a statistical reasoning model and its relationship with the students' academic performance. The findings indicate that the relationship between statistical reasoning and students' academic performance is statistically significant. These results are supported by other research, namely [8], [9] and [10]. In addition to the statistical reasoning and students' academic performance model, the study also verified all of the indicator variables in the statistical reasoning and students' academic performance constructs by using PLS-SEM. The findings indicate that the empirical evidence, in line with the preceding findings and the theoretical context, strengthened the established model. Since the analysis is exploratory in nature, owing to the limited sample size that prompted the use of PLS-SEM instead of SEM-AMOS, further research can be carried out on larger data sets to strengthen the model.