Development of Problem-Solving Efficacy Scales in Mathematics

The capacity of self-efficacy to promote students' better performance in mathematics highlights the importance of measuring and ensuring mathematics problem-solving efficacy among students. Despite a number of existing questionnaires on problem-solving efficacy, however, a valid and reliable scale customized to the context of Filipino students enrolled in mathematics classes has yet to be developed and validated. Hence, this research attempts to facilitate the academic achievement of students enrolled in the Mathematics in the Modern World course in Philippine higher education institutions by developing the Mathematics Problem-Solving Efficacy Scales (MPES) using exploratory factor analysis (EFA). The study met all the assumptions of EFA (e.g., sample size, ratio of respondents to variables, factorability of the data set) and strictly followed its procedures (e.g., parallel analysis, choosing the right rotation). The results partly agreed with Bandura's self-efficacy theory in that mathematics problem-solving efficacy had four factors, namely, verbal persuasions, somatic responses, vicarious experiences, and mastery experiences. However, unlike previous findings, the current study revealed that verbal persuasions (R² = 23.65%) and somatic responses (R² = 20.0%) explained a much greater amount of variance than vicarious experiences (R² = 9.78%) and mastery experiences (R² = 4.77%). The four-factor solution satisfactorily accounts for 58.19% of the total variance. Similarly, the Cronbach alpha reliability coefficients of the factors ranged from .717 to .925. These findings suggest that the developed MPES, consisting of four subscales, is both structurally valid and reliable.


Introduction
Self-efficacy is people's belief in their capabilities to produce a desired performance that has a potential effect on their lives [1]. It is people's optimistic self-belief in their competence and chances of accomplishing a task and producing a favorable outcome [2]. Self-efficacy beliefs influence people's choices [3], [4], the effort they exert [4], [5], and their persistence amidst challenges [4], [5]. Students with high efficacy beliefs tend to apply more effective strategies, react to environmental demands more suitably [3], are more eager to participate in challenging tasks, work harder, and persist longer [4], [5].
Bandura [1], [6] theorized that efficacy comes from four sources: (1) mastery experiences, (2) vicarious experiences, (3) verbal/social persuasions, and (4) somatic reactions. Mastery experiences are the interpreted results of purposive performance. According to Pajares [7], every learning situation culminates in a "mastery experience," which gives students the opportunity to apply concepts and prove that they have learned. Research affirms that mastery experience is the strongest source of self-efficacy [1], [8]. Among primary school students, mastery experiences accounted for 36.7% of the total variance in efficacy beliefs of learning performance [8], while among high school students in a chemistry subject, mastery experiences explained 50% of the variance in chemistry self-efficacy for cognitive skills [9].
Vicarious experience concerns the effects of others' actions on individuals' self-beliefs. It is sometimes called modeling, which refers to the constructive effects on one's efficacy beliefs when other people succeed [10]. Previous studies used different forms of modeling, such as cognitive modeling [11], confident and pessimistic modeling [12], coping and peer mastery modeling [13], and self-modeling [14], [15], and found that models are an important source of self-efficacy. Similarly, a more recent study showed that vicarious experience is the third strongest of the three significant predictors of self-efficacy [8].
Social persuasion refers to the "social messages" that one receives from others. People who are persuaded verbally tend to achieve or master a task [1]. Positive feedback that students receive from teachers is a very good source of self-efficacy [16], and meaningful interactions between the teacher and the students can develop students' self-efficacy beliefs [17]. Verbal persuasion has been found to be the second strongest predictor of self-efficacy [8]. This capacity of verbal support to shape a person's self-belief encourages teachers to focus instruction on enhancing students' self-efficacy [18].
Lastly, physiological states refer to anxiety, arousal, fatigue, mood, and stress. According to Bandura [1], people rely partly on their somatic and emotional states in judging their capabilities. Pajares [7] argues that stress, anxiety, worry, and fear have destructive effects on self-efficacy; eventually, these emotional states make people believe that they can never successfully perform the feared task. Some research found that physiological states are a significant predictor of self-efficacy [19], [20], [21], while others reported that they have the weakest correlation with self-efficacy [8] and are not a significant predictor of it [8], [22], [23].
A growing body of research supports the positive effect of self-efficacy on academic performance, particularly in the field of mathematics. Research found that mathematics self-efficacy directly influences mathematics performance [4], [24], [25], [26], [27], [28], [29], mediates the relationship between epistemological beliefs and mathematics achievement [30], positively influences task value and mastery goals but is negatively associated with performance-avoidance [25], diminishes achievement gaps [26], and predicts graduation when prior performance and aptitude are controlled [31]. Research has affirmed that academic self-efficacy is the single strongest predictor of academic performance and achievement [32], [33]; i.e., students with higher mathematics self-efficacy tend to have higher mathematics achievement [27].
The ability of self-efficacy to promote mathematics achievement suggests the importance of ensuring high self-efficacy among students, particularly in problem-solving [6], [34]. The available self-efficacy instruments in the literature, however, focus on mathematics in general (e.g., the Mathematics Self-Efficacy Scales in Nielsen and Moore [35]), mathematics and metacognition (e.g., the Mathematics Self-Efficacy and Metacognition Inventory in Jaafar and Ayub [24]), or mathematics teaching (e.g., the Mathematics Teaching Efficacy Beliefs Instrument in Cetinkaya and Erbas [36] and Kieftenbeld, Natesan, and Eddy [37]). Hence, it is essential to develop one that is customized to self-efficacy in mathematics problem-solving.
Specifically, the instrument to be developed should be normed among college students enrolled in mathematics courses in Philippine higher education institutions (HEIs). The transition of basic education to the K-12 program has pushed HEIs to update and upgrade their major courses. For degree programs offering mathematics, for instance, the complexity and relevance of the mathematics courses were increased. Thus, HEIs should have sufficient mechanisms to facilitate students' success in mathematics amidst the challenges they would encounter in the said courses. One of the most powerful ways to ensure successful mathematics learning, according to the above-mentioned research studies [24], [25], [27], [32], [33], is to focus instructional and research efforts on enhancing students' self-efficacy in mathematics, especially in problem-solving.

Research Objectives
The current research aims to develop Likert Scales that can be used to measure self-efficacy in mathematics problem-solving. Specifically, the study hopes to establish the structural validity and reliability of the scales.

Research Design
This study used an exploratory factor analytic design to develop and validate Likert scales for measuring self-efficacy in mathematics problem-solving. Factor analysis is a multivariate statistical procedure commonly used in fields like psychology and education for the development, evaluation, and refinement of tests, scales, and measures [38], [39]. Exploratory factor analysis (EFA), as an approach to factor analysis, is a data reduction technique used to explore the main factors and generate a theory from a set of latent constructs [38], [39], [40], [41]. In the current study, EFA was used to reduce the 80 items measuring mathematics problem-solving efficacy to a statistically manageable and interpretable number of factors, to assess the structure of those factors, and to establish the construct validity and reliability of the scales.

Respondents
The study was conducted in higher education institutions (HEIs) in the province of Isabela, Philippines, particularly among first-year students pursuing teacher education degrees and enrolled in the Mathematics in the Modern World subject during the first semester of school year 2019-2020. Cluster sampling was done to select four HEIs, and all the teacher education students from these four HEIs were selected as respondents. However, only a total of 300 first-year students completed and returned the questionnaires: 95 Bachelor of Elementary Education (BEEd) students from the first HEI, 31 BEEd and 34 Bachelor of Secondary Education (BSEd) students from the second, 50 BSEd students from the third, and 90 BSEd students from the fourth.

Scales Preparation and Pilot Testing
A literature review was conducted to determine the sources of self-efficacy. The following major sources were identified: mastery experiences, vicarious experiences, social persuasion, and physiological feedback. The researchers wrote 80 items tailored to self-efficacy in mathematics problem-solving. These 80 latent variables were constructed as 5-point Likert scales ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). The content of each variable was closely evaluated against the construct to ensure that it really measures self-efficacy in mathematics problem-solving. Some of the variables included were (item 7) "I am certain I can solve Mathematics problems because I was able to master the skills in problem-solving", (item 22) "Seeing someone successfully solve a problem in Mathematics affects my eagerness to do the same", (item 75) "I get a feeling of discomfort while solving a problem in Mathematics", and (item 26) "The works of others really affect my eagerness to solve Math problems successfully". Moreover, to check the clarity of the items and the directions, the instrument was pilot tested with a group of 30 first-year students enrolled in the BSEd program in one of the four HEIs. Students were not given oral directions but were encouraged to ask for clarifications so that we could fairly assess which parts of the instrument needed revision. During the testing, however, nobody asked questions, and the students affirmed that the instrument was clear and understandable. Considering these results, the same questionnaire was reproduced and administered to the target respondents to gather the research data.

Data Gathering
Though the current study was recognized and funded by the university where the research was conducted, we still sought consent from the authorities and the target respondents to ascertain that the ethical aspects of the research were considered. Only the students who were willing to participate served as respondents. We informed them of the nature of their participation and assured them of confidentiality both for the data gathered and for the future publication of the study. We gave the respondents 15 to 20 minutes to respond to the 80-item questionnaire. To ascertain that they understood what to do and what they were responding to, we encouraged them to ask questions and seek clarifications. Some students asked us to translate item 80, "I fidget while solving problems in Mathematics", since they did not understand the word "fidget"; we then translated the word into Tagalog and associated it with a feeling of nervousness. This was the only item questioned by the students when we administered the instrument.

Data Analysis
The data were tallied, coded, and analyzed using the Statistical Package for the Social Sciences (SPSS). Preliminary analyses were conducted to check for missing data and to screen the data set. Negatively stated items (items 61 to 80) were recoded using the SPSS transform variable command (recoded into different variables: 5 = 1; 4 = 2; 3 = 3; 2 = 4; 1 = 5). The correlation matrix, Bartlett's test, and the KMO measure of sampling adequacy were used to check the factorability of the data. Parallel analysis using MontecarloPA (where No. of Variables = 80, No. of Subjects = 300, and No. of Replications = 100) was done to accurately determine the number of factors to retain for the succeeding factor analyses. The Component Correlation Matrix was requested using Direct Oblimin rotation to check the correlations between the retained components and identify the appropriate rotation (oblique or orthogonal). A series of EFAs (using the right rotation) was performed to establish the construct validity of the scales, while Cronbach's alpha reliability analysis was done to measure the extent to which the scales measure the same underlying construct.
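The reverse-coding rule above (5 = 1, 4 = 2, 3 = 3, 2 = 4, 1 = 5) is a simple linear transform; a minimal Python sketch, using hypothetical raw scores for illustration:

```python
def reverse_code(score, points=5):
    """Reverse-code a Likert response: on a 5-point scale, 5 -> 1, 4 -> 2, 3 -> 3, etc."""
    return points + 1 - score

# Hypothetical raw responses for negatively stated items (items 61-80 in this study)
raw = {61: 5, 75: 2, 80: 4}
recoded = {item: reverse_code(score) for item, score in raw.items()}
# recoded -> {61: 1, 75: 4, 80: 2}
```

The same formula (points + 1 - score) generalizes to any Likert range, e.g. a 7-point scale.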

Results and Discussion
There have been debates in the literature regarding the minimum sample size for factor analysis. Tabachnick and Fidell [41] argue that a sample size of at least 300 is good for factor analysis. Pallant [40], on the other hand, suggests at least 150 samples, while recently published research claims that the sample for EFA should be 100 or more [42]. Clearly, the sample size of 300 in the current study meets the sample size requirement.
The correlation matrix, the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, and Bartlett's test of sphericity were requested to check the factorability of the data set. The correlation matrix showed correlations ranging from .00 to .50. The KMO value was .890, while Bartlett's test result of 12624 was statistically significant at .01 (refer to Table 1). Research asserts that the correlations between at least some of the variables should be .30 or higher [40], that the KMO value should be at least .60 [41], and that Bartlett's test should be significant at .05 [40], [42]. The current data set, therefore, is factorable.

Pallant [40] affirms that among the three techniques used to determine the number of factors to retain (the Kaiser criterion, the scree plot, and parallel analysis), parallel analysis is the most accurate and should therefore be reported. Hence, a Monte Carlo parallel analysis was used to fairly determine the number of component factors to retain [40]. The number of variables entered was 80, the number of subjects was 300, and the number of replications was set at 100. The actual eigenvalues and the random eigenvalues of the components were then compared; a component factor was retained when its actual eigenvalue exceeded its corresponding random eigenvalue [38]. The results, given in Table 2, showed four (4) components to retain.

To determine the right rotation to use, EFA (retaining four components) with a Direct Oblimin rotation was requested. Several authors recommend this step to check the correlations between the retained components, which serve as a fair basis for deciding whether the construct is best interpreted using an orthogonal or an oblique rotation [40], [41], [43], [44], [45]. Tabachnick and Fidell [41] suggest that when the correlations, regardless of direction, are less than .32, an orthogonal rotation is appropriate; correlations of .32 or greater warrant an oblique rotation. The low correlations between the components in the current study (as shown in Table 3) thus permit the use of Varimax (orthogonal) rotation. According to Pallant [40], a factor loading of .40 is quite strong. Hence, EFA was conducted with suppression of variables whose factor loadings were less than .40. The results retained 32 variables for Component 1 with factor loadings from .416 to .790; 20 variables for Component 2 with factor loadings from .535 to .804; 9 variables for Component 3 with factor loadings from .409 to .556; and 7 variables for Component 4 with factor loadings from .521 to .742.
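For readers who wish to reproduce the rotation step outside SPSS, the Varimax criterion can be implemented directly. The following is a generic numpy sketch of Kaiser's varimax rotation (an illustration of the standard algorithm, not the authors' SPSS procedure); it orthogonally rotates a loading matrix so that each factor has a few large and many near-zero loadings:

```python
import numpy as np

def varimax(loadings, tol=1e-6, max_iter=100):
    """Orthogonally rotate a (variables x factors) loading matrix
    using the Kaiser varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion, solved via SVD
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        new_crit = s.sum()
        if new_crit < crit * (1 + tol):
            break
        crit = new_crit
    return loadings @ rotation
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across factors is simplified, which is what makes the rotated solution easier to interpret.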
Moreover, research suggests that the ratio of the number of respondents to the number of variables is also a necessary assumption of EFA. Tabachnick and Fidell [41] suggest a ratio of 5:1, while Nunnally [46] recommends 10:1. Comrey and Lee [47] propose that each variable should have at least 5 to 10 respondents. Rahn [48], on the other hand, recommends at least 3 variables per component, each with a high factor loading (usually at least .40).
Considering these suggestions, we reduced the number of variables to 30 to meet the ideal ratio of 10:1 (300 respondents to 30 variables) and maintained the 5 to 10 variables with the highest factor loadings per component. These factor loadings are shown in Table 4.
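The two factorability checks used in both analysis stages (KMO and Bartlett's test of sphericity) can be computed from the correlation matrix alone. Below is a generic numpy sketch of the standard formulas (an illustration, not the SPSS output used in the study):

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity for a p x p correlation matrix R
    computed from n respondents; returns (chi-square, degrees of freedom)."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return chi2, df

def kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy: squared
    correlations relative to squared correlations plus squared
    partial (anti-image) correlations."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)     # partial correlations
    np.fill_diagonal(partial, 0.0)
    r = R - np.eye(R.shape[0])          # off-diagonal correlations only
    return (r ** 2).sum() / ((r ** 2).sum() + (partial ** 2).sum())
```

A KMO value near 1 indicates that the variables share enough common variance to be factorable, which is why .60 is commonly cited as the minimum.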

Analysis Stage (with 30 variables retained)
The reduction of the variables called for a re-assessment of the previous results; thus, we re-examined the factorability of the data. As can be gleaned from Table 5, the data set, with a KMO value (.901) greater than .60 and a Bartlett's test result (4464) significant at .05, remained factorable [40], [41], [43]. Parallel analysis was re-run to check the number of components to retain (Table 6); the results still suggested a four-factor solution. PCA (4 components) with Direct Oblimin rotation showed that three of the four components are independent of one another (r < .32), which affirmed the appropriateness of an orthogonal rotation. A factor analysis with Varimax rotation was then requested. The results confirmed the previous findings: 10 variables remained in each of Component 1 and Component 2, while 5 variables remained in each of Component 3 and Component 4 (refer to Table 8). The factor loadings of the items in each component ranged from .713 to .811, .720 to .813, .659 to .804, and .476 to .748, respectively.
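The parallel-analysis decision rule applied in both stages, namely, retain a component only when its observed eigenvalue exceeds the mean eigenvalue obtained from random data of the same dimensions, can be sketched generically in numpy. This mirrors the logic of Horn's parallel analysis as described above, not the exact implementation of the Monte Carlo program used in the study:

```python
import numpy as np

def parallel_analysis(data, n_replications=100, seed=0):
    """Horn's parallel analysis: compare the eigenvalues of the observed
    correlation matrix with mean eigenvalues from random normal data of
    the same shape; return the number of components to retain."""
    n, p = data.shape
    rng = np.random.default_rng(seed)
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    random_mean = np.zeros(p)
    for _ in range(n_replications):
        noise = rng.normal(size=(n, p))
        random_mean += np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    random_mean /= n_replications
    return int((observed > random_mean).sum())
```

On data simulated with a clear latent structure, the retained count matches the number of simulated factors, which is why the technique is regarded as more accurate than the Kaiser criterion or the scree plot.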
Finally, the last step of the EFA was to analyze the theoretical meaning of the components. Research asserts that the meaning of the components depends on how the researcher defines them [49], but they should be labeled properly based on the theoretical meaning reflected by their items [38]. Hence, we interpreted the components as follows: (1) Component 1 measures the extent to which the verbal persuasions students receive from other people increase their self-esteem; (2) Component 2 assesses students' feelings of anxiety, fatigue, stress, and discomfort when solving a problem in mathematics; (3) Component 3 appraises the extent to which students' self-esteem is affected by the successful work of other people; and (4) Component 4 gauges the extent to which students' mastery of mathematical concepts influences their confidence in solving mathematics problems. Thus, the components (scales) were descriptively labeled Social Persuasions (Component 1), Physiological/Somatic Responses (Component 2), Vicarious Experiences (Component 3), and Mastery Experiences (Component 4). The four-factor solution explained a total of 58.19% of the variance, with the four components contributing 23.65%, 20.0%, 9.78%, and 4.77% of the variance, respectively (refer to Table 9). Hair et al. [42] explain that information in the social sciences is often less precise; thus, a solution that accounts for 60 percent of the total variance, or even less, is considered satisfactory. Similarly, Merenda [50] asserts that the rule of thumb is to have a variance accounted for of at least .50. Hence, the four-factor solution with a cumulative variance explained of 58.19% can be considered satisfactory.
Moreover, Cronbach's alpha reliability analysis was conducted to estimate the degree to which the instrument measures the same underlying constructs. Based on the results, the reliability coefficients of the four scale components ranged from .717 to .923. According to Pallant [40], values above .70 are acceptable. This implies that all the scales in this study are reliable.
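Cronbach's alpha itself is a short computation: alpha = k/(k-1) × (1 − Σ item variances / variance of the total score). A generic numpy sketch over a respondents-by-items score matrix (illustrative only, not the study's data):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)
```

Items that covary strongly push alpha toward 1, while uncorrelated items push it toward 0, which is why alpha is read as internal consistency within a subscale.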

Conclusions
The study discusses the development of an instrument used to measure problem-solving efficacy in mathematics. The development was done rigorously, ensuring that all the requirements of EFA (e.g., sample size, the ratio of respondents to variables, the correlation matrix, the KMO measure, and Bartlett's test) were met and all the steps were strictly followed. The results showed four factors of problem-solving efficacy, namely, mastery experiences, social persuasions, physiological/somatic responses, and vicarious experiences. This supports the theory of Bandura [1], [6] and the finding of Lent, Lopez, Brown, and Gore [51] that a four-factor model of self-efficacy has a good fit for college students.
The current study showed that social persuasions (10 items; R² = 23.64%) and physiological/somatic responses (10 items; R² = 20%) explain a larger amount of variance in students' efficacy in solving mathematics problems than vicarious experiences (5 items; R² = 9.78%) and mastery experiences (5 items; R² = 4.77%). This result is not consistent with previous findings that the strongest predictors of self-efficacy are mastery experience and vicarious experience [1], [10] and that the weakest is physiological/somatic responses [1], [8]. That is, in the context of the student respondents, self-efficacy in mathematics problem-solving depends more on the positive feedback they receive from teachers and peers and on their negative somatic responses toward mathematics problem-solving. This finding, however, needs further quantitative study to determine the predictive ability of the four factors on self-efficacy.
Hence, a multiple regression analysis is suggested to examine the effects of mastery experience, vicarious experience, social persuasion, and somatic responses on self-efficacy in mathematics problem-solving and achievement. Moreover, the Mathematics Problem-Solving Efficacy Scales (MPES) developed in this study attained construct validity and reliability; hence, they can already be used to measure how self-efficacious students are in solving problems in their mathematics subjects. Future research may adopt the scales for quantitative studies, and the scales may be subjected to further analysis such as Structural Equation Modeling (SEM).