Design and Initial Validation of the Scales Epistemology, Methodology and Gender in Taught History (EMG), and Women in History (WH) for the Evaluation of Gender Competence of Social Science Teachers in Training

In recent decades, studies oriented to the diagnostic evaluation of the inclusion of the gender perspective in the teaching of history, and of the degree of acquisition of gender competence in initial teacher training, have frequently applied instruments built ad hoc. However, work dedicated to the analysis and validation of the metric properties of these instruments, with the specific purpose of guaranteeing the validity of the conclusions drawn from the data obtained, has been less common. In response to this gap, the present study describes the initial validation of the scales Epistemology, methodology and gender in taught history (EMG) and Women in history (WH) in a sample of primary and secondary school trainee teachers from four Spanish public universities. To this end, the reliability of each scale, its content validity and its construct validity are studied, the latter by means of an exploratory factor analysis with principal components extraction and varimax rotation. The results support the general viability of both scales for the evaluation of the gender competence of trainee teachers and for the analysis of their social representations of the place of women in taught history. The study also shows the functional interdependence between the scales and their robustness for joint application in the general assessment of gender as a category of analysis in social science teaching.


Introduction
The inclusion of the gender perspective and its strategic mainstreaming for equality (gender mainstreaming) in the teaching of history is one of the most recognized teaching and research concerns in the scientific field of social science teaching. This perspective, understood as a conceptual and methodological framework for the analysis of the causes and consequences of biases, inequalities and social differentiation based on sex, considers stereotyping, prejudice and the allocation of social roles in the deconstruction of "desirable gender models" [1], which are determining factors in the hegemonic duality of identity, and are directed towards the recognition of social plurality as a principle for the formation of multiple and diverse gender identities [2].
The incorporation of gender as a category of analysis in the teaching of the social sciences [3,4] makes it possible to recognize how gender stereotypes operate in social relations and patterns of behaviour, channelled into discourses and teaching practices. In this sense, the school is presented as a priority center for the transmission of codes, imaginaries, and social rules that determine the ways of thinking and of being thought of in the society in which one lives. The educational inclusion of this perspective moves away from the traditional heteronormativity of sex and gender, and from the socializing reproduction of social and cultural references that seek to explain a certain society. Its aim is to inform in order to form, and to train in order to raise awareness and commit to the eradication of gender inequalities [5].
The uncritical maintenance of the androcentric bases in the history taught consolidates the invisibility of women as social agents, and the unequal hegemony of dichotomous social and identity values and models [6]. These deterministic bases of the social prevent an integral and plural understanding of historical societies and of the way in which social knowledge and its relativity are constructed, which is often absent in the textual and iconographic narrative discourses of textbooks and teaching materials [5,7,8].
Very little recent research has focused on the study of social representations and the effectiveness of teacher training programmes on the inclusion of gender as a category of analysis in the teaching of history, both in the field of teacher training [6,9–11] and of in-service teachers [12]. From the principles of education for global citizenship, the inclusion of this socially and historically determined concept should promote the visibility of women, and avoid the reproduction of androcentric structures in the training of history and social science teachers, which is still weak in programmes and teaching practices [13].
Despite the activity of institutional commissions, units or observatories, and the efforts to incorporate the gender perspective in the various stages of education [6,14–16], the resistance of the university institution to incorporating specific subjects designed from a gender perspective into its curricula seems to be confirmed. These limitations result in the professionalization of students without the necessary theoretical and practical instruments in the area of gender to develop their own programs [9,17].
In this context, this research aims to describe the first phase of validation of the scales Epistemology, methodology and gender in taught history (EMG) and Women in history (WH), and to analyze their interdependence. Both scales are aimed at the analysis of the social representations of teachers in training on the relevance of gender mainstreaming in the teaching and learning of history, and the consideration of the concept of gender as a category of analysis in the teaching of social sciences. In a complementary way, the research treats the age of the students and the hours of training they have received in gender, coeducation and education for equality as conditioning factors of their social representations and of their knowledge about women in history, and analyzes the possible dependency relations between these study variables.

Participants
The participating sample was composed of a total of 22 women (66.7%) and 11 men (33.3%), with an average age of 30.06 years (SD = 10.69), enrolled in the Bachelor's Degree in Primary Education (18.2%), in the Master's Degree in Secondary Education (78.8%), and in other official studies (3%), from four Spanish public universities and other institutions: University of Burgos (75.8%), University of Murcia (15.2%), University of Valladolid (3%), University of Alicante (3%) and other institutions (3%). Non-probabilistic convenience sampling was used [18], following intentional criteria based on the adequacy of the sample to the research objectives.

Instruments
The first scale (EMG), constructed ad hoc, is composed of 21 items with five ordinal response alternatives from 1 to 5, where 1 expresses total disagreement with the proposed statement and 5 the maximum degree of agreement. Its design contemplates four dimensions of analysis: Dimension 1 (D1 / 4 items), inclusion of the gender perspective and women's social experience in the teaching of history; Dimension 2 (D2 / 5 items), description and analysis of female social roles and social visibility of women in the teaching of history; Dimension 3 (D3 / 6 items), scientific methodology and historical sources in the construction of historical knowledge; and Dimension 4 (D4 / 6 items), curricular problematization of the teaching of history, social transformation and development of social thinking skills (Table 1).
The second scale (WH) is constructed from 15 brief descriptive definitions of women's history and 20 possible responses attributable to each statement. This scale asked the participating students to connect a feminine proper name with an event or contribution to the history of humanity. Its design contemplates three dimensions of analysis: Dimension 1 (D1 / 4 items), political leadership; Dimension 2 (D2 / 7 items), political-cultural leadership; and Dimension 3 (D3 / 4 items), cultural leadership (Table 2).

Table 1. Items of the EMG scale by dimension.
1. (D3) The historical narrative is the product of the critical application of a systematic process of research.
2. (D2) The integration of women's social roles into the historical narrative is essential.
3. (D3) Knowledge of the historical method is fundamental for historians.
4. (D3) History is the reasonable reconstruction of past events on the comparative basis of available evidence.
5. (D2) The participation of women in relevant historical events is anecdotal.
6. (D3) It's absurd for students to study the historical method.
7. (D4) The teaching of history should focus on the events and/or social problems of the past.
8. (D4) History constitutes an objective and contrasting narrative set of dates, facts, events and social actors.
9. (D2) One of the responsibilities of history is to make all social and historical actors visible.
10. (D2) The description of women's social roles must be added as specific content to the political-institutional narrative or story.
11. (D3) History is the interpretative reconstruction of past events, based on sometimes contradictory evidence.
12. (D4) It is unacceptable to address current social issues in history classes.
13. (D4) Training for the exercise of participatory democracy should be integrated into the curriculum of the history.
14. (D2) The explanation for the scarce presence of women in history lies in their great invisibility in the available historical sources.
15. (D3) Any historical narrative or story is useful to approach the past.
16. (D1) The study of gender relations constitutes a valid category for the analysis and understanding of historical societies.
17. (D4) Knowledge of history is essential for social transformation.
18. (D1) Women's historical experience lacks any scientific interest.
19. (D1) Political and military leaders, mostly men, are the true builders of national history.
20. (D4) The didactic treatment of social problems favours the development of social and historical thinking skills in students.
21. (D1) Addressing gender inequality in the teaching of history is not necessary.

Design and Procedure
The research follows a non-experimental design of a cross-sectional and exploratory nature [18,20], as its purpose is to begin to understand a set of variables (scales) through their initial exploration at a specific moment. In order to define the concepts that would specify each of the study variables, the construction of the scales was based on a review of the scientific literature and the available empirical background, in particular the studies of Ortega-Sánchez [6], and Ortega-Sánchez and Pagès [11] for the EMG scale. The WH scale consisted of the literal translation into Spanish of the second part of the knowledge survey applied in Crocco's study [19].
Once the EMG scale was designed, the researchers reformulated, in a second research stage, those items that could affect the accuracy of the ratings because they contained more than one content or aspect of analysis in the same item, and specified the dimensions or factors underlying each scale.
Finally, a pilot test was applied to a group of students (n = 33), enrolled in four Spanish public universities, who share characteristics with the population with which the researchers expect to work. The objective of this application was to test the measurement instruments by evaluating their construction and wording, and the participants' understanding of each item.

Data Analysis
In order to determine their accuracy, stability and consistency, the EMG and WH scales are evaluated, as a priority, through measures of reliability and internal consistency (Cronbach's alpha and McDonald's omega) [21], and through the split-half method, in order to check the degree of correlation between the halves of each scale. Likewise, the linear correlation between each item and the total score of each scale is measured, for the purpose of identifying those items that should be reconsidered or eliminated.
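The reliability measures named above can be illustrated with a minimal sketch. It assumes a small, hypothetical response matrix (`rows`, one list of item scores per respondent) and uses the equal-length Spearman-Brown correction; the study itself reports the unequal-length coefficient from SPSS, which is computed differently, and McDonald's omega is not covered here because it requires a fitted factor model.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def split_half_spearman_brown(rows):
    """Correlate first-half and second-half sums, then apply the
    equal-length Spearman-Brown correction 2r / (1 + r)."""
    half = (len(rows[0]) + 1) // 2
    first = [sum(r[:half]) for r in rows]
    second = [sum(r[half:]) for r in rows]
    r = pearson(first, second)
    return 2 * r / (1 + r)

def corrected_item_total(rows, j):
    """Correlation between item j and the sum of all other items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)

# Hypothetical data: 5 respondents x 4 Likert items (1 to 5).
rows = [
    [4, 5, 4, 5],
    [2, 3, 2, 3],
    [3, 3, 4, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(rows)
sb = split_half_spearman_brown(rows)
item_totals = [corrected_item_total(rows, j) for j in range(len(rows[0]))]
```

Items whose corrected item-total correlation falls below a chosen threshold (.30 is a common rule of thumb) would be the candidates for review.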
In order to explain the theoretical-empirical model of the constructs of the two scales, their validity is studied by means of exploratory factor analysis (EFA) with principal components extraction and varimax rotation with Kaiser normalization, omitting loadings lower than .4. Likewise, with the objective of analyzing the agreement between attributes of the variables contemplated in the EMG scale, we calculated Kendall's coefficient of concordance [22] from the evaluations expressed by three independent expert judges in Didactics of History on a range of 1 to 4 points. For the evaluation of these attributes, the instrument validation format contained in Corral's study [23] is adapted. It consists of four general criteria (Cg), applicable to the whole scale, and five specific criteria (Ce), applicable to each item: Cg1. The instrument contains clear and precise instructions for answering the questionnaire; Cg2. The items allow the objective of the research to be achieved; Cg3. The items are distributed in a logical and sequential way; Cg4. The number of items is sufficient to collect the necessary information; Ce1. Clarity in the wording; Ce2. Internal coherence; Ce3. Response induction (bias); Ce4. Language appropriate to the level of the informant; Ce5. It measures what it claims to measure.
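As an illustration of the concordance analysis, the following sketch computes Kendall's W for a set of judges who rate the same objects. The ratings are hypothetical, and the tie correction is an assumption: the text does not state whether a tie-corrected version was used, but ratings on a 1 to 4 scale will usually contain ties.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kendalls_w(ratings):
    """Kendall's W with tie correction.

    ratings: one list per judge, each holding scores for the same n objects.
    Undefined (division by zero) if every judge gives all objects the same score.
    """
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(r) for r in ratings]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    ties = sum(r.count(v) ** 3 - r.count(v) for r in ratings for v in set(r))
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)

# Hypothetical scores from 3 judges on 5 criteria (1 to 4 points).
judges = [
    [4, 3, 4, 2, 3],
    [4, 3, 3, 2, 3],
    [3, 4, 4, 2, 3],
]
w = kendalls_w(judges)
```

W ranges from 0 (no agreement) to 1 (identical rankings across judges).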
Finally, in order to analyse the possible dependency relations between the age of the participants and the number of hours of training received on gender, coeducation and education for equality, the degrees of interdependence between variables and scales are analysed.
For data processing and analysis, we use the SPSS v.24 statistical package. In the process of collecting and handling data, participants were guaranteed anonymity of their responses and their subsequent processing.

Analysis of Reliability and Internal Consistency of the EMG Scale
The reliability analysis of the EMG scale returns a questionable level according to Cronbach's alpha (α = .68) and an adequate level according to McDonald's omega (ω = .77) for the set of variables. The unequal-length Spearman-Brown coefficient shows a moderate positive correlation between parts (r = .432). The reliability of the scale is therefore cautiously confirmed.
The linear item-total correlation coefficients are greater than .30 in 9 of the 21 items (r ≥ .39) and lower than this value in 12 items (r ≤ .27). These results indicate the need to review the operation and/or approach of items 1, 2, 4, 5, 7, 8, 9, 10, 11, 14, 15, and 16. In this sense, the elimination of two of these items (4 and 7) would clearly improve the reliability of the scale (α = .74; ω = .81) (Table 3).

Content Validity
The results of the validity analysis show the concordance and homogeneity among the answers expressed by the expert judges. The homogeneity of the responses in the assessments of each attribute is confirmed by a high and significant index of concordance (Wt) between the ranks assigned by the expert judges (Table 4). These values are based, for the most part, on satisfactory scores of 3 and 4 points for both the general criteria (Me ≥ 3, SD = .000) and the specific criteria contained in the validation instrument. However, the unfavourable assessments of specific criteria 1 (clarity of wording), 2 (internal coherence) and 5 (accuracy) in items 4 and 7 are noteworthy, and are consistent with the results of the reliability analysis (Table 5).

Construct Validity
In order to assess the possibilities of factorization, prior to the analysis of construct validity we apply Bartlett's test of sphericity and calculate the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. The result of the sphericity test (χ² = 365.103; df = 210, p < .001) supports the factorability of the correlation matrix, while the value reached in the KMO test (.457) casts doubt on the suitability of the matrix to be factored. Nevertheless, the analysis returns a factorial solution composed of six factors very close to the dimensions contemplated in the scale design, which help to explain its coherence and latent structure. These factors explain 73.48% of the total variance, a level considered satisfactory (Table 6).
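For reference, Bartlett's statistic is obtained from the determinant of the item correlation matrix R as χ² = −(n − 1 − (2p + 5)/6) · ln|R|, with p(p − 1)/2 degrees of freedom, where n is the number of respondents and p the number of items. The sketch below is a minimal stdlib implementation under the assumption that R is supplied as a positive definite list of lists; the 2 × 2 matrix is a hypothetical example, not data from the study, and the p-value (which needs a chi-square distribution function) is left out.

```python
import math

def log_det_spd(a):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    logdet = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][i] = math.sqrt(s)
                logdet += 2 * math.log(low[i][i])
            else:
                low[i][j] = s / low[j][j]
    return logdet

def bartlett_df(p):
    """Degrees of freedom of Bartlett's test for p variables."""
    return p * (p - 1) // 2

def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * log_det_spd(corr)
    return chi2, bartlett_df(p)

# Hypothetical 2-item correlation matrix with n = 33 respondents.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
chi2, df = bartlett_sphericity(corr, n_obs=33)
```

The degrees of freedom reported in the text follow from the same formula: 21 items give 210, 15 items give 105, and 19 items (after removing two) give 171.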
The first factor (6 items) measures the representations related to the inclusion of gender inequalities and the integration of women's social experience in history teaching, and the valuation of male leadership in the construction of national historical narratives. It also includes the function of knowledge of history in the processes of social transformation, the assessment of the curricular inclusion of social problems and its usefulness in the development of social thinking skills. The second factor (5 items) incorporates the need to integrate women's social roles in the traditional historical narrative, the visibility of women as social and historical agents, training for the exercise of a plural and participatory democracy, and the validity of available narratives for approaching the historical past.
The third factor obtained (3 items) informs about the scientific nature of the historical discipline, the knowledge of the historical method and the management of sources for the construction of social knowledge. Factor 4 (3 items) includes aspects related to the pseudo-objectivity of the historical discipline, with its unidirectional focus on the social events and/or problems of the past, and with its interpretative character on possibly contradictory evidence.
Factor 5 (2 items) includes the variables related to women's participation in the construction of the historical narrative and to their supposed invisibility in the available sources.
Finally, factor 6 (2 items) includes the last two variables, on the incorporation of the historical method in history teaching and learning, and on gender relations as a valid category for the analysis and understanding of historical societies.
Although the elimination of items 4 and 7 improves the factorization indicators of the scale (KMO = .472; χ² = 312.290; df = 171, p < .001), the resulting factors (74.75% of the total variance explained) do not produce significant changes in its latent structure. It was therefore decided to reformulate these items rather than eliminate them (Table 7).

Analysis of Reliability and Internal Consistency of the WH Scale
The reliability analysis of the WH scale returns a satisfactory level for the set of variables (α = .83; ω = .85). Similarly, the unequal-length Spearman-Brown coefficient shows a high positive correlation between parts (r = .705), confirming the reliability of the scale. The linear item-total correlation coefficients are greater than .30 in 13 of the 15 items (r ≥ .344) and lower than this value in 2 items (r = .016; r = -.068). Although these results indicate the convenience of reviewing the operability of items 4 and 7, it was decided to maintain them in the construction of the scale, since their elimination would not lead to a substantial improvement in the level of reliability of the instrument (α = .86; ω = .91) (Table 8).

Validity Analysis of the WH Scale
After applying Bartlett's test of sphericity and calculating the KMO measure of sampling adequacy, the results obtained support the factorability of the correlation matrix (χ² = 239.321; df = 105, p < .001) and the adequacy of the matrix to be factored (KMO = .667). The EFA returns a factorial solution composed of five factors that approximate the dimensions contemplated in the scale design. These factors explain 75.13% of the total variance (Table 9).

Correlations between Scales and Variables
The EMG and WH scales show an adequate interdependence for their joint application, with a moderate positive correlation between the WH scale and the original EMG scale (ρ = .447, p = .015), and between the WH scale and the revised EMG scale (ρ = .466, p = .011) (Table 10). A moderate correlation is also confirmed between the age of the participants and the number of hours of gender training received (ρ = .569, p = .017), and between the age variable and the WH knowledge scale (ρ = .565, p = .001). In contrast, age is not significantly correlated with the values expressed in the EMG scale (ρ = .352, p = .057; ρ = .325, p = .080).
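The coefficients reported here are Spearman rank correlations, that is, Pearson correlations computed on ranks. A minimal sketch follows, assuming hypothetical values for age and hours of gender training; the names `ages` and `training_hours` are illustrative only and the significance test is omitted.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical participants: age in years and hours of gender training.
ages = [22, 24, 27, 31, 38, 45, 52]
training_hours = [0, 10, 5, 20, 15, 40, 30]
rho = spearman_rho(ages, training_hours)
```

Because only ranks enter the calculation, the coefficient is suited to ordinal scale scores and is insensitive to monotonic transformations of either variable.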

Discussion and Conclusions
The results obtained confirm the viability of the EMG and WH scales for the study of the social representations of future teachers on the inclusion of the gender perspective and the visibility of women in the history classroom. Likewise, their suitability for joint application is evidenced, and the hours of gender training received are identified as conditioning factors in the construction of the social representations of the students. The age variable, on the other hand, seems to influence students' knowledge of women's history.
The inclusion of the gender perspective and the educational treatment of equality between women and men require teacher training capable of promoting conscious and committed positions, so that teachers become active agents in the transformation of social patterns [24]. It is therefore necessary to address critical-reflective skills and, in particular, gender skills in teaching programs [9]. From this perspective, approaching the principles and sources with which the social representations of future teachers are constructed involves analyzing the place that students give to women, their historical experience and their contributions to the construction of historical knowledge. These programs must offer future teachers the skills necessary to make past and present social reality more complex, analyze the different constructions of gender in time and space, deal with situations with equity [25], and address social reality as a system constituted and built by men and women with functions and possibilities for social and personal development that are not necessarily identical, intimately linked to a previous historical trajectory and to the very dynamics of gender relations [1].
As evidenced by the studies of Díez [25] and Ortega-Sánchez, Carcedo de Andrés, and Blanco [26], there are few end-of-degree/master projects in the field of social sciences that include the gender perspective among their purposes. Where a gender discourse is recognized, it continues to appear as a problem linked to women, which evidences unequal positions [25]. This is also a reality in the educational projects of the social sciences [27].
In this line, the results obtained in the research of Bartual, Carbonell, Carreras, Colomé, and Turmo [28], in the area of history and economics teaching, show that the specific treatment of content on gender in higher education favors the positioning and awareness of students about inequality and social justice. It therefore remains necessary to carry out the educational transposition of historical research, and to incorporate the bases of new historiographical currents, criticism and reflection in historical education and in the contexts of teacher training [29].
According to Crocco's study [19], more than half of a sample of 60 new and experienced social science teachers knew fewer than 50% of the names of the women proposed in the questionnaire and of their contributions to world history. The explanation for these results could lie in the recognition of the tensions between the aspiration to identify and achieve a history of women and gender as a specific area of research, and the aspiration that women and gender be treated as themes/categories of analysis that are constitutive of all historical research [30] and of social science teacher training plans.
Recent research results confirm the permanence of hegemonic, non-inclusive curricular approaches and the invisibility of women in the history taught [16,31–34]. This reality drives the urgency of specific and transversal teacher training on gender, capable of overcoming an androcentric approach still recognizable in teachers' teaching practices, in their decision-making on content, in the definition of interpersonal relations and in the way tasks are assigned in the school context. In this sense, the EMG and WH scales for diagnostic assessment offered are intended to serve as scientifically rigorous instruments for analysis and training intervention on knowledge about the place of women in the taught history, the way in which gender inequality is recognized as a social problem by students and for the mobilization of their social representations. This evaluation should therefore be used to reflect on university training programs, and to promote a gender perspective and global and inclusive citizenship in the teaching of history and social sciences at all levels of education [35].
Despite the legislative and normative-curricular advances of recent decades in Spain, education in gender equality and the implementation of teaching programs in and for equality continue to require initial training in history and social sciences committed to social justice [31]. This commitment would also have to be defined within the framework of education for democratic citizenship capable of revealing the provisional nature and constructive mechanisms of social knowledge [5,6,32]. In this sense, investigating the social representations of teachers in training, using instruments such as those studied in this research, is the first step towards reflecting on university curricula, innovating and improving teaching practices, and adopting critical perspectives aimed at social transformation.