Item Analysis of the Science and Technology Components of the 2019 Basic Education Certificate Examination Conducted by the National Examinations Council

One of the activities that take place behind the scenes as part of the steps taken by examination bodies to standardize tests is the determination of the psychometric properties of the items, a process referred to as item analysis. This study analysed the adequacy of the basic science and technology test items used for the 2019 Basic Education Certificate Examination in terms of difficulty levels, discriminating powers, guessing, and distracter indices. The study adopted evaluation and descriptive survey research designs. A sample of 976 students was selected from 59 schools across Niger State using a multistage random sampling procedure. A 60-item Basic Science and Technology test administered by the National Examinations Council for the 2019 BECE was used to collect data for the study. The item difficulty levels, discriminating powers, guessing, and distracter indices of the items were generated based on a four-parameter Item Response Theory model using the Xcalibre software package. The items adequately satisfied unidimensionality and local independence when statistically examined using confirmatory factor analysis and tetrachoric correlation respectively. The findings revealed that all the items satisfied the difficulty, discrimination, distracter and guessing criteria for use in the 2019 BECE. It was recommended, among others, that school proprietors and the government should endeavour to sponsor teachers to attend item generation workshops so as to stay up to date with how to generate test items that meet the criteria for final examinations.


Introduction
The Basic Education Certificate Examination (BECE) occupies a prominent position in the educational life of the Nigerian child. This is not farfetched, as it is the first public certification examination taken by students in Nigeria [1]. This examination, in some circles referred to as the Junior School Certificate Examination (JSCE) [1,2], is taken by students who have completed their first three years of secondary education. According to [3], junior secondary education is the education that a child receives immediately after primary education. The objectives of this level of education, as spelt out in the National Policy on Education (NPE), are to: develop patriotic young people equipped to contribute to social development and the performance of their civic responsibilities; provide children with diverse basic knowledge and skills for entrepreneurship and educational advancement; inspire national consciousness and harmonious co-existence irrespective of differences in endowment, religion, colour, ethnicity, and socio-economic background; and inculcate values and raise morally upright individuals capable of independent thinking, who appreciate the dignity of labour [3].
In pursuit of these objectives, the Junior Secondary Education curriculum was designed to provide children with training through the following subjects: English Studies, one Nigerian Language (one of Hausa, Igbo, Yoruba, Edo, Ibibio, and Efik), Mathematics, Religion and National Values, Basic Science and Technology, Pre-Vocational Studies, French, Cultural and Creative Arts, Business Studies, and Arabic. Among these 10 subjects, Basic Science and Technology (BST) is of interest to this study. This is because BST, as hinted by [4], is the only subject at BECE level that avails students the capacity to: explain events in nature; develop in an all-round manner covering cognitive, affective and psychomotor skills; possess the ability to reason and think logically; evolve global skills with which scientific know-how can be utilised in gathering information from nature to solve the problems of mankind; and stir instinctive inquisitiveness during experiments. This is in addition to the fact that science and technology are tools that springboard nations to economic development, self-sufficiency, and sustainability. Hence, the nation, being mindful of the importance of science and technology and their contribution towards technological advancement, intentionally included BST as one of the major and compulsory subjects in the junior school system. In testing students during the BECE, which is a condition needed for a student to be certified to have completed junior secondary education, the National Examinations Council administers 120 multiple-choice questions (MCQs) covering the four main components of BST, namely basic science, physical and health education, basic technology and computer studies, each comprising 30 items.
The National Examinations Council (NECO) was established in April 1999 by the then Head of State, General Abdulsalami Abubakar. Its core mandate includes: the general control of the conduct of the internal and external Senior Secondary School Certificate Examinations in Nigeria without prejudice to the existing powers and functions of the West African Examinations Council; conducting a Standard National Assessment of Educational Performance at junior and senior secondary school levels; and conducting research leading to national improvement of testing and examination procedures at junior and senior secondary school levels [5]. In effect, NECO is the only national agency saddled with the responsibility of conducting a national examination, accepted worldwide, that is used to certify students after their junior secondary education. For the tests conducted by NECO to meet international standards, a lot of activities take place behind the scenes. One such activity is the determination of the psychometric properties of the tests used, a process referred to as item analysis.
Item analysis is the process in which the validity, reliability, discrimination, difficulty, and distraction indices of individual items of a test are examined as they affect the whole test. Item analysis was defined by [6] as a procedure which provides a test developer with the opportunity to examine both the test items and the answers supplied by the testees, to find out whether they are of adequate quality and whether the content coverage is sufficient for the test for which they were developed. Item analysis is done on the multiple-choice questions of a test after it has been administered [7]. The upside to the practice of item analysis is that it allows for the observation of item characteristics and the improvement of the quality of the test. Reference [8] revealed that item analysis helps the test developer to examine each specific item to appraise its quality in terms of difficulty, discrimination, and distracter indices.
Item difficulty may be viewed as a measure of the fraction of test-takers that correctly answered an item. Similarly, item discrimination can be seen as the ability of an item to distinguish between the performance of brighter testees and those in the group that may be considered not so bright. In the same vein, [8] defined item discrimination as a measure that differentiates between the performance of students in the high score group and those in the low score group. Reference [9] explained that the item difficulty index (DIF I), also commonly referred to as the p-value, is a measure of the fraction of all test-takers that correctly responded to an item. Also, [10] stated that the distracter index is a measure of how well the incorrect options in an item prevent the 'not-so-bright' test-takers from picking the right option (also called the key). Good distracters appeal to a higher proportion of low-achieving test-takers than they do to the bright ones, thereby resulting in negative distracter statistics [11].
It has been noted that, on the part of teachers and test developers, carrying out item analysis has the capacity to conserve valuable time and energy, in the sense that it functions as a means of identifying items that are too easy or too difficult, items that fail to distinguish between test-takers who were exposed to the content and those who were not, or distracters that are not plausible [12]. Sources such as [13], [14] and [6] revealed that there are two widely used approaches to item analysis: Classical Test Theory (CTT) and Item Response Theory (IRT). Classical Test Theory uses two main statistics: the item difficulty index (the proportion of testees that responded correctly to the item) and the discrimination index (commonly the point-biserial correlation between performance on individual items and total test scores). IRT, however, models both item statistics and the testee's ability, with the postulation that a relationship exists between a testee's score on a single item and his eventual overall performance in the test [6].
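To make the two CTT statistics described above concrete, the following is a minimal sketch in Python: the p-value as the proportion of correct responses, and an upper-lower group discrimination index contrasting the top and bottom 27% of testees by total score. All data and the 27% cut are illustrative conventions, not figures from the study.

```python
# Sketch of the two classical test theory (CTT) item statistics
# described above, computed from dichotomously scored responses.
# The response data below are illustrative, not from the study.

def difficulty_index(item_scores):
    """p-value: proportion of testees answering the item correctly."""
    return sum(item_scores) / len(item_scores)

def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination: p(upper group) - p(lower group),
    using the top and bottom `fraction` of testees by total score."""
    n = max(1, round(len(total_scores) * fraction))
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    upper = [item_scores[i] for i in ranked[:n]]   # high scorers
    lower = [item_scores[i] for i in ranked[-n:]]  # low scorers
    return sum(upper) / n - sum(lower) / n

# Ten testees: 1 = correct, 0 = wrong on one item, and total scores.
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
totals = [55, 52, 50, 48, 47, 40, 38, 35, 30, 25]

print(difficulty_index(item))              # 0.5: moderate difficulty
print(round(discrimination_index(item, totals), 2))
```

A positive discrimination index here indicates that high scorers answered the item correctly more often than low scorers, which is the behaviour expected of a good item.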
At NECO, IRT is used to analyse the items of the various tests she conducts. To do this, a pilot study of the items intended for use in future examinations is done through a process known as trial testing. This enables the examination body to take a critical look at the individual items of the test with the view of making sure that only those that meet expected standards are used. However, there is a gap created by the lack of information, from an independent observation standpoint, concerning the quality of the items used for testing candidates in Nigeria. The quality of the items used by NECO over the years has only been attested to by the reports of NECO herself, and she is not under any legal obligation to make an outright declaration about the quality of the items she uses in conducting her examinations. It is therefore hardly surprising that researchers who hold the opinion that the quality of items is the cause of the unstable nature of the performance of students in key subjects like BST in BECE assert dominance in the narrative of students' failure to meet expected success levels in public examinations in the country. This is not farfetched, as researchers like [15] have asserted that there is the probability of the existence of items that could be said to possess elements of technical inadequacy in the tests developed and administered by examination bodies like NECO. Similarly, [10] are of the view that sometimes students' failure in key subjects at BECE is because of faults inherent in the psychometric properties of the test, not just due to their own inabilities. Against this backdrop, it is therefore not out of place for an independent item analysis to be done with a view to ascertaining whether or not the items used in public examinations by NECO are of the expected quality. Hence the need for this study.
To this end, this study was undertaken to do an item analysis of the BST test administered by NECO in the 2019 BECE. Specifically, the study explored: the adequacy of the difficulty and discrimination indices, the level of the guessing factors, and the effectiveness of the distracters of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination. The findings of the study provide information, from an independent viewpoint, on the quality of items used by NECO in conducting her examinations.

Theoretical Framework
The theoretical framework of the study is hinged on the 4-parameter logistic model of item response theory (IRT), as developed by Barton and Lord in 1981 [16]. According to [17], the 4-parameter logistic model (4PLM) provides a more detailed technique for estimating testees' abilities than the 1-, 2-, and 3-parameter logistic models. The 4PLM estimates four parameters: discriminating power (a-parameter), difficulty level (b-parameter), guessing factor (c-parameter), and distraction factor (d-parameter). The model generally runs under four basic assumptions (monotonicity, unidimensionality, local independence, and invariance) [18], of which unidimensionality and local independence are the most important. Unidimensionality indicates that the items in a test are developed to measure only one area of knowledge. Local independence, on the other hand, means that, for testees of a given ability, a testee's response to one item does not affect, for better or for worse, his score on any other item.
In the 4-parameter logistic model, the probability of a correct response to a dichotomous item is given by:

P(θ) = c + (d - c) / (1 + e^(-Da(θ - b)))

Where: θ is the test-taker's ability; a is the item discrimination index; b is the difficulty parameter; c is the guessing index; d is the distracter index (the upper asymptote); e is the base of the natural logarithm, approximately equal to 2.718; and D is an arbitrary scaling constant (normally D = 1.7) [17].
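The 4PL item response function above can be sketched directly in Python. The item parameter values used here are illustrative, not drawn from the study's data.

```python
import math

# Minimal sketch of the four-parameter logistic (4PL) item response
# function: P(theta) = c + (d - c) / (1 + e^(-D * a * (theta - b))).
# Parameter values below are illustrative, not from the study.

def p_correct(theta, a, b, c, d, D=1.7):
    """Probability of a correct response under the 4PL model."""
    return c + (d - c) / (1 + math.exp(-D * a * (theta - b)))

# An item of average difficulty (b = 0), good discrimination (a = 1.0),
# a guessing floor of c = 0.2 and an upper asymptote of d = 0.95.
print(round(p_correct(-3, 1.0, 0, 0.2, 0.95), 3))  # low-ability testee
print(round(p_correct(0, 1.0, 0, 0.2, 0.95), 3))   # 0.575 at theta = b
print(round(p_correct(3, 1.0, 0, 0.2, 0.95), 3))   # high-ability testee
```

Note how the c-parameter keeps the probability above 0.2 even for very low-ability testees (the effect of guessing), while the d-parameter caps it below 1.0 for very able ones.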
IRT predicts test-takers' chances of providing the right response based on their abilities and the characteristics of the items. Reference [15] revealed that ability level is estimated on a scale that can easily be transformed. The scale ranges from negative infinity through a mid-point of zero to positive infinity. Nevertheless, in real-life situations, ability values are often limited to the range -3 to +3. It follows that items with difficulty levels approaching +3 and -3 are labelled "very difficult" and "very easy" respectively. To carry out item analysis using IRT, the items are dichotomously scored; that is, '1' is allocated for correct responses and '0' for wrong responses. One of the merits IRT has over CTT is that it provides a framework for evaluating how well individual items in a test or examination function. Hence, as noted by [19], some of the advantages of using IRT in item analysis are that it enables test designers to generate test items, maintain a well-stocked item bank, and hold item difficulty equal across alternate forms of tests. This allows results obtained over time to undergo comparative analysis.

Objectives of the Study
The study was guided by the following objectives: 1. To determine the adequacy of the difficulty indices of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination; 2. To determine the adequacy of the discrimination indices of the items; 3. To determine the level of the guessing factors of the items; and 4. To determine the effectiveness of the distracters of the items.

Methodology
The study adopted evaluation and descriptive survey research designs. According to [10], evaluation research design has to do with an organised process of passing value judgments based on set goals, to ascertain whether the objectives should be maintained, modified, or improved upon. Descriptive research design is the type of research in which data are gathered from a population to find out how and why they are affected by certain conditions, situations, or phenomena. The population of the study comprised all junior secondary school students in Niger State who were preparing to sit for the BECE, whether that conducted by the Niger State Ministry of Education or that conducted by NECO.
A sample of 976 students was selected from 59 schools across the state for the study using a multistage random sampling procedure. The instrument used for the study was a 60-item multiple-choice test consisting of the science and technology components of the Basic Science and Technology test administered by NECO in the 2019 BECE. The breakdown of the 60 items of the instrument, showing the nine specific subject areas covered, is presented in table 1. Each item of the instrument had five options lettered A to E. The testees were expected to answer each question within one minute; hence the time allotted was one hour. The instrument was not subjected to moderation and validation. This was because NECO, being an internationally recognised and organised examination agency, develops, moderates, validates, and standardizes her items before using them for the examinations she conducts. Hence, the items of the instrument were assumed to be of adequate quality. Copies of the instrument were administered to the sampled students under conditions similar to those given by NECO. The data collected were analysed using the Xcalibre statistical analysis software package. The item difficulty levels, discriminating powers, guessing, and distracter indices of the items were generated based on a four-parameter IRT model from the software package. Two important conditions, unidimensionality and local independence, were statistically examined. Confirmatory factor analysis was used to test for unidimensionality by examining all the items used in the study to determine whether or not a single dominant factor underlies them. Similarly, tetrachoric correlation was used to check the local independence of the items for testees of the same ability level. All the items were found to have met the conditions for unidimensionality and local independence.
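The unidimensionality check described above can be illustrated with a common heuristic: if the first eigenvalue of the inter-item correlation matrix strongly dominates the second, a single latent factor is plausible. The sketch below is a simplified stand-in for the study's procedure: it uses simulated responses and ordinary Pearson (phi) correlations on 0/1 scores rather than the tetrachoric correlations and Xcalibre output used in the study.

```python
import numpy as np

# Simplified sketch of a unidimensionality check: simulate responses
# driven by a single latent trait, then test whether the first
# eigenvalue of the inter-item correlation matrix dominates.
# Phi (Pearson on 0/1 scores) correlations stand in for the
# tetrachoric correlations used in the study; data are simulated.

rng = np.random.default_rng(0)
ability = rng.normal(size=500)                  # latent trait, 500 testees
difficulty = rng.uniform(-1.5, 1.5, size=12)    # 12 items

# Every item depends on the same single trait -> unidimensional data.
prob = 1 / (1 + np.exp(-(ability[:, None] - difficulty[None, :])))
scores = (rng.uniform(size=prob.shape) < prob).astype(int)

corr = np.corrcoef(scores, rowvar=False)        # 12 x 12 inter-item matrix
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
ratio = eigvals[0] / eigvals[1]                 # dominance of first factor
print(round(ratio, 2))
```

A ratio well above 1 signals one dominant factor; data generated from two unrelated traits would instead show two comparable leading eigenvalues.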

Results
After the data were collected and analysed, the results obtained from the four research questions were summarised and presented in tables 2-5 respectively.

Research Question One
How adequate are the difficulty indices of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination?

Research Question Two
How adequate are the discrimination indices of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination? The summary of the discrimination indices of the items is shown in table 3. The results show that four items (6.67%) had slightly low discrimination, with discrimination indices falling below 0.40. In the same vein, 20 items (33.33%) discriminated adequately, with discrimination indices falling between 0.40 and 0.49. Also, 28 items (46.66%) had high discriminating powers, with indices that ranged between 0.50 and 0.59. Similarly, 8 items (13.33%) had discrimination indices between 0.60 and 0.69, meaning that their discrimination was very high. Essentially, all 60 items effectively discriminated between bright candidates and the not-so-bright ones.

Research Question Three
How high are the guessing factors of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination? The summary of the guessing factors of the basic science and technology test items used for the 2019 Basic Education Certificate Examination is shown in table 4. The results show that 1.66% of the items had a guessing factor less than 0.10, making the probability of guessing their correct responses extremely low. In the same vein, the results showed that the probability of guessing the correct responses to the 22 items (36.67%) with guessing factors between 0.10 and 0.15 was very low. Similarly, the correct responses to 37 items (61.67%) could not easily be guessed, because their guessing factors were between 0.16 and 0.20. In a nutshell, since none of the 60 items had a guessing factor greater than 0.26, they are all good items.

Research Question Four
How effective are the distracters of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination? Table 5 shows the summary of the distracter indices of the basic science and technology test items used for the 2019 Basic Education Certificate Examination. The result revealed that only one item had a distracter index greater than 0.05, which means that its alternatives did not sufficiently appeal to the low achievers among the testees. However, the remaining 59 items (98.33%) had distracter indices of 0.05 or less. This means that the alternatives or distracters of these 59 items appealed more to low achievers than to high achievers. In essence, the distracters of the items were effective.
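The screening thresholds used in the results above can be summarised as a small Python sketch: discrimination indices graded into the bands described under research question two, and pass/fail screens of a guessing factor no greater than 0.26 and a distracter index no greater than 0.05. The example item values are illustrative, not taken from tables 2-5.

```python
# Sketch of the screening criteria stated in the results above.
# Thresholds come from the text; example values are illustrative.

def discrimination_band(a):
    """Bands used when discussing the discrimination indices."""
    if a < 0.40:
        return "slightly low"
    if a < 0.50:
        return "adequate"
    if a < 0.60:
        return "high"
    return "very high"

def screens_ok(guessing, distracter):
    """True when an item passes the guessing (<= 0.26) and
    distracter (<= 0.05) screens reported in the results."""
    return guessing <= 0.26 and distracter <= 0.05

print(discrimination_band(0.55))   # -> high
print(screens_ok(0.18, 0.03))      # -> True
print(screens_ok(0.30, 0.03))      # -> False (guessing too high)
```

Applied to the reported statistics, every one of the 60 items would pass both screens, matching the study's conclusion that all items were retained.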

Summary of Findings
The results of the analysis produced the following findings: 1. The difficulty indices of the items were adequate; 2. The discrimination indices of the items were adequate; 3. The guessing factors of the items were low, with none exceeding 0.26; and 4. The distracters of the items were effective.

Discussion of Findings
The information in table 2 provided the answer to research question one. The findings revealed that the difficulty indices of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination are adequate. This implies that, although NECO does not disclose the psychometric properties of the items she uses for the tests she conducts, this does not in any way suggest that the items used are of low quality. These findings are consistent with those earlier made by [15], who revealed that the items used by NECO are of adequate quality for the tests she conducts, since their psychometric properties (difficulty, discrimination and distracter indices as well as guessing factors) were adequate. Also lending a voice to the adequacy of NECO's items, [20] and [21] reported that the difficulty parameters of test items constructed by public examining bodies in Nigeria, NECO inclusive, were adequate. This means that, even though there is room for improvement, the items constructed by NECO are basically of the right standard for the candidates that sat for the examination.
On a similar note, the information in table 3 provided the answer to research question two. The findings revealed that the discrimination indices of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination are adequate. This implies that NECO, as an examination body, spares no resources in ensuring that standards are met and maintained in her test items. Similar findings were made by [13] and [21], who reported that the discrimination parameters of test items constructed by public examining bodies in Nigeria, NECO inclusive, were adequate. Hence, accepting the adequacy of the quality of the items, it can be concluded that the quality of the items NECO uses for the BECE helps in ensuring that quality and credible grades are awarded to test-takers.
The answer to research question three was provided by the information in table 4. The finding revealed that the guessing factors of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination are adequate. Once again, this is indicative of the fact that the items used by NECO adequately distinguish high achievers from low achievers. The findings of [22], in a study designed to assess the items prone to guessing in SSCE economics multiple-choice tests among students in Kwara State, Nigeria, lend a voice to this, reporting that the items used by NECO are of adequate quality for the tests she conducts. Reference [15] also supports this finding, having reported that the items used by NECO are of adequate quality since their psychometric properties (difficulty, discrimination and distracter indices as well as guessing factors) were adequate. Guessing has a way of negatively impacting the credibility of test scores, because low-achieving test-takers could sometimes score a set of items correctly by merely guessing their responses [23]. A good test is normally designed to reduce guessing to the barest minimum, just as NECO appears to have done in the test used for this study.
The data in table 5 provided the answer to research question four. The finding revealed that the distracters of the Basic Science and Technology test items used for the 2019 Basic Education Certificate Examination were effective. This result is not farfetched, as researchers like [15], [20] and [21], who previously studied the psychometric properties of tests constructed by examination bodies in Nigeria, had reported that most of the items used by NECO, for instance, had distracters that functioned effectively. This has some implications for education, seeing that the educational system is strengthened by the quality of testing. Much still needs to be done in the area of improving candidates' performance in key subjects like science and technology. However, the abysmal performance sometimes recorded among candidates in public examinations does not stem from the quality of the tests, as insinuated by [10]. Perhaps other factors, like students' attitudes, low morale on the part of teachers, teacher efficacy [24], and the poor state of school infrastructure, may have to be looked into.

Conclusions
Based on the findings of the study, it was concluded that the basic science and technology items constructed by NECO for the Basic Education Certificate Examination have adequate difficulty, discrimination, distracter and guessing indices. Their standards are adequate for use in a national examination like the Basic Education Certificate Examination. The unstable nature of the performance of candidates in the examination could be due to other factors, such as students' attitudes, low morale on the part of teachers, and the poor state of school infrastructure.

Recommendations
The following recommendations emerged: 1. School teachers at all levels of the secondary school should incorporate past questions from national examination bodies like NECO into internal evaluation processes, to familiarize their students with the standards they will face in their final examinations. 2. School proprietors and the government should endeavour to expose teachers to marking