Assessment of Students' Data Literacy Skills in Southern Nigerian Universities

The data literacy skills of students in Southern Nigerian institutions of higher learning were evaluated in this study using a descriptive survey method. A total of five (5) universities were selected for the study. These comprised 150,055 students, of whom 2,550 constituted the sample for the study. A multi-stage sampling procedure was employed. The instrument used for data collection was the Data Literacy Questionnaire (DLQ). The reliability of the DLQ was established using the Cronbach alpha method, which yielded a reliability coefficient of 0.901. The data collected were statistically analyzed using mean and standard deviation for the research questions, while the t-test and Analysis of Variance were used to test the hypotheses. Findings revealed that the students possessed a moderate level of data literacy skills. The students had their lowest rating in data analysis skills. The findings also revealed that Ph.D. students had the highest mean data literacy rating, followed by M.Sc. students, while B.Sc. students had the lowest; the higher degree (Ph.D. and M.Sc.) students differed significantly from B.Sc. students. The study recommends that programs and workshops should be designed to help improve data skills for teachers and students in order to meet global standards.


Introduction
The world is becoming an increasingly data-dependent place. This gives the description of the earth as a global village a new meaning: the communication, transport, flow and relevance of information are at the fulcrum of development and innovation in all nations. An efficient use of information can therefore transform any economy or nation, while nations without this flow of information and data will witness setbacks and impediments in their growth and development. Consequently, the collection, sharing, analysis and use of data will foster growth (Pentland, 2013).
Innovations in science and technology have led to new developments in data literacy. Innovations such as big data and open data have brought different ways of utilizing data. The information obtained from data can be used in making judgments, understanding situations and making more informed decisions. In this knowledge-based economy where information is power, data has taken a central place as the currency of the age. Insights obtained from data can help industries, groups and even nations to take decisions that can transform their profits and their progress (Chinien & Boutin, 2011; Cowan, Alencar, & McGarry, 2014; Mitrovic, 2015; Ikemoto & Marsh, 2008; Mandinach, Parton, Gummer, & Anderson, 2015).
This global evolution translates into an increased demand for data literacy as a skill among other literacies. This observation supports the need for a more data-literate society, one in which everyone at every level can understand and differentiate a good data presentation, visualization or interpretation from a bad one (Twidale, Blake, & Grant, 2013; Swan, Vahey, Kratcoski, van 't Hooft, Rafanan, & Stanford, 2009). Statistical literacy, digital literacy, meta-literacy and data literacy are all areas where useful insights can bring about the much-needed transformation. A large proportion of people who are competent in these various literacies did not acquire their knowledge in schools. This raises some concerns, as some of society's developmental needs have not been met in the schools. According to the Association of College and Research Libraries (ACRL) (2013), society has reached a critical intersection between societal/economic demand and academic demand. The world is now experiencing an influx of these skills, so much so that they are available to whoever desires to gain mastery of them (Maycotte, 2014; Mitrovic, 2015). This has informed the decisions of schools to take a more active role in preparing their graduates for a data-based economy (Koltay, 2014; Gunter, 2007). Employers continue to demand, among other things, that graduates be competent in data handling and other data operations. This has prompted more investment in training students to be data literate in order to position themselves as assets of the 21st century (Harris, 2012; Hu, 2012; Koltay, 2014).

Data Literacy
Data literacy is the ability to engage in critical thinking to make useful deductions from data, make sense of abstractions and put the results of analysis to use (Gunter, 2007; Qin & D'Ignazio, 2010). Since data on its own does not have much meaning until the dots are connected to reveal a relationship between concepts, the ability to understand abstractions becomes necessary in the quest towards being data literate. The ability to connect the dots is like computational thinking: making sense of different independent concepts and deducing the particular relationship that exists between them. This is a result of deep thinking and critical evaluation, which enhance the individual's abilities and bring about better insights from data (Wing, 2008; Davenport & Patil, 2012).
Understanding data allows insights drawn from data to be used in making predictions. Thus, data literacy programs should, among other things, develop the individual's ability to understand trends and to solve problems and challenges that may come up in time (Johnson, Adams Becker, Estrada, & Freeman, 2015). Programs like this, when designed and fully operated, will cement data literacy in society and also support the development of other skills called meta-skills (Liquete, 2012). Meta-skills are important for better participation in the ever-changing face of the 21st century. These skills, however, are not foundational; they build on existing skills and the framework of data literacy. Thus, a more functional data literacy program will result in a more functional society (Koltay, 2015; Liquete, 2012).
In their book, Carlson & Johnston (2015), after their assessment of students' results, stated that "the high level of interest in basic topics such as data formats and an introduction to databases indicate the relative lack of preparation in the core technology skills necessary to work in e-research environment." This trend appears to hold in other societies as well: students lack core technology skills and so are strangers to a data-driven world. Carlson & Johnston (2015) provided the core competencies for data information literacy, which were developed from a literature review. These competencies were organized under twelve (12) major themes: introduction to databases and data formats, discovery and acquisition of data, data management and organization, data conversion and interoperability, quality assurance, metadata, data curation and reuse, cultures of practice, data preservation, data analysis, data visualization, and ethics, including citation of data. The major themes have sub-themes and specific lessons to be learnt in order to gain mastery and become competent in data information literacy.
The Data Literacy Project of QlikTech International (QlikTech, 2020) is a global community dedicated to creating a data-literate world. On its website, it offers a questionnaire that helps individuals determine their data literacy level on the scale provided; completing the exercise produces a learning pathway with a target skillset, desired outcomes and learning resources. It also offers certification in data literacy (ability to read data, interpret data, and work with data sets, visualizations and analysis) and in data analytics (data fundamentals, analytical testing, basic and intermediate statistics, hypothesis testing, data visualizations, and decision making with statistics and analytics). These and other courses are available to learners through interactive video sessions for a better experience. Adopting programs such as these into a university curriculum would better prepare students for this data-driven century.

Nigeria, Data Literacy and the 21st Century
Nigeria is a developing West African nation popularly described as the giant of Africa because of its very large population. Nigeria has one of the largest youth populations and, as such, a unique opportunity to develop a functional society. However, this has not been the case, as the country continues to experience issues that slow its growth and development. It is of great importance to remain at the cutting edge of information and technological development; in the 21st century, some skills are in demand and, with the right application, will offer a better future.
21st century citizens must therefore be competent in handling problems through abilities such as critical thinking, understanding data and making data-driven decisions (Chinien & Boutin, 2011; Wanner, 2015). Data literacy is crucial among the competencies a 21st century citizen should have. Possessing these skills will allow not just the individual but the nation to experience improvement and steady growth and development.
Data literacy education has been said to improve students' study habits and learning skills. Because these skills are foundational, competency in them supports the development of other literacies (information, statistical, digital, media, computational and visual literacies), collectively called meta-literacy or trans-literacy. Studies have also indicated that students who possess trans-literacy are better equipped to handle higher-order thinking and to suggest more effective solutions. This will affect the research output of such societies and allow their students to better validate and produce research with high impact and high applicability (Frau-Meigs, 2012; Hattwig, Bussert, Medaille, & Burgess, 2013; Mackey & Jacobson, 2011; Vahey et al., 2012; Zalles, 2005). Thus, to achieve this result, these skills have to be introduced into the school system and society at a time when they can generate enough interest in people to help them succeed.

Appropriate Timing for Data Literacy Education
Chinien and Boutin (2011) suggested in their study that these skills be introduced at post-secondary institutions. However, they also stated that the private sector must be involved in order to ensure that the skills taught do not differ from those in demand at the global level. This cooperation will bring about a re-adjustment of the curriculum to allow for a flow in the process. This flow will encourage society to become more effective and efficient: decisions will be made with insights drawn from data, and this will position the society at the cutting edge of innovation (Giles, 2013; Harris, 2012; Koltay, 2014; McKendrick, 2015).
The world is changing and, in order to remain relevant, one has to evolve with it. The global evolution is gathering pace and innovations are coming at an unprecedented rate. Nigeria, a developing nation that is well positioned with an active population, is nevertheless unable to catch up in this race. Research output and data literacy are very low, and this has hindered the nation from progressing as it should. Literature shows that countries with higher data literacy tend to develop at a faster rate. Also, plans to improve data literacy in the population are either shelved or poorly implemented. This has crippled the growth rate of the nation. It is on the strength of this problem that this paper assesses the data literacy skills of students in order to suggest a way to fill the gap between the skills that are possessed and the skills that should be possessed. The purpose of this study, broadly, is to assess the data skills possessed by Southern Nigerian students in universities. Specifically, the purposes of this study are to determine:
1. what data skills are possessed by Southern Nigerian students
2. data literacy skills by gender
3. how the data literacy skills of graduate students vary by educational levels
4. data literacy skills by universities

Research Questions
1. What data literacy skills are possessed by graduate students?
2. What is the difference between male and female graduate students' level of data literacy skills?
3. How do the data literacy skills of graduate students vary by educational levels?
4. In which data literacy skills do graduate students show the highest level of acquisition?

Hypotheses
The following hypotheses were tested at 0.05 level of significance.
Ho1: There is no statistically significant difference between the male and female graduate students' level of data literacy skills.
Ho2: There is no statistically significant difference among graduate students' data literacy skills based on educational level.
Ho3: There is no significant difference among graduate students' data literacy skills based on university.

Research Design and Participants
This study targeted Southern Nigerian university students. The study employed a survey research design with a total of five (5) universities selected for the study. These comprised 150,055 students, who formed the population for the study, of whom 2,550 constituted the sample. A multi-stage sampling procedure was employed. A purposive sampling technique was used to identify the variables of interest for the study, namely the universities, academic levels (Postgraduate (PG) / Undergraduate (UG)) and gender (male and female). A disproportionate stratified random sampling technique was used to select the five (5) universities (2 State and 3 Federal universities), and proportionate stratified random sampling was used for each academic level and gender from the population to constitute the sample for the study. The information is presented in Table 1. The demographic information in Table 1 shows that, in general, there are 1466 male and 1084 female participants. There were also 1512 B.Sc. holders, 149 Postgraduate Diploma in Education (PGDE) holders, 737 Master's degree holders and 152 PhD holders. Furthermore, a total of 1608 are graduates from the University of Nigeria Nsukka (UNN), 553 are graduates from Rivers State University (RSU), 108 from Ignatius Ajuru University, 96 from University of Port-Harcourt (UniPort) and 185 are graduates from Nnamdi Azikiwe University Awka (UniZik).
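The proportionate stratified step described above can be illustrated with a minimal sketch. The study does not report the population size of each stratum, so the stratum names and sizes below are hypothetical placeholders chosen only to sum to the reported population of 150,055; the function name and the largest-remainder rounding rule are likewise assumptions for illustration, not the authors' procedure.

```python
from math import floor


def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size.

    Largest-remainder rounding is used so the allocations sum exactly
    to sample_size (an assumed rounding rule, not taken from the study).
    """
    total = sum(stratum_sizes.values())
    if total <= 0 or sample_size < 0:
        raise ValueError("need a positive population and a non-negative sample size")
    # Exact (fractional) share of the sample for every stratum.
    exact = {name: sample_size * size / total for name, size in stratum_sizes.items()}
    # Start from the integer parts, then hand out the leftover units
    # to the strata with the largest fractional remainders.
    allocation = {name: floor(share) for name, share in exact.items()}
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical stratum sizes (NOT reported in the study); they only sum to 150,055.
hypothetical_population = {"stratum_a": 90000, "stratum_b": 60055}
allocation = proportional_allocation(hypothetical_population, 2550)
print(allocation)
```

With real stratum counts for each academic level and gender, the same function would return the number of respondents to draw at random from each stratum.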

Instrument Used for Data Collection
The instrument used for the collection of data was the Data Literacy Questionnaire (DLQ), a self-designed instrument. The instrument seeks information on the demographic data of the respondents and on the respondents' opinions on the item statements. The instrument contains twenty-two (22) items. The respondents were advised to tick (√) as appropriate using the four-point Likert rating scale of Strongly Agree (SA), Agree (A), Disagree (D), and Strongly Disagree (SD), rated as 4 points, 3 points, 2 points, and 1 point respectively. Copies of the instrument were validated by experts in the field of Education. These specialists made some recommendations, which were used to modify the instrument before the final production. In order to establish the reliability of the instrument, the Cronbach alpha method of estimating reliability was used. The instrument was administered to 600 students from Northern Nigeria who were not part of the sample. After trial testing, the instrument was found to have a reliability coefficient of 0.901.
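For readers unfamiliar with the reliability estimate named above, the following is a minimal sketch of the standard Cronbach alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the small response matrix are hypothetical and are not the study's trial-test data; sample variances (n − 1 denominator) are assumed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: list of rows, one per respondent, each a list of k item scores
    (for the DLQ these would be Likert codes 1-4 on k = 22 items).
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items on the 4-point scale.
hypothetical_responses = [
    [4, 4, 3, 4],
    [3, 3, 3, 3],
    [2, 2, 1, 2],
    [4, 3, 4, 4],
    [1, 2, 1, 1],
]
alpha = cronbach_alpha(hypothetical_responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together, which is the sense in which the reported coefficient of 0.901 indicates high internal consistency.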

Method of Data Analysis
Means and standard deviations were used to answer the research questions. Class limits were used for decision making. Thus: 1.00-2.33 = Low Data Literacy Skill (LDLS), 2.34-3.33 = Moderate Data Literacy Skill (MDLS), 3.34-4.00 = High Data Literacy Skill (HDLS). The hypotheses were tested using the independent t-test and Analysis of Variance (ANOVA).

Results
Table 2 shows that apart from item 1, 'being able to clearly decide the search terms before engaging in information gathering', where the respondents recorded a mean of 3.44 (i.e. a high level of literacy skill), all other items (items 2 to 22) fall within the mean range of 2.34-3.33, which is Moderate Data Literacy Skill (MDLS). The result also shows that the graduate students have the highest skills in data visualization and interpretation (Mean=3.15, SD=0.53), followed by hypotheses and problem statements (Mean=3.12, SD=0.68) and data collection (Mean=3.11, SD=0.49). Data analysis skills have the lowest mean rating of 2.77 (SD=0.59). The overall cluster mean of 3.02 shows that, overall, the graduate students have a moderate level of data literacy.
Table 3 shows that male graduates have a higher mean rating of 3.10 (SD=0.42) compared to their female counterparts' mean of 2.92 (SD=0.46).
Table 4 presents an independent t-test result on the influence of gender on graduate students' level of data literacy. The result shows that there was a statistically significant difference between the male and female graduate students in their level of data literacy, t(2548) = 10.392, p≤0.05.
Table 5 shows that PhD students had a mean rating of 3.17 (SD=0.52), Master's degree students had a mean rating of 3.12 (SD=0.38), PGDE students had a mean rating of 2.99 (SD=0.39) and B.Sc. students had a mean rating of 2.96 (SD=0.45). This result shows that PhD students had the highest mean rating, followed by Master's degree students and then PGDE holders. The B.Sc. students had the lowest mean rating.
Table 6 shows that there was a statistically significant difference between educational levels, F(3, 2546) = 30.829, p≤0.05. To find which groups differ from one another, Scheffe's post hoc test was conducted and the result is presented in Table 7.

The Scheffe post hoc analysis in Table 7 indicates that there was no difference between the B.Sc. students and PGDE students (p=0.862); however, differences existed between B.Sc. and Master's students (p=0.000), and between B.Sc. and PhD students (p=0.000). Also, a significant difference existed between PGDE students and Master's degree students (p=0.008), and between PGDE and PhD students (p=0.05). Comparison between Master's and PhD students shows no significant difference (p=0.736). Therefore, it could be concluded that there was no statistically significant difference between the B.Sc. students and PGDE students, or between the Master's students and the PhD students, with respect to data literacy skills. However, differences were found between each of the lower degree groups (B.Sc. and PGDE) and each of the higher degree groups (Master's and PhD).
Table 8 shows that the University of Nigeria had a mean rating of 3.02, Rivers State University had a mean rating of 3.06, Ignatius Ajuru University had a mean rating of 2.91, University of Port Harcourt had a mean rating of 3.02 and Nnamdi Azikiwe University had a mean rating of 2.99. This result shows that Rivers State University had a slightly higher mean rating, followed by the University of Nigeria Nsukka and University of Port Harcourt. Ignatius Ajuru University had the lowest mean rating.
Table 9 shows that there was a statistically significant difference between the universities with respect to their level of data literacy, F(4, 2545) = 2.996, p≤0.05. Table 10 presents Scheffe's post hoc analysis showing differences between the universities with respect to graduate students' data literacy skills. Table 10 shows that the difference is only between Rivers State University and Ignatius Ajuru University.
However, there are no differences among the graduates of the other universities with respect to their data literacy skills. Hence, it could be concluded that the only significant pairwise difference is between Rivers State University and Ignatius Ajuru University.
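The class limits used for decision making and the gender comparison reported from Table 4 can be illustrated with a short sketch. It assumes a pooled-variance (equal variances) independent t-test, which the study does not state explicitly, and it works only from the rounded means, standard deviations and group sizes reported in the text, so the resulting statistic should land near the reported value rather than reproduce it exactly. The function names are hypothetical.

```python
from math import sqrt


def classify_mean(mean_rating):
    """Map a mean rating on the 1-4 scale to the study's class limits.

    Ratings are rounded to two decimals first, an assumption made so that
    values falling between the printed limits (e.g. 2.335) are still classified.
    """
    rating = round(mean_rating, 2)
    if not 1.0 <= rating <= 4.0:
        raise ValueError("mean rating must lie between 1.00 and 4.00")
    if rating <= 2.33:
        return "LDLS"  # Low Data Literacy Skill
    if rating <= 3.33:
        return "MDLS"  # Moderate Data Literacy Skill
    return "HDLS"      # High Data Literacy Skill


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Independent t statistic and degrees of freedom from summary statistics.

    Assumes equal population variances (pooled-variance t-test).
    """
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Means reported in the text: item 1, overall cluster mean, data analysis cluster.
labels = [classify_mean(m) for m in (3.44, 3.02, 2.77)]

# Gender comparison from the reported summary statistics (male vs female).
t_stat, df = pooled_t_from_summary(3.10, 0.42, 1466, 2.92, 0.46, 1084)
print(labels, round(t_stat, 2), df)
```

The degrees of freedom follow directly from the two group sizes (1466 + 1084 − 2), which is why a single value is reported for the t-test, in contrast to the two values reported for each ANOVA F statistic.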

Discussion
Research question one investigated what data literacy skills were possessed by the graduate students. The result in Table 2 shows that graduate students in Southern Nigerian universities had a high level of data literacy with respect to being able to clearly decide the search terms before engaging in information gathering. However, they had a moderate level of data literacy on all the remaining 21 item statements, which included hypotheses and problem statement skills, data analysis skills, and data visualization and interpretation skills. The item with the lowest mean rating is "I know the best method of data analysis for analyzing data from different sources." Considering the four clusters, it was found that the graduate students possessed higher data visualization and interpretation skills as compared to data collection, statement of hypotheses, and data analysis. This is similar to the results of Carlson et al. (2013). It is also similar to the results of Gebre (2018), who found that students' understanding of data was limited to contexts of experiment and survey, utility and usage information, and numerical charts and graphs. In general, it was found that the graduate students had a moderate level of data literacy. This result implies that although the graduate students can search for information, their skills in data analysis, and data visualization and interpretation, were not very encouraging. A plausible reason for this could be poor teaching approaches by their lecturers, or that data literacy is not adequately captured in the curriculum. It could also mean that some of the lecturers themselves are not knowledgeable or skilled enough to teach such research-related topics. Studies by Koltay (2014) and Boyles (2012) have established the importance of data literacy. Acknowledging this challenge, Johnston and Jeffryes (2014) suggested individualization of learning, to include courses on data literacy skills that are non-credit based.
This problem of data literacy skills could also be associated with the fact that the technology and applications for data continue to evolve at a fast rate.
Research question two determined the influence of gender on graduate students' level of data literacy. It was found that male graduates had a higher mean rating compared to their female counterparts (Table 3). An independent t-test in Table 4 shows that the difference was significant. The higher mean rating of male students could be due to the fact that there are more male higher degree students (Master's degree=500 and PhD=73), giving a total of 573, while there were 273 female Master's students and 79 female PhD students, giving a total of 352. Another plausible reason could be related to culture. In Africa, women are mostly the ones taking care of most domestic chores (Dillip, Mboma, Greer, & Lorenz, 2018). This leaves them little time to practice in order to acquire the relevant data literacy skills.
Research question three examined the influence of educational level on graduate students' level of data literacy. The result in Table 5 shows that graduate students taking higher degree (PhD and Master's) courses had higher mean ratings as compared to those taking lower degree courses such as Bachelor's degree and PGDE. The PhD students had the highest mean rating while B.Sc. students had the lowest mean rating. An ANOVA result in Table 6 shows a statistically significant difference among the graduate students based on educational level. Scheffe's post hoc test in Table 7 also confirmed that there was no difference between the B.Sc. students and PGDE students, and no difference between the Master's and PhD students. However, there was a difference between the lower degree students (B.Sc. and PGDE) and the higher degree students (Master's and PhD). This result is not surprising because higher degrees such as Master's and PhD are research-based degrees. So, naturally, one would expect PhD and Master's students to have higher data literacy skills.
Research question four examined the influence of institution of affiliation on graduate students' level of data literacy. The result in Table 8 shows that Rivers State University had a slightly higher mean rating (3.06), followed by the University of Nigeria, Nsukka (M=3.02) and University of Port Harcourt (M=3.02). Nnamdi Azikiwe University had a mean rating of 2.99 while Ignatius Ajuru University had the lowest mean rating of 2.91. The ANOVA test in Table 9 shows a statistically significant difference among the institutions, and Scheffe's post hoc test in Table 10 indicates that the difference is only between Rivers State University and Ignatius Ajuru University. This result is not surprising in light of the demographic information. Rivers State University had 14 PhD students who participated in the study while Ignatius Ajuru University had none. This is consistent with the findings for research question three and Ho2, which show that the higher the educational level, the more data literacy skills graduate students possess.

Conclusions
Overall, this study provides insight into the level of data literacy skills possessed by graduate students in Southern Nigeria. It was found that graduate students had moderate levels of data literacy skills. Also, the male graduate students had higher data literacy skills than their female counterparts. Higher degree (PhD and Master's) students had higher data literacy skills as compared to those with lower degrees (PGDE and first degrees). This study shows that, despite the importance of data literacy skills such as data collection, hypotheses statements, data analysis, and data visualization and interpretation, the graduate students, especially those with first degrees and PGDE, appeared not to possess enough of those skills. It is therefore recommended that data literacy workshops be organized for graduate students on a regular basis to help them acquire the requisite data literacy skills. This will help in bridging the gap between knowledge and practice through more hands-on, active learning opportunities. It will also enable students to meet the global standard and become very useful to their societies.