Forecasting Electricity Consumption Using SARIMA Method in IBM SPSS Software

Forecasting is the prediction of future values based on historical data. It can be conducted using various methods, such as statistical methods or machine learning techniques. Electricity is a necessity of modern life; hence, accurate forecasting of electricity demand is important. Overestimation wastes energy, whereas underestimation leads to higher operation costs. Universiti Tun Hussein Onn Malaysia (UTHM) is a developing Malaysian technical university, so there is a need to forecast UTHM electricity consumption to support future decisions on electric power generation, load switching, and infrastructure development. The monthly UTHM electricity consumption data exhibit seasonality, that is, periodic fluctuations. Thus, the seasonal autoregressive integrated moving average (SARIMA) method was applied in IBM SPSS software to predict UTHM electricity consumption for 2019 via the Box-Jenkins method and Expert Modeler. A total of 120 monthly observations, from January 2009 to December 2018, were used to build the models. The best model from both methods is SARIMA(0, 1, 1)(0, 1, 1). The result obtained through the Box-Jenkins method is approximately the same as the result generated by Expert Modeler in SPSS, with a MAPE of 8.4%.


Introduction
Electricity is the most versatile form of energy and is one of the essential infrastructural inputs to economic development. Overestimating electricity consumption leads to superfluous idle capacity, which means wasted financial resources, whereas underestimating it raises operation costs for the energy supplier and may cause energy outages. Therefore, modelling electricity consumption with good accuracy is vital to avoid costly mistakes and other unwanted effects. Forecasting means predicting the future with a set of statistical tools and techniques that are supported by human judgement and intuition; it uses a set of historical data to analyse the trend and then predict future values [1].
UTHM is a developing Malaysian technical university; hence, there is a need to forecast UTHM electricity consumption to support future decisions on electric power generation, load switching, and infrastructure development. Its monthly electricity consumption data from January 2009 to December 2018 were collected. The data exhibit seasonality, that is, periodic fluctuations, so SARIMA was chosen to forecast the consumption for the year 2019. SARIMA has been used successfully to forecast not only electricity consumption but also dengue occurrence [17] and foreign tourism at an airport [18]. The best SARIMA model obtained by following the Box-Jenkins methodology [19] will be compared with the model from Expert Modeler in IBM SPSS software.

SARIMA
If a time series exhibits a seasonal pattern, then a SARIMA model of order (p, d, q)(P, D, Q)$_S$, written here in its standard multiplicative form, will be used:

$$\phi_p(B)\,\Phi_P(B^S)\,(1 - B)^d\,(1 - B^S)^D\,y_t = \theta_q(B)\,\Theta_Q(B^S)\,\varepsilon_t$$

where $B$ is the backshift operator ($B y_t = y_{t-1}$), $\phi$ and $\theta$ are the parameters of the autoregressive (AR) and moving average (MA) polynomials, and $\Phi$ and $\Theta$ are the parameters of the seasonal autoregressive (SAR) and seasonal moving average (SMA) polynomials, respectively. Meanwhile, p, d and q are the orders of the autoregressive part AR(p), the differencing and the moving average part MA(q), whereas P, D and Q are the orders of the seasonal autoregressive part SAR(P), the seasonal differencing and the seasonal moving average part SMA(Q), respectively. S is the seasonal length, $y_t$ is the predicted variable and $\varepsilon_t$ is the random error at time t.
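As a worked illustration, consider the order (0, 1, 1)(0, 1, 1) with S = 12, which is the model reported in the abstract. Here $B$ denotes the backshift operator ($B y_t = y_{t-1}$). The expansion below assumes the Box-Jenkins sign convention, in which moving average terms enter with a minus sign; this is an assumption, since some software writes them with a plus sign.

```latex
\begin{align*}
(1 - B)(1 - B^{12})\, y_t &= (1 - \theta_1 B)(1 - \Theta_1 B^{12})\, \varepsilon_t \\
y_t - y_{t-1} - y_{t-12} + y_{t-13} &= \varepsilon_t - \theta_1 \varepsilon_{t-1} - \Theta_1 \varepsilon_{t-12} + \theta_1 \Theta_1 \varepsilon_{t-13} \\
y_t &= y_{t-1} + y_{t-12} - y_{t-13} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \Theta_1 \varepsilon_{t-12} + \theta_1 \Theta_1 \varepsilon_{t-13}
\end{align*}
```

Under this convention, each monthly value is built from the previous month, the same month one year earlier and the month before that, adjusted by the current and three past random errors.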

Materials and Methods
In this study, two approaches to implementing SARIMA, namely the Box-Jenkins methodology and Expert Modeler in IBM SPSS, will be discussed. The monthly UTHM electricity consumption data from January 2009 to December 2018 were collected from the Development and Maintenance Office, UTHM.

Box-Jenkins Methodology
The Box-Jenkins methodology involves five steps: (1) examining the data characteristics, (2) model identification, (3) parameter estimation, (4) diagnostic checking to obtain the best model, and (5) forecasting.

Expert Modeler
Expert Modeler is a black-box tool provided in SPSS that fits a SARIMA model in a few clicks, without manually going through the Box-Jenkins steps described above.

Performance Evaluation
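The performance of both the Box-Jenkins method and Expert Modeler in SPSS will be evaluated in terms of the mean absolute percentage error (MAPE), defined as follows:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

where n is the number of data points, and $y_i$ and $\hat{y}_i$ are the real and predicted values, respectively.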
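The MAPE can be computed with a few lines of code. The following is a minimal sketch in Python rather than SPSS; the function name `mape` and the sample values are hypothetical and are not taken from the UTHM data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |(y_i - yhat_i) / y_i| over all observations, times 100.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical values for illustration only (not UTHM data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
example_mape = mape(actual, predicted)
```

Because each error is divided by the actual value, the measure is scale-free, but it cannot be used when an actual value is zero, which is why the sketch rejects that case.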

Results and Discussions
In this section, the results obtained from both Box-Jenkins methodology and Expert Modeler SPSS will be presented.

Box-Jenkins Method
The results obtained through the five-step Box-Jenkins methodology are discussed as follows.
The electricity consumption in certain months was lower than usual, possibly because those months fell mostly within the UTHM semester break. With fewer students on campus, less electricity was consumed than in the months outside the semester break.
From the ACF plot, it can be clearly seen that the UTHM electricity consumption series from 2009 to 2018 was not stationary, because a nonstationary series has an ACF that remains significant for half a dozen or more lags rather than quickly declining to 0 [20]. Hence, first differencing was needed. The time series plot of the monthly electricity consumption of UTHM from January 2009 to December 2018 with d = 1 is shown in Figure 5.
From Figure 6, there is significant autocorrelation at multiples of 12 lags (lag 12 and lag 24), which indicates that the seasonality of the data needed to be considered [21]. Hence, first seasonal differencing was needed. The time series plot of the monthly electricity consumption of UTHM from January 2009 to December 2018 with d = 1 and D = 1 is shown in Figure 8.
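The differencing and ACF inspection described above can be sketched in plain Python. The series below is synthetic (a linear trend plus a fixed 12-month pattern), not the UTHM data. The bound 1.96/sqrt(n) is a common approximation for the 95% limits, and it is an assumption that it is close to the limits SPSS draws.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose |r_k| exceeds the approximate 95% bound 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    r = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]


# Synthetic monthly series (NOT the UTHM data): linear trend + 12-month pattern.
pattern = [5, 3, 8, 6, 9, 4, 2, 7, 10, 6, 8, 1]
y = [2.0 * t + pattern[t % 12] for t in range(120)]

d1 = difference(y, lag=1)        # regular differencing, d = 1
d1_D1 = difference(d1, lag=12)   # seasonal differencing, D = 1, S = 12

seasonal_spikes = significant_lags(d1, 24)  # includes lags 12 and 24
```

In this noise-free example, the first difference removes the trend but leaves spikes at lags 12 and 24, and the seasonal difference then removes the repeating pattern entirely. With real data a random-looking series remains, and its ACF and PACF are what the identification step inspects.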

Model Identification
From the ACF plot in Figure 9, there is significant autocorrelation at lags 1, 10 and 11, which are the candidate values of q, while the significant autocorrelation at lag 12 indicates that Q = 1.
From the PACF plot in Figure 10, there is significant partial autocorrelation at lags 1, 2 and 10, which are the candidate values of p. The partial autocorrelation at lag 12 is present but less significant, which indicates that P = 0 or P = 1.
In addition, since the PACF trails off while the ACF cuts off sharply after lag q, p = 0 also needed to be considered in the models.
In this study, lags 10 and 11 were ignored for the AR and MA orders, because a model built with 10th-order and 11th-order lags is not parsimonious [18].
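The identification step above yields a small grid of candidate models. The sketch below enumerates that grid under one reading of the text: p in {0, 1, 2}, q = 1, P in {0, 1} and Q = 1, with d = D = 1 and S = 12. The exact list of candidates fitted in the study is not given here, so this set is an assumption.

```python
from itertools import product

SEASON_LENGTH = 12   # monthly data
d, D = 1, 1          # one regular and one seasonal difference

# Orders read off the ACF/PACF discussion, with lags 10 and 11 dropped.
# This grid is an interpretation of the text, not the paper's stated list.
p_values = [0, 1, 2]
q_values = [1]
P_values = [0, 1]
Q_values = [1]

candidates = [
    ((p, d, q), (P, D, Q, SEASON_LENGTH))
    for p, q, P, Q in product(p_values, q_values, P_values, Q_values)
]

for order, seasonal_order in candidates:
    print(f"SARIMA{order}{seasonal_order[:3]} with S = {seasonal_order[3]}")
```

Each candidate would then be estimated and compared on its fit statistics and residual diagnostics, and the simplest adequate model would be retained.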

Parameters Estimation
The residual ACF and PACF plots in Figure 11 show random variation around zero, with no systematic pattern in the points above and below it; hence, the fitted model is adequate. The forecast plot of SARIMA(0, 1, 1)(0, 1, 1)$_{12}$ obtained through the Box-Jenkins method is shown in Figure 12, and the forecasted results for the year 2019 are tabulated in Table 2. The forecasted part is enlarged for a clearer view in Figure 13.
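To show how a fitted SARIMA(0, 1, 1)(0, 1, 1) model with S = 12 produces forecasts, the following is a minimal Python sketch that applies the model's difference equation directly. It is not the SPSS procedure. It assumes the Box-Jenkins sign convention (moving average terms subtracted), it sets the start-up residuals to zero, and the parameter values and series are placeholders rather than the paper's estimates or data.

```python
def airline_forecast(series, theta, seasonal_theta, steps, season=12):
    """Forecast a SARIMA(0,1,1)(0,1,1)_season model by direct recursion.

    Residuals before the recursion can start are set to zero, which is a
    conditional approximation rather than exact maximum likelihood.
    """
    n = len(series)
    if n <= season + 1:
        raise ValueError("need more than season + 1 observations")
    values = list(series)
    resid = [0.0] * n

    def predict(t):
        # y_hat[t] = y[t-1] + y[t-S] - y[t-S-1]
        #            - theta*e[t-1] - Theta*e[t-S] + theta*Theta*e[t-S-1]
        return (
            values[t - 1] + values[t - season] - values[t - season - 1]
            - theta * resid[t - 1]
            - seasonal_theta * resid[t - season]
            + theta * seasonal_theta * resid[t - season - 1]
        )

    # One-step-ahead residuals over the observed sample.
    for t in range(season + 1, n):
        resid[t] = values[t] - predict(t)

    # Multi-step forecasts: future errors are replaced by their mean, zero.
    forecasts = []
    for t in range(n, n + steps):
        y_hat = predict(t)
        values.append(y_hat)
        resid.append(0.0)
        forecasts.append(y_hat)
    return forecasts


# Synthetic history (NOT the UTHM data) and placeholder parameter values.
pattern = [5, 3, 8, 6, 9, 4, 2, 7, 10, 6, 8, 1]
history = [2.0 * t + pattern[t % 12] for t in range(120)]
forecast = airline_forecast(history, theta=0.5, seasonal_theta=0.5, steps=12)
```

With 120 monthly observations and `steps=12`, the function returns one forecast for each month of the following year. Forecasts obtained this way can then be scored against the actual values with the MAPE.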

Expert Modeler
The forecasted plot, enlarged forecasted plot and residual plots of ACF and PACF from Expert Modeler SPSS are shown in Figure 14, Figure 15 and Figure 16 respectively.

Comparison of Box-Jenkins Method with Expert Modeler SPSS
The comparison of the Box-Jenkins method with Expert Modeler in SPSS covers the parameter estimates, the model statistics and the forecasted results, as shown in Table 2. It reveals that the results generated by the Box-Jenkins method and by Expert Modeler are approximately the same.

Conclusions
From the results obtained, it can be concluded that both the Box-Jenkins method and Expert Modeler in SPSS are suitable for forecasting the UTHM electricity consumption for the year 2019. However, Expert Modeler is more convenient: the Box-Jenkins method requires the most suitable model to be selected manually from the numerous identified models through various tests, whereas Expert Modeler completes the forecast in just a few clicks.
In this paper, the electricity consumption of UTHM for the year 2019 was forecasted using the SARIMA method in SPSS, through both the Box-Jenkins method and Expert Modeler, with real data from January 2009 to December 2018. The SARIMA model used to forecast the UTHM electricity consumption is SARIMA(0, 1, 1)(0, 1, 1)$_{12}$. The mean absolute percentage error (MAPE) of the forecasted result is 8.4%.