Modelling Summer Daily Peak Loads in South Africa Using Discrete Time Markov Chain Analysis

Electricity demand in South Africa exhibits a large degree of randomness, particularly in summer. Describing it requires detailed statistical analysis, in particular the use of stochastic processes. This paper presents a Markov chain analysis of peak electricity demand. The data are from South Africa's power utility company, Eskom, for the period 2000 to 2011. This modelling approach is important to decision makers in the electricity sector, particularly for scheduling maintenance and refurbishment of power plants. The randomness is attributable to meteorological factors and the use of major electrical appliances. Aggregated data on daily peak electricity demand are used to develop the transition probability matrices, steady-state probabilities, mean return times and first passage times. Such analyses are important to Eskom and other energy companies in planning load shifting, load analysis and scheduling of electricity, particularly during peak periods in summer.


Introduction
Electricity demand exhibits a large degree of randomness, particularly in South Africa. South Africa faces two related challenges: scheduling power-station maintenance appropriately, and having insufficient capacity to meet consumer electricity demand. Demand is driven mainly by meteorological influences and the use of major electrical appliances, and it also varies with consumer behaviour.
South Africa is heavily reliant on Eskom for the supply of electricity. Eskom generates approximately 95% of the electricity consumed in South Africa and approximately 45% of the electricity consumed in Africa [6].
Electricity is one of the most important commodities, a means of social advancement and a core driver of the economy. Hence the development of electricity infrastructure and the refurbishment of power stations are vital.
While huge investments have already been made in new power stations for electricity generation, accelerated maintenance and expansion are needed to unlock the growth potential of sectors such as tourism and agriculture. However, there is great uncertainty about consumer demand.
Although generation capacity is currently sufficient for a large fraction of the South African population, consumer demand for electricity keeps increasing owing to population growth, rural electrification and industrial development. Electricity demand in South Africa varies from sector to sector, and its main driver is temperature, as discussed by [4]. Under these circumstances, South Africa's power utility company is challenged to keep up with rising electricity demand. As noted by [2], electricity demand varies with season. Demand is generally significantly higher in winter than in summer, although consumer behaviour means that summer demand can at times also be significantly high. Because of the heavy demand and its stochastic fluctuation in winter, summer is the appropriate time for grid companies to carry out restoration and maintenance of power stations.
The use of major electrical appliances (refrigerators, ventilators, air conditioners, etc.) accounts for a large proportion of overall daily demand in summer. This makes it difficult for Eskom to carry out maintenance and refurbishment of power stations. The paper focuses on the use of Markov chain analysis to model summer daily peak demand in order to help schedule the maintenance of power plants.
Scheduling maintenance requires knowing when extreme demand changes (electricity demand above a sufficiently high threshold) are likely to occur. This paper contributes towards understanding the impact of daily extreme electricity demand in South Africa during the summer season, and the interactions between the factors and variables that drive demand.
We model the extreme positive daily changes in summer peak electricity demand using discrete time Markov chain (DTMC) analysis. We calculate the transition probabilities, steady state probabilities, mean return times and first passage times. Extreme Value Theory (EVT) is then used to model the extreme summer increases in peak electricity demand. This provides decision makers in the electricity sector with a tool to help with scheduling, planning and risk management.
The rest of the paper is organized as follows. Section 2 provides the models. This is followed by a practical application and a discussion of tail quantile estimation and the calculation of probability of exceedance levels using the Generalized Pareto Distribution (GPD). The last section concludes and provides a short overview of the findings.

Discrete time Markov chains
A Markov chain, in general, describes the movement of a system from one state to another. A discrete time Markov chain (DTMC) is a Markov chain in which transitions occur at discrete time steps; here the state space is finite. One advantage of Markov chain models is that they capture all dominant factors of uncertainty [9]. In this study, DTMC analysis involves modelling the transition probabilities of the daily peak electricity demand changes in different hours. First, we define two states, positive daily changes (increase) and negative daily changes (decrease). The probability distribution of the transitions from one state to another is represented by a transition matrix $P$, where the element in position $(i, j)$ is the probability $p_{ij}$.
This technique is characterized by the Markov property, which is explained in section 1.C. We then calculate the steady state probabilities, mean return times and first passage times of the daily peak electricity demand in different hours. The two-state problem is then extended to a three-state problem, in which the daily electricity demand changes in different hours are split into small increases, extreme increases of the hourly peak demands, and decreases (negative hourly changes). To do this, we first fit a Pareto quantile plot to the positive daily changes to determine a sufficiently high threshold ($\tau$); all exceedances are regarded as extreme positive hourly demand changes and those below $\tau$ as small increases. A transition probability matrix is then developed, from which the steady state probabilities, mean return times and first passage times are calculated. Finally, we use the Generalized Pareto Distribution (GPD) to calculate extreme tail quantiles and the Probability of Exceedance (PoE) of the extreme peak electricity demand in different hours.

Description of the data
The transition matrices, steady-state probabilities, mean return times and first passage times are developed from historical summer daily peak electricity demand (SDPED) data from Eskom, the South African power utility. Eskom is responsible for supplying electricity to the whole South African population (about 50 million people) and to other parts of Southern Africa. The historical data comprise aggregated hourly data (in megawatts) for the period 2000-2010. The aggregated data come from various sectors (manufacturing, mining, tourism, communication, transport, finance and business services, etc.) that contribute to South Africa's economic growth. The winter period in South Africa is defined as 15 May to 14 August of every year, and 15 August to 14 May is considered the summer period.
Figure 1 shows a time series plot of the summer daily peak electricity demand, which exhibits both an upward trend and strong seasonality. As Figure 1 shows, the data are not stationary. Of the several ways of making the data stationary, we use differencing in this study. The summer daily peak electricity demand series is stationary after the first difference, as shown in Figure 2(a). The first difference of the series is given as:
$$z_t = x_t - x_{t-1},$$
where $z_t$ denotes the summer daily change in peak electricity demand and $x_t$ denotes the summer daily peak electricity demand on day $t$.
To determine the states, we initially set the threshold to zero, which results in a two-state problem with the states defined as:
$$\text{state} = \begin{cases} 1 \ (\text{increase}), & z_t > 0, \\ 2 \ (\text{decrease}), & z_t \le 0. \end{cases}$$
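The differencing and the two-state classification can be illustrated with a minimal Python sketch. It is not the authors' code: the peak values are made up, the function names are hypothetical, and it assumes that state 1 corresponds to $z_t > 0$ and state 2 to $z_t \le 0$.

```python
def first_difference(peaks):
    """Return z_t = x_t - x_{t-1} for consecutive daily peaks."""
    return [curr - prev for prev, curr in zip(peaks, peaks[1:])]


def two_state(changes, threshold=0.0):
    """Label each change as state 1 (z_t > threshold) or state 2 (otherwise)."""
    return [1 if z > threshold else 2 for z in changes]


# Hypothetical daily peak demand in MW (illustrative values, not Eskom data).
peaks = [27500.0, 27950.0, 27800.0, 28400.0, 28400.0, 27900.0]
changes = first_difference(peaks)
states = two_state(changes)

print(changes)
print(states)
```

A zero change is assigned to state 2 here, which follows from the assumed convention that only strictly positive changes count as increases.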

Markov Property
The Markov property states that the distribution of the forthcoming state $X_{n+1}$ depends only on the current state $X_n$ and not on the previous states $X_{n-1}, X_{n-2}, \ldots, X_1$. A sequence of random variables with this property is said to be memoryless. Correspondingly, the transition probabilities of an increase or decrease in summer daily peak electricity demand (SDPED) depend only on the current state (either an increase (state 1) or a decrease (state 2)) and not on the past evolution of SDPED. This can be represented mathematically as:
$$P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1) = P(X_{n+1} = j \mid X_n = i).$$

Estimation of transition probabilities (daily changes)
The chain moves successively from one state to another; each such change is called a 'transition' or 'step' [7]. Let $p_{ij}$ be the probability of moving from state $i$ to state $j$, where $i, j = 1, 2$. Mathematically this can be represented as:
$$p_{ij} = P(X_{n+1} = j \mid X_n = i),$$
where state 1 is an increase and state 2 is a decrease. The two-state transition matrix is given as:
$$P = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix}.$$
If we let $p_{12} = \gamma$ then $p_{11} = 1 - \gamma$. Equally, if we let $p_{21} = \psi$ then $p_{22} = 1 - \psi$. The transition matrix above can then be written as:
$$P = \begin{pmatrix} 1 - \gamma & \gamma \\ \psi & 1 - \psi \end{pmatrix}.$$
The transition probabilities $p_{ij}$ are estimated using the formula
$$\hat{p}_{ij} = \frac{n_{ij}}{n_i},$$
where $n_{ij}$ is the number of observed transitions from state $i$ to state $j$ and $n_i = \sum_{j} n_{ij}$ is the total number of observed transitions out of state $i$. The set of all possible states $S = \{S_1, S_2\}$ of the summer daily peak electricity demand is the state space of the chain.
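The counting estimator (the number of observed transitions from state $i$ to state $j$, divided by the number of transitions out of state $i$) can be sketched as follows. This is an illustrative Python version, not the R package used in the paper; the state sequence is invented and the function name is hypothetical.

```python
def estimate_transition_matrix(states, n_states=2):
    """Estimate p_ij as n_ij / n_i from a sequence of states labelled 1..n_states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current - 1][following - 1] += 1
    matrix = []
    for row in counts:
        total = sum(row)  # n_i: transitions observed out of state i
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical sequence of daily states (1 = increase, 2 = decrease).
states = [1, 2, 2, 1, 2, 1, 1, 2, 2, 2]
P_hat = estimate_transition_matrix(states)

for row in P_hat:
    print(row)
```

Each estimated row sums to one whenever the corresponding state has at least one observed outgoing transition.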

n-step transition probabilities
The entries $p_{ij}$ of the transition matrix $P$ are the one-step transition probabilities. The $n$-step transition probability of a Markov chain is the probability that the system goes from state $i$ to state $j$ in $n$ steps. It is given by the equation
$$p_{ij}^{(n)} = P(X_{m+n} = j \mid X_m = i),$$
which is the $(i, j)$ entry of $P^n$. The associated 2-step transition matrix of the two-state problem is
$$P^{2} = \begin{pmatrix} (1-\gamma)^2 + \gamma\psi & \gamma(2 - \gamma - \psi) \\ \psi(2 - \gamma - \psi) & (1-\psi)^2 + \gamma\psi \end{pmatrix}.$$
As $n$ approaches infinity, $p_{ij}^{(n)}$ converges to a limiting value $\eta_j$ that is independent of the initial state $i$. Here $\eta_j$ is the limiting (steady state) probability of state $j$, and the steady state probabilities of the Markov chain can be represented mathematically as:
$$\eta_j = \lim_{n \to \infty} p_{ij}^{(n)}, \qquad \eta = \eta P, \qquad \sum_{j} \eta_j = 1.$$
For the two-state problem, the steady state probabilities are:
$$\eta_1 = \frac{\psi}{\gamma + \psi}, \qquad \eta_2 = \frac{\gamma}{\gamma + \psi}.$$
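The convergence of the $n$-step probabilities can be checked numerically. The sketch below assumes illustrative values $\gamma = 0.5$ and $\psi = 0.3$ (not estimates from the data), and compares a high power of $P$ with the standard closed-form steady state of a two-state chain, $\eta_1 = \psi/(\gamma+\psi)$ and $\eta_2 = \gamma/(\gamma+\psi)$. All function names are hypothetical.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step(P, n):
    """Return the n-step transition matrix P^n for n >= 1."""
    result = P
    for _ in range(n - 1):
        result = mat_mul(result, P)
    return result


def two_state_steady_state(gamma, psi):
    """Closed-form (eta_1, eta_2) for P = [[1-gamma, gamma], [psi, 1-psi]]."""
    total = gamma + psi
    return psi / total, gamma / total


gamma, psi = 0.5, 0.3  # illustrative values only
P = [[1 - gamma, gamma], [psi, 1 - psi]]

P2 = n_step(P, 2)    # 2-step transition probabilities
P50 = n_step(P, 50)  # both rows are numerically equal to the steady state
eta = two_state_steady_state(gamma, psi)

print(P2)
print(P50[0], P50[1])
print(eta)
```

Both rows of the high power agree with the closed form, which illustrates that the limit does not depend on the starting state.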

Mean Return times
For an ergodic (irreducible) Markov chain, the mean return time of state $l$ is the expected number of transitions until the chain first returns to state $l$, where $T_l$ denotes the time of that first return. This is represented mathematically as:
$$MR_l = E[T_l \mid X_0 = l] = \frac{1}{\eta_l}, \qquad l = 1, \ldots, r,$$
where $\eta = [\eta_1, \ldots, \eta_r]$ is the stationary probability vector of the transition probability matrix $P$, and $r$ is the number of states.
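The relationship between the mean return time and the stationary probability (mean return time equal to $1/\eta_l$) can be checked by simulation. The following sketch is an illustration under assumed transition probabilities, not part of the original analysis; states are indexed from 0 in the code, and the function names are hypothetical.

```python
import random


def next_state(row, u):
    """Pick the next state index from one row of P using a uniform draw u."""
    cumulative = 0.0
    for j, p in enumerate(row):
        cumulative += p
        if u < cumulative:
            return j
    return len(row) - 1  # guard against floating-point round-off


def simulated_mean_return(P, state, n_steps, seed=0):
    """Average number of steps between successive visits to `state`."""
    rng = random.Random(seed)
    current, last_visit, gaps = state, 0, []
    for t in range(1, n_steps + 1):
        current = next_state(P[current], rng.random())
        if current == state:
            gaps.append(t - last_visit)
            last_visit = t
    return sum(gaps) / len(gaps)


# Illustrative two-state chain (not the estimated Eskom matrix).
P = [[0.5, 0.5], [0.3, 0.7]]
eta_0 = 0.3 / (0.5 + 0.3)  # stationary probability of the first state
theoretical = 1.0 / eta_0
simulated = simulated_mean_return(P, state=0, n_steps=200_000)

print(theoretical, simulated)
```

With a long enough run the simulated average should lie close to the theoretical value; the agreement is statistical, so small differences are expected.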

First Passage times
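The first passage time (FPT) $T_{ij}$ from state $i$ to state $j$ is the number of transitions (steps) taken by the Markov chain until it arrives for the first time at state $j$, given that the initial state was $i$ ($X_0 = i$). The probability distribution of the first passage time is given by equations (10) and (11):
$$f_{ij}^{(n)} = P(T_{ij} = n) = P(X_n = j, X_{n-1} \ne j, \ldots, X_1 \ne j \mid X_0 = i). \qquad (10)$$
For $n = 1$ in equation (10) we have $f_{ij}^{(1)} = p_{ij}$, and for $n \ge 2$ the first passage time probability is given by:
$$f_{ij}^{(n)} = \sum_{k \ne j} p_{ik} \, f_{kj}^{(n-1)}. \qquad (11)$$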
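The first passage time distribution can be computed with the standard recursion $f_{ij}^{(1)} = p_{ij}$ and $f_{ij}^{(n)} = \sum_{k \ne j} p_{ik} f_{kj}^{(n-1)}$ for $n \ge 2$. The sketch below is a minimal illustration with an assumed transition matrix; states are indexed from 0 and the function name is hypothetical.

```python
def first_passage_probs(P, i, j, n_max):
    """Return [f_ij^(1), ..., f_ij^(n_max)] for a transition matrix P."""
    n_states = len(P)
    # f_prev[k] holds f_kj^(n-1) for every possible starting state k.
    f_prev = [P[k][j] for k in range(n_states)]
    result = [f_prev[i]]
    for _ in range(2, n_max + 1):
        # Step to any state m other than j, then reach j for the first time later.
        f_curr = [
            sum(P[k][m] * f_prev[m] for m in range(n_states) if m != j)
            for k in range(n_states)
        ]
        result.append(f_curr[i])
        f_prev = f_curr
    return result


# Illustrative two-state chain (not the estimated Eskom matrix).
P = [[0.5, 0.5], [0.3, 0.7]]
fpt = first_passage_probs(P, i=0, j=0, n_max=4)
print(fpt)
```

For a two-state chain the probabilities from the second step onward shrink by a constant factor at each step, which is the geometric (discrete exponential) decay referred to in the results.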

Empirical results and discussion
Using the R statistical software package "Markovian" developed by [7], the transition matrix for summer daily peak electricity demand (SDPED) for states 1 and 2 is:
$$P = \begin{pmatrix} 0.4295830 & 0.5704170 \\ 0.3846971 & 0.6153029 \end{pmatrix}.$$
The transition matrix $P$ shows that the probability that the peak electricity demand moves from state 1 (increase) back to state 1 the next day is 0.4295830. In other words, there is approximately a 43.0% chance that an increase is followed by another increase, while the probability of moving from state 1 to state 2 is 0.5704170. The probability that the system moves from state 2 (decrease) back to state 2 is 0.6153029, so the probability of moving from state 2 to state 1 is $1 - 0.6153029 = 0.3846971$. This kind of information is crucial for the proper planning, analysis and facilitation of power system operations, in order to ensure that outages, blackouts and load shedding do not occur.
Solving $\eta = \eta P$ with $\eta_1 + \eta_2 = 1$ for this matrix gives steady state probabilities of approximately $\eta_1 \approx 0.403$ and $\eta_2 \approx 0.597$. The results indicate that, given the current state of the system is an increase, the system can expect to experience another increase in about 2.5 days (approximately 60 hours). Similarly, given the current state of the system is a decrease (state 2), we expect another decrease in about 1.7 days (approximately 41 hours).
A short-term summary of the first passage time probabilities (for the next seven days) is given in Table 1. The results indicate that, given the current state of the system is state 1, the first passage time probabilities of returning to state 1 (increase) decrease approximately exponentially over the next seven days. Given that the current state is state 1, the probability that the system will be in state 1 again the next day is 43.0%, and the probability that it will be in state 2 is 57.0%.
This information is very important to forecasters in determining which periods are most appropriate for undertaking maintenance and for monitoring the security of the national grid. Energy security is very important for the economic growth of any country [9]. Table 2 summarizes the short-term first passage time probabilities (for the next seven days) given that the current state of the system is state 2. These probabilities decrease over the next seven days. The probability that the system will be in state 2 (decrease) on the first day is approximately 62%, and on the second day approximately 21.9%.
Figure 3 shows that the distributions of the two states (increase and decrease) are similar in shape, with probabilities that decay approximately exponentially with time (days). Figure 4 shows the same pattern, with the two distributions nearly similar in shape.
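The reported two-state results can be cross-checked from the two diagonal entries quoted in the text, 0.4295830 and 0.6153029. The sketch below is a consistency check rather than the authors' computation: the off-diagonal entries are derived from the requirement that each row sums to one, and the steady state uses the standard two-state closed form. Tables 1 and 2 are not reproduced here, so the seven-day values printed are derived quantities, not the tabulated ones.

```python
# Diagonal entries as reported in the text; off-diagonals follow from row sums.
p11, p22 = 0.4295830, 0.6153029
P = [[p11, 1.0 - p11], [1.0 - p22, p22]]

gamma, psi = P[0][1], P[1][0]
eta = [psi / (gamma + psi), gamma / (gamma + psi)]  # steady state probabilities
mean_return_days = [1.0 / e for e in eta]


def return_probs(P, state, n_max):
    """First passage probabilities of returning to `state` in a two-state chain."""
    other = 1 - state
    probs = [P[state][state]]
    for n in range(2, n_max + 1):
        # Leave once, stay away for n - 2 steps, then come back.
        probs.append(P[state][other] * P[other][other] ** (n - 2) * P[other][state])
    return probs


fpt_increase = return_probs(P, state=0, n_max=7)  # current state 1 (increase)
fpt_decrease = return_probs(P, state=1, n_max=7)  # current state 2 (decrease)

print([round(e, 4) for e in eta])
print([round(m, 2) for m in mean_return_days])
print([round(f, 4) for f in fpt_decrease])
```

Under these assumptions the mean return times round to about 2.5 and 1.7 days, and the second-day return probability for state 2 is about 21.9%, in line with the figures quoted in the text.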

Modelling extreme peaks for the two-state problem
We fit a Pareto quantile plot to the positive summer daily changes to determine a sufficiently high threshold, $\tau$. All observations above the threshold (exceedances) are defined as extreme daily positive changes and classified as state 1, i.e. $z_t > \tau$; all observations less than or equal to the threshold are classified as state 2, i.e. $z_t \le \tau$.
A Pareto quantile plot is defined as the scatter plot of the points $(-\log(1 - p_i), \log z_i)$, where $p_i = i/(n+1)$ and $z_i$ are the ordered positive daily changes in summer peak electricity demand, for $i = 1, \ldots, n$. Figure 5 shows that the threshold is $\tau = \exp(7.8846) = 2656.1$ MW, with 164 observations in the tail.
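The coordinates of a Pareto quantile plot can be computed as in the following sketch. The daily changes are invented for illustration, the function name is hypothetical, and the sketch assumes that only strictly positive changes enter the plot and that they are sorted in increasing order before the plotting positions $p_i = i/(n+1)$ are assigned.

```python
import math


def pareto_quantile_points(changes):
    """Return the points (-log(1 - p_i), log z_i) for the positive changes."""
    z = sorted(v for v in changes if v > 0)
    n = len(z)
    return [(-math.log(1.0 - i / (n + 1)), math.log(z_i))
            for i, z_i in enumerate(z, start=1)]


# Hypothetical daily changes in MW (not Eskom data).
changes = [120.0, -40.0, 900.0, 3000.0, 15.0]
points = pareto_quantile_points(changes)

# The threshold is read off the log scale of the plot and converted back to MW.
threshold_mw = math.exp(7.8846)

for x, y in points:
    print(round(x, 4), round(y, 4))
print(round(threshold_mw, 1))
```

The threshold is usually chosen by eye as the point beyond which the plotted points look roughly linear, so the choice involves some judgement.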
The mean return times for the two states follow from the steady state probabilities: with a steady state probability of 0.05233111 for an extreme increase, the mean return time is $1/0.05233111 \approx 19.1$ days for state 1 and $1/(1 - 0.05233111) \approx 1.06$ days for state 2. Given that the current state is an extreme increase, Table 3 shows the probabilities of change in the two states (state 1 and state 2). Perhaps counterintuitively, the probability of another extreme increase on the following day is 0, so the following day is certain to have no extreme increase. Given that the current state is no extreme increase, Table 4 shows that the probability of observing an extreme increase the next day is 0.05135045, while the probability of remaining in the same state (no extreme increase) is 0.94864950.

Three-state Problem
We extend the two-state problem to a three-state problem, which categorizes both positive and negative summer daily changes. The positive daily changes are split into two states, small increases and extreme increases, and the negative daily changes are categorized as decreases. The three states are defined as:
state 1: extreme increase ($z_t > \tau$);
state 2: small increase ($0 < z_t \le \tau$);
state 3: decrease ($z_t \le 0$).
The mean return times of the three-state problem are given as:
$MR_1 = 1/\eta_1 = 1/0.05234204 = 19.1$ days for the extreme increase state. This gives approximately 14 days in the whole of the summer period.
$MR_2 = 1/\eta_2 \approx 1/0.3508 \approx 2.9$ days for the small increase state, where $\eta_2 = 1 - \eta_1 - \eta_3$. This gives approximately 94 days in the whole of the summer period.
$MR_3 = 1/\eta_3 = 1/0.5969009 = 1.7$ days for the decrease state ($z_t \le 0$). This gives approximately 160 days in the whole of the summer period.
Table 5 shows that, given the current state is an extreme increase, the probability that the next day we will observe a small increase is 0.51515150 and the probability that we will observe an extreme increase is 0. Given that the current state is an extreme increase, Table 6 shows that the probability of observing an extreme increase in summer daily changes the next day is 0, and on day 2 is 0.05021167. Table 7 shows that the probability that the next day we will observe an extreme increase is 0.03342367.
Figure 6 presents the first passage times given that the current state of the system is state 2 (small increase): the probability distributions for state 1 and state 3 (decrease) appear to decrease exponentially, while that for state 2 increases slowly at a decreasing rate. Figure 7 presents the first passage times given that the current state is a decrease (state 3): the probability distributions of state 1 and state 2 decrease exponentially over the next 7 days.
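The conversion from a mean return time to an expected number of days per summer can be sketched as follows. The sketch assumes the summer period defined earlier (15 August to 14 May, counted inclusively across a non-leap February) and uses the rounded mean return times quoted in the text; the helper name is hypothetical.

```python
from datetime import date

# Length of one summer period, 15 August to 14 May inclusive.
summer_days = (date(2001, 5, 14) - date(2000, 8, 15)).days + 1


def days_per_summer(mean_return_days):
    """Expected number of days spent in a state during one summer period."""
    return summer_days / mean_return_days


extreme_days = days_per_summer(19.1)   # extreme increase state
decrease_days = days_per_summer(1.7)   # decrease state

print(summer_days)
print(round(extreme_days, 1), round(decrease_days, 1))
```

Because the mean return times are rounded before dividing, the resulting day counts are approximate, consistent with the "approximately 14" and "approximately 160" days reported.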

Probability of exceedance levels using Generalized Pareto distribution (peaks-over-threshold distribution)
We consider the peaks (summer daily peak electricity demand) above the threshold $\tau = 2656.1$ MW. There are several ways to model the observations above the threshold (exceedances). We use the Generalized Pareto Distribution (GPD), discussed in [3], to model them. The parameters $(\sigma, \xi)$ are estimated by the maximum likelihood method discussed in section 3, using the R statistical package "ismev" by [5]. The estimates and their respective standard errors are given in Table 8. From Table 8, the standard errors of the parameter estimates are small, which indicates that the level of uncertainty in estimating the parameters is small. The 95 percent confidence intervals for the parameters $\sigma$ and $\xi$ are $\hat{\sigma} \pm Z_{\alpha/2}\,\mathrm{S.E.} = (305.478; 456.237)$ and $\hat{\xi} \pm Z_{\alpha/2}\,\mathrm{S.E.} = (-0.3282; -0.06986)$ respectively.
Since the shape parameter $\xi < 0$, the Generalized Pareto Distribution of the excesses belongs to the Weibull class of distributions (Beta distribution). The distribution has a short tail with a finite upper end point, $\tau - \sigma/\xi$. The tail quantile is obtained from
$$p = \left(1 + \xi\,\frac{\hat{z}_p - \tau}{\sigma}\right)^{-1/\xi},$$
where $p$ is the upper tail probability and $\hat{z}_p$ is the return level with period $1/p$. Making $\hat{z}_p$ the subject of the formula we get
$$\hat{z}_p = \tau + \frac{\sigma}{\xi}\left(p^{-\xi} - 1\right).$$
The diagnostic plots of the exceedances above the threshold $\tau = 2656.1$ MW are shown in Figure 8, which shows that the GPD is a good fit to the exceedances. Table 9 shows the number of exceedances relative to the corresponding tail probabilities. At the 99th percentile, the estimated tail quantile is 3801.93 MW. The observations greater than the estimated tail quantile are then counted and found to be 3. For the expected number of exceedances at a tail probability of 0.05, we get $0.05 \times 164 = 8.2 \simeq 8$, where 164 is the number of observations above the threshold.
Figure 9 presents the frequency analysis of the summer daily peak electricity demand changes over the sampling period of 12 years. In South Africa, extreme summer daily changes occur mostly in April, and the fewest are observed in December. An accurate assessment of the frequency of summer daily peak loads helps system operators and decision makers in determining the critical peak days, scheduling maintenance, load flow analysis and dispatching of electricity.
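The tail quantile calculation can be illustrated with a short sketch. It assumes the conditional GPD tail formula $\hat{z}_p = \tau + (\sigma/\xi)(p^{-\xi} - 1)$, where $p$ is the tail probability among exceedances. Because the point estimates in Table 8 are not reproduced here, the sketch takes the midpoints of the reported 95% confidence intervals as stand-ins for $\hat{\sigma}$ and $\hat{\xi}$; this is an assumption, as are all function names.

```python
import math


def gpd_tail_quantile(p, tau, sigma, xi):
    """Quantile exceeded with probability p among exceedances (xi != 0)."""
    return tau + (sigma / xi) * (p ** (-xi) - 1.0)


def gpd_exceedance_prob(z, tau, sigma, xi):
    """Probability that an exceedance is larger than z, for z >= tau and xi < 0."""
    inner = 1.0 + xi * (z - tau) / sigma
    if inner <= 0.0:
        return 0.0  # z lies beyond the finite upper end point
    return inner ** (-1.0 / xi)


def gpd_upper_endpoint(tau, sigma, xi):
    """Finite upper end point of the fitted GPD; only meaningful when xi < 0."""
    return tau - sigma / xi


tau = 2656.1                         # threshold in MW, from the text
sigma = (305.478 + 456.237) / 2.0    # assumed: midpoint of the reported interval
xi = (-0.3282 + -0.06986) / 2.0      # assumed: midpoint of the reported interval

z_99 = gpd_tail_quantile(0.01, tau, sigma, xi)   # 99th percentile of exceedances
endpoint = gpd_upper_endpoint(tau, sigma, xi)
expected_above = 0.05 * 164                      # expected count at p = 0.05

print(round(z_99, 1), round(endpoint, 1), round(expected_above, 1))
print(round(gpd_exceedance_prob(z_99, tau, sigma, xi), 4))
```

Under these assumed parameter values the 99th percentile comes out close to the reported 3801.93 MW, which suggests that the conditional form of the quantile formula is the one used; the exact figure depends on the actual estimates in Table 8.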

Conclusion
The paper discussed a Markov chain analysis of summer daily changes in peak electricity demand. Empirical results show that the steady state probability of an extreme increase ($z_t > 2656.1$ MW) in daily peak electricity demand is 0.05233111. This resulted in a mean return time of about 19 days. We extended the two-state problem to a three-state problem. The steady state probability of an extreme increase was calculated as 0.05234204 with a return period of 19.1 days, giving approximately 14 days of extreme increases in the summer period. Extreme daily electricity demand changes occurred mostly in April, while December had the fewest extreme daily changes of all the summer months. December therefore appears to be an appropriate time to carry out maintenance on power stations. Such an analysis is important because it helps system operators and decision makers to determine the critical peak days, so that maintenance can be scheduled without causing unnecessary power outages.
Discrete time Markov chain analysis is very important in modelling electricity demand; however, further research is necessary. Areas for further research include incorporating economic variables (such as Gross Domestic Product), hourly electricity demand and natural gas prices, and making long-term predictions of electricity demand.
A comparative analysis with Bayesian inferential estimates will also be carried out in future work.