Traffic Flow Prediction in Urban Area Using Inverse Approach of Chaos Theory

Traffic congestion affects everyday life, especially in urban areas. Addressing it requires accurate traffic flow prediction to support better traffic management. This study therefore predicts traffic flow by applying chaos theory to the total volume of vehicles per hour on two main roads in the urban areas of Selangor and Kuala Lumpur, Malaysia. Phase space reconstruction was used to determine whether the hourly vehicle volume data behave chaotically. The reconstruction embeds the single-variable series of total vehicles per hour in an m-dimensional phase space. The inverse approach, together with the local linear approximation method, was then used to develop a prediction model for the traffic flow time series. The study found that (i) the time series data were chaotic, based on the phase space plot, and (ii) the inverse approach can predict the traffic flow time series well, with correlation coefficient values above 0.7500. Hence, the inverse approach of chaos theory can be developed into a prediction model for traffic flow in urban areas, which may help local authorities provide good traffic management.


Introduction
Traffic congestion is a common sight in urban areas; it can endanger people and the environment and cause pollution [1]. Traffic flow prediction is important for traffic management because it provides the accurate information that a well-functioning traffic system needs [2]. Efficient prediction of traffic flows can therefore help to monitor and analyse urban traffic needs and give citizens better choices of public transport [3]. A model that predicts traffic flow accurately has thus become a crucial need, which gives a strong reason for its study and development.
Traffic flow, the total number of vehicles crossing a particular point per unit time, is a point process [4]. Traffic flow is a continuous phenomenon, and the irregular patterns in the data reflect the complexity of the traffic system. The system is influenced by several variables such as volume, speed, density, travel time, and headways, which are important in traffic planning and design processes [5]. Hence, multivariable approaches such as the Bayesian network (BN) [6], artificial neural network (ANN) [5,7], auto-regressive integrated moving average (ARIMA) [4], deep neural network [8] and a hybrid ARIMA-ANN traffic prediction scheme [1] have been widely used in traffic flow analysis. These models and methods treat traffic flow as a random process. Chaos theory, however, holds that the complexity of such a process is not random but arises from chaos in nonlinear dynamical systems [2,9].
The chaos approach reconstructs the vehicles-per-hour time series in a multi-dimensional phase space, and the outcome of this reconstruction is then used in the prediction phase. The application of the chaos approach to traffic flow prediction in Malaysia is still at an early stage. To date, the chaos approach has been applied in Malaysia to meteorology and hydrology, for predicting ozone concentration and river flow [10][11][12], but no study on traffic management using the chaos approach has been conducted.
Many studies of the chaos approach to traffic flow time series data have been carried out elsewhere, for example in China [2,13], Iran [14] and Los Angeles [15]. The approach therefore needs to be applied to traffic flow time series data in Malaysia to determine its suitability. Hence, the focus of this research is to determine whether chaos exists in the data and then to predict the traffic flow by investigating the volume of vehicles per hour in urban areas.
This study used hourly data from 22 September 2014 until 28 September 2014. The 7-day dataset was divided into two parts: (i) a training set consisting of the first 6 days of data and (ii) a testing set consisting of the subsequent day. The training set was used to reconstruct the phase space, while the testing set was used as the observed data in the prediction phase. Figure 1(a) and (b) show the variation in total vehicles per hour on the Klang-Sabak Bernam path (BR807) and the Kuala Lumpur-Karak path (BR902), respectively; each path has 168 data points. The highest vehicle volume on the Klang-Sabak Bernam path, in both directions, reached 80000 vehicles per hour. Meanwhile, the highest vehicle volume on the Kuala Lumpur-Karak path, in both directions, stayed below 20000 vehicles per hour. According to the graphs in Figure 1, the traffic flow data follow a seasonal pattern in which peak and off-peak volumes occur at almost the same time intervals each day. It is therefore important to determine whether chaos exists in these data and then to predict the traffic flow in these two urban areas.
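The training and testing split described above can be sketched as follows. This is a minimal illustration, assuming 24 hourly values per day (168 values over 7 days); the function name `split_train_test` and the synthetic series are hypothetical stand-ins and are not the study's data.

```python
import numpy as np

HOURS_PER_DAY = 24  # hourly observations, so 7 days give 168 values


def split_train_test(series, train_days=6, test_days=1):
    """Split an hourly series into leading training days and trailing test days."""
    series = np.asarray(series, dtype=float)
    n_train = train_days * HOURS_PER_DAY
    n_test = test_days * HOURS_PER_DAY
    if series.size != n_train + n_test:
        raise ValueError("series length does not match the requested day counts")
    return series[:n_train], series[n_train:]


# Synthetic stand-in for one station's 168 hourly counts (NOT the study's data).
hours = np.arange(7 * HOURS_PER_DAY)
synthetic = 1000 + 500 * np.sin(2 * np.pi * hours / HOURS_PER_DAY)

train, test = split_train_test(synthetic)
print(train.size, test.size)
```

Keeping the split strictly chronological means the testing day is never seen while the phase space is reconstructed from the training days.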

Chaotic Behaviour Determination Using Phase Space Plot
Time series data can be divided into two categories, deterministic and random [17]. Deterministic time series data can be predicted, while random time series data are unpredictable. The main reason for detecting chaotic behaviour is to develop a short-term prediction model [18]. Several methods are used to determine the existence of chaotic behaviour in time series data, such as the phase space plot [19], the correlation dimension [20] and the Cao method [10]. This study used the phase space plot to determine the existence of chaotic behaviour in the hourly traffic flow time series data. The phase space plot was chosen because chaotic behaviour can be identified clearly if an attractor exists in the plot.
An appropriate phase space needs to be reconstructed in order to develop a phase space plot. Phase space reconstruction starts from a one-dimensional, univariate time series, which here is the number of vehicles. Suppose that $X$ is the time series of the number of vehicles, written in the form:
$$X = \{x_1, x_2, x_3, \ldots, x_N\} \qquad (1)$$
where $x_t$ is the number of vehicles per hour for $t = 1, 2, 3, \ldots, N$ and $N$ is the total number of data points. The series can be reconstructed into an $m$-dimensional phase space with:
$$Y_t = \left(x_t, x_{t+\tau}, x_{t+2\tau}, \ldots, x_{t+(m-1)\tau}\right) \qquad (2)$$
where $t = 1, 2, \ldots, N - (m-1)\tau$. In (2), the parameter $\tau$ is the time delay and $m$ is the embedding dimension. To plot the 2-dimensional phase space, the value of $\tau$ is fixed as 1 and $m = 2$. The system is said to be chaotic if an attractor exists in the plotted phase space [21].
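The reconstruction step can be illustrated with a short sketch. It assumes the common forward-delay convention, in which each phase space point collects the values at times t, t + tau, ..., t + (m - 1) tau; the function name `reconstruct_phase_space` and the five-value series are hypothetical and chosen only for illustration.

```python
import numpy as np


def reconstruct_phase_space(x, m=2, tau=1):
    """Return delay vectors (x[t], x[t+tau], ..., x[t+(m-1)*tau]) as rows."""
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    # Column k holds the series shifted forward by k * tau.
    return np.column_stack([x[k * tau: k * tau + n_vectors] for k in range(m)])


# Tiny made-up series, only to show the shape of the result.
x = np.array([3.0, 5.0, 8.0, 6.0, 4.0])
Y = reconstruct_phase_space(x, m=2, tau=1)
print(Y)  # rows are the pairs (x[t], x[t+1])
```

With m = 2 and tau = 1, plotting the first column of `Y` against the second gives the 2-dimensional phase space plot that is inspected for an attractor.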

Prediction Using Inverse Approach of Nonlinear Prediction Method
The inverse approach of Sugihara and May [22] was used to predict the number of vehicles per hour because it does not require a large amount of data [23], which suits this study's 168 data points per path. Furthermore, the inverse method helps to determine a possible optimal embedding dimension. An optimal embedding dimension can indicate the possibility of low-dimensional chaos in a system [21]. In the inverse method, the optimal dimension is found by varying the value of $m$ from 2 to 10 in (2). The optimal value of $m$ is the one for which the prediction gives the best correlation coefficient (CC).
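The selection rule can be sketched as follows: compute the CC between the observed values and the prediction obtained for each candidate embedding dimension, then keep the dimension with the highest CC. This is a minimal sketch that assumes the predictions for each dimension have already been produced by some prediction routine; the function names and the toy numbers are hypothetical and are not results from the study.

```python
import numpy as np


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])


def select_embedding_dimension(observed, predictions_by_m):
    """Return the dimension whose prediction has the highest CC, plus all scores."""
    scores = {
        m: correlation_coefficient(observed, predicted)
        for m, predicted in predictions_by_m.items()
    }
    best_m = max(scores, key=scores.get)
    return best_m, scores


# Toy numbers for illustration only (NOT results from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predictions_by_m = {
    2: [1.0, 2.0, 3.0, 5.0],
    3: [1.0, 2.0, 3.0, 4.1],
    4: [4.0, 3.0, 2.0, 1.0],
}
best_m, scores = select_embedding_dimension(observed, predictions_by_m)
print(best_m, scores)
```

In the setting described above, `predictions_by_m` would hold one prediction of the testing day for each value of the embedding dimension from 2 to 10.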
The local linear approximation method was used in the prediction process. The time series of the number of vehicles, $X$, is divided into two parts: $X_{\text{train}}$, the data from the first six days, and $X_{\text{test}}$, the data from the seventh day. $X_{\text{train}}$ is used for the prediction process, while $X_{\text{test}}$ is used to test the accuracy of the prediction results. Next, the relationship between the current state $Y_t$ and the future state $Y_{t+1}$ is expressed as:
$$Y_{t+1} = f(Y_t) \qquad (3)$$
To obtain an appropriate expression for $f$, the local linear approximation of Farmer and Sidorowich [24] is used. The least squares method determines the parameters $A$ and $B$ so that the prediction can be made with the local linear approximation:
$$Y_{t+1} = A Y_t + B \qquad (4)$$
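One way to realise a local linear predictor of this kind is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the neighbourhood size `k` is an arbitrary choice (the text does not state one), forward delay vectors are assumed, only the next scalar value is fitted (the remaining components of the next delay vector are already known values), and predictions are iterated by feeding each predicted value back into the history. The function name and the synthetic series are hypothetical.

```python
import numpy as np


def local_linear_predict(train, n_steps, m, tau=1, k=20):
    """Iterated one-step prediction with a local least-squares linear map.

    k (number of nearest neighbours) is an illustrative assumption.
    """
    train = np.asarray(train, dtype=float)
    span = (m - 1) * tau
    # Delay vectors from the training data that have a known successor.
    starts = np.arange(train.size - span - 1)
    library = np.column_stack([train[starts + j * tau] for j in range(m)])
    successors = train[starts + span + 1]
    k = min(k, library.shape[0])

    history = list(train)
    predictions = []
    for _ in range(n_steps):
        last = len(history) - 1
        current = np.array([history[last - span + j * tau] for j in range(m)])
        # k nearest neighbours of the current state in the training library.
        distances = np.linalg.norm(library - current, axis=1)
        nearest = np.argsort(distances)[:k]
        # Least-squares fit: successor ~ neighbour . A + B (B via a ones column).
        design = np.column_stack([library[nearest], np.ones(nearest.size)])
        coeffs, *_ = np.linalg.lstsq(design, successors[nearest], rcond=None)
        next_value = float(np.append(current, 1.0) @ coeffs)
        predictions.append(next_value)
        history.append(next_value)
    return np.array(predictions)


# Synthetic hourly series (NOT the study's data): 6 training days, 1 test day.
hours = np.arange(168)
series = 1000 + 500 * np.sin(2 * np.pi * hours / 24)
train, test = series[:144], series[144:]
predicted = local_linear_predict(train, n_steps=test.size, m=2, tau=1, k=20)
print(np.corrcoef(test, predicted)[0, 1])
```

A noise-free sinusoid obeys an exact linear recurrence in its 2-dimensional delay vectors, so it is a convenient sanity check for the fit; real traffic counts would not be reproduced this closely.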

Results and Discussion
The phase space was reconstructed using $\tau = 1$ and $m = 2$ in (2) in order to plot the 2-dimensional phase space. Figure 1 and Figure 2 both portray the behaviour of the traffic flow data; Figure 2 shows the phase space plots used to determine whether chaos exists in the data at both stations. The evolution of the overall data has been transformed to a 2-dimensional phase space, shown as the phase space trajectories of an attractor in Figure 2. In Figure 2, the phase space plots show a well-defined attractor in the plane of the x-axis and y-axis. Hence, chaotic behaviour is present, because a well-defined attractor exists, following the criterion of Sivakumar [21]. The behaviour of the data is therefore categorized as chaotic.
In Figure 3, the observed time series data and the prediction results are plotted against time (hourly). Compared with the observed data, the prediction is accurate from the beginning to the end of the prediction period. The prediction phase was conducted using the inverse approach and the local linear approximation method, and the performance of the results is presented in terms of the correlation coefficient (CC) in Table 1. The optimal embedding dimension $m$ for each path, BR807 and BR902, gives the best prediction, with CC > 0.7500. Hence, the optimal embedding dimension $m$ suggests the possible presence of low-dimensional chaotic behaviour in the system [21]. Therefore, the combination of the inverse method and the local linear approximation method is suitable for predicting traffic flow in urban areas.

Conclusion
Traffic flow in urban areas was represented by the total vehicles per hour on two paths, in Selangor and Kuala Lumpur respectively, each comprising 168 time series data points, and was investigated using the chaos approach. The existence of chaos in the data was determined using the phase space plot, in which a specific attractor pattern occurred for both paths. The total number of vehicles was predicted using the combination of the inverse approach and the local linear approximation method at the optimal embedding dimension, which gave a good prediction with CC > 0.7500 for each station. This study showed that the inverse approach is suitable for predicting traffic flow in Malaysia and hence can be applied to traffic flow time series data for forecasting.