Deterministic and Probabilistic Weather Forecasting: An Analysis Towards Big Data Samples Using the Google Search Random Surfer

The concept of big data has become one of the most important topics in the field of information science and engineering. In this paper, we offer a modeling of data, its stability and its forecasting by considering anti-symmetric traceless and symmetric models for atmospheric pressure variations. The data sample has been collected every 10 minutes during the years 2009-2016 at the Weather Station, Max Planck Institute for Biogeochemistry, Jena, Germany. Subsequently, we extend the proposed model with a probabilistic transformation matrix by considering the Google search random surfer matrix with a small damping factor p (0 < p < 1). Following the Principal Component Analysis (PCA), our study is relevant to big data samples and their stability analysis. A comparative discussion is provided for the above transformation matrix and its probabilistic counterpart. Finally, predictions are made towards feature selection, PCA and compressed sensing of data in the light of big data.


Introduction
Forecasting is an important problem concerning the weather and its modeling [1]. Following the same, in this paper, we provide a modeling of data samples in the light of their stability and forecasting. Our focus is on the study of big data samples. First of all, we consider that a big data sample is large and has a complex data structure. The concept of big data complies with its main qualities, namely the volume, velocity, variety and veracity [2]. In addition, a big data sample can have a high dimensionality [3]. Such challenges are addressed through our modeling techniques; see [2] for an introduction to big data.
Big data configurations hold great promise for discovering subtle patterns and heterogeneities in the population sciences that are usually not accessible at a small data size. Thus, a large sample size that involves the vast volume and high dimensionality of big data raises unique computational and statistical challenges, including scalability, storage bottlenecks, noise accumulation, spurious correlations, incidental endogeneity and measurement errors [4]. Furthermore, the modeling of data and its stability are important in predicting the future behavior of a configuration. In this direction, we have studied the Richardson integration over fluctuations of its step sizes for arbitrary real-valued integrable functions [5].
The present investigation supports data modeling, which arises as the process of creating data samples towards optimal information system design and applications. Our focus includes the development of a model that determines possible atmospheric pressure variations. In this concern, barometric pressure and atmospheric pressure fluctuations are among our future studies. We also focus on anti-symmetric traceless and symmetric models and their relation to convex analysis techniques [6]. Here, convex analysis plays an important role in the light of optimization theory. Namely, convex analysis simplifies optimization with respect to constraints by bringing the concept of convexity into the problem. This is also named convex optimization, as it optimizes a convex function. For a convex function, every local minimum is also a global minimum, because the minimization method is based on the analysis of the convexity properties of a distance function [7].
The existence of the global minimum via convex function techniques is defined on a convex set [8], which is characterized by the directions of the flow of the objective function. For determining its global minima, a nonconvex function can be convexified [8]. A convex set is connected in its nonempty relative interior and has a feasible direction at any point. A real-valued convex function is continuous and has good differentiability properties. Herewith, closed convex cones arise as self-dual objects with respect to polarity [9]. In the light of convex analysis, lower semi-continuous functions can equally be treated as self-dual objects with respect to conjugacy. Here, linear problems are often solved by convex methods and vice versa [10,11]. Concerning polyhedral convex sets, the duality concept plays a vital role.
On the other hand, probabilistic inference is the task of deriving the probability of one or more random variables taking a specific value or set of values. Probabilistic inference uses stochastic models to classify the problem by minimizing a loss function [12]. The problem is to determine its efficiency in making predictions. Hereby, one uses a distance function in the light of a linear difference model. Further, fuzzy logic provides a viable principle of approximate reasoning with a limiting case. It is a many-valued logic in which the truth values of variables may be any real number between 0 and 1 inclusive. This defines the notion of fuzziness. Fuzzy logic could thus be used to predict chaotic behavior [13].
Further, in the realm of non-convex theories, methods such as gradient descent and the quasi-Newton method are used to minimize the non-convex errors of continuous functions in a high-dimensional space. Trust-region approaches are used to extend second-order methods to non-convex problems. In such methods, in order to remove a negative curvature, the Hessian techniques are defined as a damped configuration. This is realized by adding a constant to the diagonal of the Hessian matrix, which is equivalent to adding a constant to each of its eigenvalues [14]. Hereby, from the outset of this paper, convex analysis and convex programming are investigated on the same footing of optimization theory.
In the light of optimization problems, finding the best solution from all possible solutions is among the main concerns of the present research. Depending on the number and the type of variables of the problem, optimization can be either continuous or discrete. In the case of a discrete optimization, one searches for an integer, permutation or graph from a countable set. In a continuous problem, the approaches utilized include free optimization problems. Constrained optimization problems use the Lagrange techniques and other multimodal optimizations such as the gradient method, standard genetic algorithms, particle swarm and artificial bee colony methods [15,16].
In the light of forecasting and data analysis, the Google search random surfer is used as a matrix for various probabilistic transformations. Namely, the Google matrix is among the most accurate methods utilized for predictions. It was created to handle dangling and disconnected nodes [17]. Here, the nodes refer to the websites that are visited during the search. In practice, a Google surfer matrix generates the best optimal prediction by using a large matrix of the considered data with reference to their connectivity, as outlined in the next section. Based on worldwide web server searching, weather is a very diverse phenomenon in its character and predictions. It spreads among huge uncertainties, where data points may not be connected to each other. Following the same, we propose a model exploiting the Google surfer matrix towards the forecasting of weather [18].
In this research, instead of web sites we take atmospheric pressure variations to make predictions. We have downloaded a data sample from the Kaggle data science server [19]. The data has been collected every 10 minutes during the years 2009-2016 at the Weather Station, Max Planck Institute for Biogeochemistry, Jena, Germany. We consider the differences of the atmospheric pressure to create the transformation matrix, in parallel with the Google surfer matrix. For a better prediction, we define a probabilistic transformation model using the transformation matrix with a damping factor. With the help of PCA, we provide the prediction of atmospheric pressure by obtaining the maximum and minimum eigenvalues of the probabilistic transformation model.
The rest of the paper is organized as follows. In section 2, we offer an overview of the model with convex optimization, the Google surfer matrix and stochastic optimization. In section 3, we generalize the models in the light of optimization theory. In section 4, we propose two models based on anti-symmetric traceless and symmetric matrices. In section 5, we provide a verification of the models with an evaluation of the results. In section 6, we discuss our conclusions and future directions for further research and developments.

Review of the Model
In this section, we present an overview of the proposed model. Let us consider a generic form of a continuous problem, defined as the objective function to be optimized, where we can apply convex programming and the Google surfer matrix.

Convex Optimization
To study the convex optimization techniques, we introduce a cost vector c = (c_1, c_2, ..., c_n)^T together with its transpose c^T. Therefore, the underlying objective function is defined as

f(x) = c^T x = c_1 x_1 + c_2 x_2 + ... + c_n x_n.
Here, x_1, x_2, ..., x_n are the variables with respect to which the objective function is sought to be optimized. As a minimization problem, the above objective function reads as

minimize f(x) = c^T x such that A x ≤ b.
In this setting, we can optimize over the nodes through the notation

A x ≤ b, x ≥ 0,

where the matrix A collects the constraint coefficients and b the corresponding bounds.
When we have the equality A x = b, we solve the problem by applying the Gauss elimination method and get x = A⁻¹b. In the case of A x ≥ b, we use the Fourier-Motzkin elimination techniques [20] to find the optimal results through Google surfer matrix models.
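For the equality case, the solve step can be sketched in Python with NumPy (the system values below are illustrative only; note that `np.linalg.solve` performs Gaussian elimination internally rather than forming A⁻¹ explicitly):

```python
import numpy as np

# Toy equality-constrained system A x = b with illustrative values.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Gaussian elimination, as performed internally by np.linalg.solve,
# is preferred over computing the explicit inverse A^(-1).
x = np.linalg.solve(A, b)

# The solution reproduces the right-hand side b.
assert np.allclose(A @ x, b)
```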

Google Surfer Matrix
In order to illustrate the Google surfer matrix, let us consider an example of four nodes as in Figure 1. By assigning a weight a_ij to the edge leading from node j to node i, we get the linear algebraic equations

x'_i = a_i1 x_1 + a_i2 x_2 + a_i3 x_3 + a_i4 x_4,  i = 1, ..., 4.
In matrix notation, this can be expressed as x' = A x, where x' is the output, A is the transformation matrix and x is the input. In this system, the relevant results are found after surfing for a very long time, where there is no way of reducing or removing the outliers [2].
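As a minimal sketch of "surfing for a very long time", the update x' = A x can be iterated in Python; the four-node weights below are hypothetical, chosen only so that each column sums to one:

```python
import numpy as np

# Hypothetical column-stochastic weights for the four-node example;
# column j holds the probabilities of moving from node j to each node.
A = np.array([[0.00, 0.50, 0.00, 0.25],
              [0.50, 0.00, 0.50, 0.25],
              [0.25, 0.25, 0.00, 0.25],
              [0.25, 0.25, 0.50, 0.25]])

x = np.full(4, 0.25)      # uniform starting distribution over the nodes
for _ in range(100):      # "surfing for a very long time"
    x = A @ x             # one step of x' = A x
# x now approximates the stationary weight of each node
```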

Stochastic Optimization
In order to remove the above disadvantage of outliers, we analyze the transformation matrix with a probability p, where p ∈ [0, 1]. Here, p is called the damping factor or bargaining factor. Instead of the transformation matrix A, we define the objective function with a probabilistic transformation matrix G. In this case, in the light of the Google surfer matrix, G is also known as the PageRank matrix [18], which is defined as

G = (1 − p) A + (p/N) J,

where the matrix A is the same as in the foregoing subsection and J is the matrix whose elements are all unity. The purpose of having J as the matrix with all its elements unity is to avoid outliers [2]. Here, N is the number of nodes, i.e. the size of the transfer matrix.
For example, for the four-node system considered above, the matrix G can be defined as

G = (1 − p) A + (p/4) J.

In order to make reliable predictions, we need to find the eigenvalues of G as a function of p. In special cases, such as for a small p (p ≈ 0), the eigenvalues of G are equal to the eigenvalues of A. This is called stochastic PCA. In such a transfer matrix, the eigenvalues transform the matrix into a linear system and tell us about the stability of the data. Hereby, we obtain the minimum and maximum eigenvalues of the transfer matrix G.
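A sketch of building G and extracting its extreme eigenvalues is given below; the four-node weights are hypothetical, and the convention G = (1 − p)A + (p/N)J is chosen to match the text, so that G → A as p → 0:

```python
import numpy as np

def google_matrix(A, p):
    """Probabilistic transformation matrix G = (1 - p) A + (p / N) J,
    where J is the all-ones matrix of the same size as A."""
    N = A.shape[0]
    return (1.0 - p) * A + (p / N) * np.ones((N, N))

# Hypothetical column-stochastic four-node transformation matrix.
A = np.array([[0.00, 0.50, 0.00, 0.25],
              [0.50, 0.00, 0.50, 0.25],
              [0.25, 0.25, 0.00, 0.25],
              [0.25, 0.25, 0.50, 0.25]])

G = google_matrix(A, p=0.33)
eigvals = np.linalg.eigvals(G)

# The extreme eigenvalues (by magnitude) govern the stability of the data.
e_max = eigvals[np.argmax(np.abs(eigvals))]
e_min = eigvals[np.argmin(np.abs(eigvals))]
```

Since G stays column-stochastic for any p ∈ [0, 1], its leading eigenvalue is 1 by the Perron-Frobenius theorem.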

Generalization of the Model
In this section, we provide the generalization of the symmetric and anti-symmetric traceless models in the light of the optimization theory as below. With the optimization formulation as in section 2, we may redefine the problem as

A x = b.

Here, the norm of the residual A x − b reads as

‖A x − b‖² = Σ_i (Σ_j a_ij x_j − b_i)².

In this setting, our proposed optimization reads as the minimization of the distance

min_x ‖A x − b‖².

Hereby, the above equation generates a linear regression model.
Further, the optimal feature selection and the compressed sensing of data are done via the stochastic transformation matrix G. Define a new regularization parameter λ > 0. Herewith, the concerned optimization problem can be redefined as

F(x) = ‖A x − b‖² + λ ‖x‖₁.
In terms of the norms, this simplifies as follows:

min_x ( ‖A x − b‖₂² + λ ‖x‖₁ ).

This is termed the LASSO method [21]. Herewith, it is observed that the probabilistic model is the same as the LASSO model in the light of feature selection [21,22]. Such directions are performed in the light of our proposal as below.
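As an illustrative sketch of the LASSO objective ‖Ax − b‖² + λ‖x‖₁ (not the authors' implementation), iterative soft-thresholding in plain NumPy recovers a sparse coefficient vector, which is exactly the feature-selection effect discussed above:

```python
import numpy as np

def lasso_ista(A, b, lam, n_iter=500):
    """Minimize ||A x - b||^2 + lam * ||x||_1 by iterative
    soft-thresholding (ISTA); a minimal sketch for illustration."""
    L = 2.0 * np.linalg.norm(A, 2) ** 2   # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ x - b)    # gradient of the quadratic term
        z = x - grad / L
        # Soft-thresholding shrinks small entries to exactly zero,
        # which performs the feature selection.
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
x_true = np.zeros(10)
x_true[[1, 4]] = [3.0, -2.0]              # sparse ground truth
b = A @ x_true
x_hat = lasso_ista(A, b, lam=0.1)         # only entries 1 and 4 remain large
```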

Proposed Models
In this section, we provide two models to analyze the data sample. This allows us to find the spread of the data through the maximum and minimum eigenvalues of the transformation matrix. These models support the forecasting and dynamical behavior of the sample. Predictions are made towards the determination of the atmospheric pressure.

Model 1: Anti-symmetric Configuration
Using atmospheric pressure variations, we build predictions based on an anti-symmetric traceless matrix as the following linear difference model. Considering the matrix elements

a_ij = p_i − p_j,

where p_i denotes the atmospheric pressure at the i-th observation, the transformation matrix reads as

A = (a_ij) = (p_i − p_j).

By construction, this transformation matrix is anti-symmetric, a_ij = −a_ji, and traceless, a_ii = 0.
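A minimal sketch of Model 1 in NumPy (the pressure readings below are hypothetical stand-ins for the Jena sample) confirms the anti-symmetry and the purely imaginary, oscillatory spectrum:

```python
import numpy as np

# Hypothetical pressure readings in mbar; the paper uses the Jena
# weather-station sub-samples instead.
pressure = np.array([996.5, 997.1, 995.8, 996.9], dtype=np.float32)

# Anti-symmetric traceless model: a_ij = p_i - p_j.
A = pressure[:, None] - pressure[None, :]

assert np.allclose(A, -A.T)           # anti-symmetry a_ij = -a_ji
assert np.allclose(np.diag(A), 0.0)   # traceless: no autocorrelation

# The eigenvalues of a real anti-symmetric matrix are purely imaginary,
# signaling oscillatory (damped) behavior.
eig = np.linalg.eigvals(A)
```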

Model 2: Symmetric Configuration
Similarly, for the case of a positive definite matrix, we define the elements as the absolute difference

a_ij = |p_i − p_j|.

Hereby, it follows that the transformation matrix reads as the following symmetric traceless matrix

A = (|p_i − p_j|).

In both of the above models, we observe that there are no autocorrelations, that is, we have a_ii = 0. Consequently, our model predicts correlations between two distinct observation points.
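The symmetric counterpart differs only in the absolute value; a sketch with the same hypothetical readings shows a real spectrum that sums to zero, consistent with the traceless property:

```python
import numpy as np

# Same hypothetical pressure readings (mbar) as for Model 1.
pressure = np.array([996.5, 997.1, 995.8, 996.9], dtype=np.float32)

# Symmetric traceless model: a_ij = |p_i - p_j|.
A = np.abs(pressure[:, None] - pressure[None, :])

assert np.allclose(A, A.T)            # symmetry a_ij = a_ji
assert np.allclose(np.diag(A), 0.0)   # zero autocorrelation

# A real symmetric matrix has a real spectrum: no damping.
eig = np.linalg.eigvalsh(A)
```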

Verification of the Models
From the data sample as in [19], we have made three sub-samples, Data Set 1, Data Set 2 and Data Set 3, that contain respectively the first thousand, second thousand and third thousand entries of the main data sample.
After creating the matrix A with the Python libraries Pandas and NumPy, we have obtained the maximum and minimum eigenvalues of the stochastic transfer matrix G with p = 0.33 for the above three data sets.
Here, in order to reduce the memory size, we convert the data type of the atmospheric pressure, measured in mbar, to np.float32 for both the proposed anti-symmetric and symmetric models.
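The verification pipeline can be sketched as follows; the column name "p (mbar)" and the commented Pandas load are assumptions about the Kaggle file layout, and random values in a plausible range stand in for a real sub-sample:

```python
import numpy as np

def min_max_eigs(pressure, p=0.33):
    """Extreme eigenvalues (by magnitude) of the stochastic transfer
    matrix built from a pressure sub-sample; a sketch using the
    anti-symmetric model and the convention G = (1 - p) A + (p / N) J."""
    pr = pressure.astype(np.float32)        # reduce the memory footprint
    A = pr[:, None] - pr[None, :]           # anti-symmetric traceless model
    N = len(pr)
    G = (1.0 - p) * A + (p / N) * np.ones((N, N))
    eig = np.linalg.eigvals(G)
    return eig[np.argmin(np.abs(eig))], eig[np.argmax(np.abs(eig))]

# With Pandas, a sub-sample would be loaded along the lines of
#   df = pd.read_csv("jena_climate_2009_2016.csv")
#   pressure = df["p (mbar)"].to_numpy()[:1000]      # Data Set 1
# Here random values in a plausible range stand in for the data.
pressure = 990.0 + 10.0 * np.random.default_rng(1).random(100)
e_min, e_max = min_max_eigs(pressure)
```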

Model 1: Anti-symmetric Configuration
For the case of the anti-symmetric traceless matrix as above in section 4.1, using Python, we find the maximum and minimum of the eigenvalues of G as below: Data Set 1: In this framework, we see that the maximum eigenvalue is quite large in comparison to the minimum eigenvalue of G. The imaginary components show that there is a damping in the chosen model. The real component of the maximum eigenvalue is nearly the same for all the data sets, while the minimum eigenvalues are of different orders.

Model 2: Symmetric Configuration
For the case of the symmetric traceless matrix as above in section 4.2, we obtain the maximum and minimum of the eigenvalues of G as below: Data Set 1: We observe that the maximum and minimum eigenvalues are of different orders; there is a large difference between the minimum and maximum eigenvalues of G. The absence of imaginary components shows that there is no damping in this model.

Conclusions
In this paper, we have discussed the optimization of a continuous problem in the light of atmospheric pressure variations. Our model uses the Google search matrix, as it is one of the most efficient and accurate algorithms. We have considered the transfer matrix and the optimization of big data samples. Hereby, notice that, although a large matrix calculation requires fast computers, a high memory capacity and advanced mathematical formulations, the Google search matrix can be used in forecasting the optimal value of the weather.
In particular, for the anti-symmetric and symmetric traceless matrices of the probabilistic transformation models, we have respectively complex and real numbers as the minimum and maximum eigenvalues of the transfer matrix. In the case of the anti-symmetric traceless transfer matrix, we get an oscillatory system with damping, which has the effect of preventing its oscillations by reducing their amplitude. Hence, the anti-symmetric traceless configuration is not stable.
On the other hand, configurations with a symmetric traceless transfer matrix are stable. In short, symmetric traceless configurations are rather useful towards the prediction of the atmospheric pressure and the related forecasting of the weather. The related numerical analysis and development of algorithms [23] are left open for future research and investigations.

Appendix: Eigenvalues and Eigenvectors of a Matrix
In this appendix, we introduce the eigenvalues of a matrix and the norm of an eigenvector. Namely, let

A = ( a  b
      c  d )

be a 2×2 matrix; then its eigenvalues λ are found by solving the equation A X = λ X, where X is a two-dimensional vector. That is, we solve the characteristic equation

(a − λ)(d − λ) − bc = 0, i.e. λ² − (a + d) λ + (ad − bc) = 0.
As per the above equation, the trace is the sum of the diagonal elements, that is, we have tr(A) = a + d, and the determinant det(A) is equal to the quantity ad − bc. Therefore, the above characteristic equation simplifies as

λ² − tr(A) λ + det(A) = 0.

Similarly, given the eigenvector X = (x₁, x₂)ᵀ with its transpose Xᵀ = (x₁, x₂), the respective norm of X is defined as ‖X‖² = Xᵀ X. In other words, the above norm reads as the summation ‖X‖² = x₁² + x₂². The same norm is used in probabilistic optimization models, e.g. the LASSO model.
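The 2×2 characteristic equation can be checked numerically on an arbitrary example; the roots of λ² − tr(A)λ + det(A) coincide with the eigenvalues returned by NumPy:

```python
import numpy as np

# Arbitrary 2x2 example matrix.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

tr = np.trace(A)                    # a + d
det = np.linalg.det(A)              # ad - bc

# Roots of the characteristic polynomial lam^2 - tr*lam + det.
roots = np.roots([1.0, -tr, det])

# Direct eigenvalue computation agrees with the characteristic roots.
eigs = np.linalg.eigvals(A)
assert np.allclose(sorted(roots), sorted(eigs))
```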