Kalman Filter: A Simple Derivation

The Kalman filter is a recursive estimator and plays a fundamental role in statistics for filtering, prediction and smoothing. The key element in any recursive estimator is the estimate of the current state, x_k, at time k, based on observations up to and including observation k, and the Kalman filter enables the estimate of the state to be updated as new observations become available. In this paper we have tried to derive the Kalman filter as simply as possible.


Introduction
The analysis of a time series [5] or of spatial data [2] can be done in one of two ways. First, it can be analysed in a batch; in other words, the whole time series to date is analysed at once. Second, the time series can be analysed sequentially; with this approach the current estimate of the state of the system is updated over time as new measurements are made. Whilst both methods give the same answer, the sequential method has two advantages: 1. a reduction in computational cost; 2. a reduction in the required storage capacity.
The disadvantage of sequential methods appears when the posterior cannot be calculated exactly. In this paper we concentrate on the state-space approach to modelling dynamic systems. In order to analyse a dynamic system, at least two models are required:
• a model describing the evolution of the state with time (the state model);
• a model relating the noisy measurements to the state (the measurement model).
In the Bayesian approach to dynamic state estimation, one attempts to construct the posterior probability density function of the state based on all available information, including the set of received measurements. For many problems, an estimate is required every time a measurement is received. In this case, a recursive filter is a convenient solution. A recursive filtering approach means that received data can be processed sequentially rather than as a batch, so that it is not necessary to store the complete data set nor to reprocess existing data if a new measurement becomes available. There are classes of problems for which the recursive Bayesian solutions are tractable. The most important of these is the class of problems where the state and observation equations are linear, and the prior and the observation and state noise distributions are Gaussian. In this case no algorithm can ever do better than a Kalman filter [4].
In this paper we have tried to derive the Kalman filter properly. If the state-space model is linear with uncorrelated Gaussian noise and a Gaussian prior, the prior and posterior distributions at each time step are themselves Gaussian random variables. A Gaussian random variable's distribution is uniquely defined by the specification of its mean and variance. The Kalman filter is a recursive estimator and plays a fundamental role in statistics for filtering, prediction and smoothing. The key element in any recursive estimator is the estimate of the current state, x_k, at time k, based on observations up to and including observation k, and the Kalman filter enables the estimate of the state to be updated as new observations become available.

Basic model
The basic model from which we derive the Kalman filter is described by the linear, discrete-time, finite-dimensional state-space equations

x_{k+1} = a x_k + w_k,  (1)
y_k = c x_k + v_k,  (2)

where x_k is the state and y_k the observation at time k. Here we shall make the following assumptions:
1. {v_k} and {w_k} are individually independent, zero-mean, Gaussian processes with known variances, var(w_k) = Q and var(v_k) = R.
2. {v_k} and {w_k} are mutually independent.

3. At the initial time, k = 0, the initial state x_0 ∼ N(x̄_0, P_0); further, we shall assume that x_0 is independent of v_k and w_k for any k. Hence

x_k = a^k x_0 + a^{k−1} w_0 + a^{k−2} w_1 + ⋯ + w_{k−1}.

This state-space model has some properties:
1. Since x_k is a linear combination of the jointly Gaussian random variables x_0, w_0, w_1, …, w_{k−1}, it is a Gaussian variable.
2. {x_k} is a Markov process.
3. {y_k} is a Gaussian process.
We have made these assumptions in order to derive the Kalman filter easily. Some of these assumptions can be relaxed in practice.
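As a concrete illustration, the state and measurement models above can be simulated directly. The following sketch assumes the scalar model x_{k+1} = a x_k + w_k, y_k = c x_k + v_k; the parameter values (a, c, Q, R, the prior mean and P_0) are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters, for illustration only:
# state model  x_{k+1} = a x_k + w_k,  w_k ~ N(0, Q)
# measurement  y_k     = c x_k + v_k,  v_k ~ N(0, R)
a, c = 0.9, 1.0
Q, R = 0.5, 1.0
x0_mean, P0 = 0.0, 2.0   # prior: x_0 ~ N(x0_mean, P0)

n = 100
x = np.empty(n)
y = np.empty(n)
x[0] = rng.normal(x0_mean, np.sqrt(P0))
for k in range(n):
    y[k] = c * x[k] + rng.normal(0.0, np.sqrt(R))          # measurement model
    if k + 1 < n:
        x[k + 1] = a * x[k] + rng.normal(0.0, np.sqrt(Q))  # state model
```

Each x_k generated this way is a linear combination of x_0 and the Gaussian noises, which is exactly the Gaussianity property listed above.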

Scalar Kalman filter
We have already introduced the basic model (1, 2) from which to derive the Kalman filter. We want to show that, under the above assumptions, conditional expectation yields a set of recursive equations known as the Kalman filter recursions. We shall use the following notation:
• x̂_{k|k} = E(x_k | y_0, …, y_k) is the estimate of the current state, x_k, at time k, based on observations up to and including observation k.
• x̂_{k|k−1} = E(x_k | y_0, …, y_{k−1}) is the predictor of the current state, x_k, at time k, based on observations up to and including observation k − 1.
• Σ_{k|k} = var(x_k | y_0, …, y_k) is the variance of the current state, x_k, at time k, based on observations up to and including observation k.
• Σ_{k|k−1} = var(x_k | y_0, …, y_{k−1}) is the variance of the current state, x_k, at time k, based on observations up to and including observation k − 1.
We might also be interested in the prediction E(y_k | y_0, …, y_{k−1}); similar notation can be established for the observation equation. From (2) and the third assumption, x_0 and y_0 are jointly Gaussian, hence the conditional distribution takes the form x_0 | y_0 ∼ N(x̂_{0|0}, Σ_{0|0}), which is completely specified by the following conditional mean and variance:

x̂_{0|0} = x̄_0 + (c P_0 / (c² P_0 + R)) (y_0 − c x̄_0),
Σ_{0|0} = P_0 − c² P_0² / (c² P_0 + R).

Using the state-space equations (1, 2) for k = 0, we get x_1 | y_0 ∼ N(x̂_{1|0}, Σ_{1|0}); the conditional distributions are themselves normal and completely specified by conditional means and variances, here

x̂_{1|0} = a x̂_{0|0},  Σ_{1|0} = a² Σ_{0|0} + Q.

Suppose we have repeated this method and found x_{k−1} | y_0, …, y_{k−1} ∼ N(x̂_{k−1|k−1}, Σ_{k−1|k−1}); using the state-space equations (1, 2) we can update by

x̂_{k|k−1} = a x̂_{k−1|k−1},  Σ_{k|k−1} = a² Σ_{k−1|k−1} + Q.

Hence the conditional distributions are

x_k | y_0, …, y_k ∼ N(x̂_{k|k}, Σ_{k|k}).  (3)

So the Kalman filter equations can be written as

x̂_{k|k} = x̂_{k|k−1} + K_k (y_k − c x̂_{k|k−1}),  K_k = c Σ_{k|k−1} / (c² Σ_{k|k−1} + R),  (4)
Σ_{k|k} = Σ_{k|k−1} − K_k c Σ_{k|k−1}.  (5)

Notice that the conditional variance is independent of y_0, …, y_k. Alternatively, using x̂_{k|k−1} = a x̂_{k−1|k−1}, the Kalman filter can be written as

x̂_{k|k} = a x̂_{k−1|k−1} + K_k (y_k − c a x̂_{k−1|k−1}).

Note that x̂_{k|k} is the estimate of the state x_k at time k based on the previous estimate and only one datum at time k.
No algorithm can ever do better than the Kalman filter in the linear Gaussian state space. It should be noted that the same results can also be derived by minimizing the mean square error.
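The scalar recursions above translate directly into code. The sketch below assumes the scalar model x_{k+1} = a x_k + w_k, y_k = c x_k + v_k with var(w_k) = Q and var(v_k) = R; the function and parameter names are illustrative.

```python
import numpy as np

def scalar_kalman_filter(y, a, c, Q, R, x0_mean, P0):
    """Scalar Kalman filter: returns filtered means and variances
    for the model x_{k+1} = a x_k + w_k, y_k = c x_k + v_k."""
    n = len(y)
    x_filt = np.empty(n)
    P_filt = np.empty(n)
    x_pred, P_pred = x0_mean, P0                 # prior for x_0
    for k in range(n):
        # Measurement update: condition on y_k
        K = c * P_pred / (c**2 * P_pred + R)     # Kalman gain
        x_filt[k] = x_pred + K * (y[k] - c * x_pred)
        P_filt[k] = P_pred - K * c * P_pred
        # Time update: propagate through the state equation
        x_pred = a * x_filt[k]
        P_pred = a**2 * P_filt[k] + Q
    return x_filt, P_filt
```

Note that, as in the derivation, the variance recursion never touches the data: P_filt depends only on the model parameters.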

Kalman filter predictor
We might be interested in x̂_{k+ℓ|k} = E(x_{k+ℓ} | y_0, …, y_k), which is known as the Kalman filter predictor. Taking conditional expectations through the state equation (1) ℓ times gives

x̂_{k+ℓ|k} = a^ℓ x̂_{k|k},  Σ_{k+ℓ|k} = a^{2ℓ} Σ_{k|k} + Q (1 + a² + ⋯ + a^{2(ℓ−1)}).  (6)
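Under the same scalar model, the ℓ-step predictor simply propagates the filtered mean and variance through the state equation ℓ times; a minimal sketch (names illustrative):

```python
def kalman_predictor(x_kk, P_kk, a, Q, ell):
    """ell-step-ahead prediction of the mean and variance of x_{k+ell}
    given y_0..y_k, for the scalar model x_{k+1} = a x_k + w_k."""
    x_pred, P_pred = x_kk, P_kk
    for _ in range(ell):
        x_pred = a * x_pred          # mean:     x_{k+i+1|k} = a x_{k+i|k}
        P_pred = a**2 * P_pred + Q   # variance: picks up process noise Q
    return x_pred, P_pred
```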

Vector Kalman filter
We have dealt so far with a scalar state equation given by a first-order autoregressive process. We can write the vector Kalman filter directly. Consider q state equations, each a first-order autoregressive process:

x_{j,k+1} = a_j x_{jk} + w_{jk},  j = 1, 2, …, q.  (7)

The q equations (7) can be written as the first-order vector equation X_{k+1} = A X_k + W_k, where A is a q × q matrix, in this case given by A = diag(a_1, a_2, …, a_q), and W_k ∼ N(0, Q_k). Assume that the observation equations at time k are

y_{jk} = c_j x_{jk} + v_{jk},  j = 1, 2, …, q.

We define the observation vector equation by

Y_k = C X_k + V_k,  V_k ∼ N(0, R_k).
Hence we can directly transfer the scalar Kalman filter, given by equations (3) to (5), into the corresponding vector Kalman filter form:

X̂_{k|k−1} = A X̂_{k−1|k−1},  Σ_{k|k−1} = A Σ_{k−1|k−1} Aᵀ + Q_k,
K_k = Σ_{k|k−1} Cᵀ (C Σ_{k|k−1} Cᵀ + R_k)⁻¹,
X̂_{k|k} = X̂_{k|k−1} + K_k (Y_k − C X̂_{k|k−1}),
Σ_{k|k} = Σ_{k|k−1} − K_k C Σ_{k|k−1}.

We can obtain the corresponding vector and matrix set of equations directly from the scalar Kalman predictor (6):

X̂_{k+ℓ|k} = A^ℓ X̂_{k|k},  Σ_{k+ℓ|k} = A^ℓ Σ_{k|k} (A^ℓ)ᵀ + β_k,
where β_k is already defined.
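One combined measurement update and time update of the vector filter can be sketched with NumPy as follows; the function name and the small-matrix test sizes are illustrative, but the matrix algebra follows the vector Kalman filter recursions above.

```python
import numpy as np

def vector_kalman_step(x_pred, P_pred, y, A, C, Q, R):
    """One measurement update followed by one time update of the
    vector Kalman filter X_{k+1} = A X_k + W_k, Y_k = C X_k + V_k."""
    S = C @ P_pred @ C.T + R                 # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
    x_filt = x_pred + K @ (y - C @ x_pred)   # filtered mean
    P_filt = P_pred - K @ C @ P_pred         # filtered covariance
    x_next = A @ x_filt                      # one-step prediction
    P_next = A @ P_filt @ A.T + Q
    return x_filt, P_filt, x_next, P_next
```

With q = 1 and 1×1 matrices, the step reduces exactly to the scalar equations (3)–(5).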

Some properties of Kalman filter
1. The Kalman filter defined in the previous sections is the conditional expectation E(x_k | y_0, …, y_k), which minimizes the mean square error. It is also the maximum likelihood estimator and the best linear unbiased estimator, with variance no larger than that of any other linear unbiased estimator. Even if we drop the normality assumption, E(x_k | y_0, …, y_k) still minimizes the variance. However, E(x_k | y_0, …, y_k) is then a function of y_0, …, y_k which, without the normality assumption, will not necessarily be linear.
2. Let AY + b be a linear estimator of X given Y, where A is a fixed matrix and b is a fixed vector. The linear minimum variance estimator, defined as the choice of A and b minimizing the mean square error, does not require the joint probability density. If X and Y are jointly Gaussian, the minimum variance and linear minimum variance estimators coincide.
3. If X̂(Y) is a linear minimum variance estimator of X, then C X̂(Y) + d is a linear minimum variance estimator of CX + d, where C and d are a fixed matrix and vector respectively. This is clearly a property of the linearity of the expectation operator.
4. Recall that E(X|Y) is a conditional minimum variance estimator. In fact, X̂(y), evaluated at Y = y, is a conditional minimum variance estimate for all y if and only if E(X|Y) = cY + d for some c, d, and this implies X̂(y) = E(X|Y = y).

5. The Kalman filter can be extended to many nonlinear problems [3]. As most real-world models are nonlinear and non-Gaussian, and many analysis tasks involve estimating the state of a dynamic model when only partial or inaccurate observations are available, it is of great interest to develop efficient computational methods to solve this so-called Bayesian filtering problem numerically. Many approximation methods, such as the Extended Kalman filter (EKF), have been developed to cope with this problem. The EKF linearises the state and observation equations, and then uses the Kalman filter to obtain estimates for the state. For more details see e.g. [1].
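A minimal EKF step, under the assumption of a nonlinear state map f and observation map h with user-supplied Jacobians, might be sketched as follows; all names here are illustrative, not taken from the text.

```python
import numpy as np

def ekf_step(x_pred, P_pred, y, f, F_jac, h, H_jac, Q, R):
    """One extended-Kalman-filter step: linearise the observation map h
    and the state map f at the current estimate, then apply the usual
    Kalman recursions to the linearised model."""
    H = H_jac(x_pred)                        # Jacobian of h at the prediction
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_filt = x_pred + K @ (y - h(x_pred))    # update with the innovation
    P_filt = P_pred - K @ H @ P_pred
    F = F_jac(x_filt)                        # Jacobian of f at the estimate
    x_next = f(x_filt)                       # propagate through f
    P_next = F @ P_filt @ F.T + Q
    return x_next, P_next
```

When f and h are linear, the step reduces exactly to the ordinary Kalman filter.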

Smoothing Kalman filter
We have already considered x̂_{k|k} = E(x_k | y_0, …, y_k) and x̂_{k+ℓ|k} = E(x_{k+ℓ} | y_0, …, y_k) as the classic Kalman filter and predictor. One might be interested in x̂_{j|k} = E(x_j | y_0, …, y_k), j < k, which is known as the smoothing Kalman filter. We expect this estimator to be more accurate than E(x_j | y_0, …, y_j), because more measurements are used in producing x̂_{j|k}. The simplest smoothing is fixed-point smoothing, where we want to determine x̂_{j|k} = E(x_j | y_0, …, y_k) and the associated variance Σ_{j|k} = var(x_j | y_0, …, y_k). The idea is to augment the state with the fixed component x_j: for k ≥ j define X_k = (x_k, x_j)ᵀ. The augmented state-space model is therefore

X_{k+1} = [a 0; 0 1] X_k + (w_k, 0)ᵀ,  y_k = (c 0) X_k + v_k,

with the state vector at k = j satisfying X_j = (x_j, x_j)ᵀ. Using the Kalman filter predictor on this augmented model, the smoothed estimate x̂_{j|k} can be read off as the second component of X̂_{k|k}. Note that the Σ_{k+1|k} appearing here, the (1,1) element of the augmented covariance, is precisely that associated with the scalar Kalman filter equations, because the first element of the augmented filter is x̂_{k|k−1}. Hence, for k ≥ j, running the augmented vector Kalman filter forward from time j yields x̂_{j|k} and Σ_{j|k} recursively.
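The fixed-point smoother for the scalar model can be sketched by running the scalar filter up to time j and then filtering the augmented two-dimensional state; all function and parameter names below are illustrative.

```python
import numpy as np

def fixed_point_smoother(y, j, a, c, Q, R, x0_mean, P0):
    """Returns the smoothed estimates x_{j|k} for k = j..len(y)-1,
    obtained by augmenting the state with the fixed component x_j
    and running a two-dimensional Kalman filter on the result."""
    # Scalar Kalman filter up to time j gives x_{j|j} and Sigma_{j|j}.
    x_pred, P_pred = x0_mean, P0
    for k in range(j + 1):
        K = c * P_pred / (c**2 * P_pred + R)
        x_f = x_pred + K * (y[k] - c * x_pred)
        P_f = P_pred - K * c * P_pred
        x_pred, P_pred = a * x_f, a**2 * P_f + Q
    # Augmented model: X_k = (x_k, x_j); x_j is carried along unchanged.
    A = np.array([[a, 0.0], [0.0, 1.0]])
    Qa = np.array([[Q, 0.0], [0.0, 0.0]])
    C = np.array([[c, 0.0]])
    X = np.array([x_f, x_f])                # both components equal x_{j|j}
    P = np.array([[P_f, P_f], [P_f, P_f]])  # fully correlated at k = j
    smoothed = [x_f]
    X, P = A @ X, A @ P @ A.T + Qa          # predict to k = j + 1
    for k in range(j + 1, len(y)):
        S = c**2 * P[0, 0] + R              # scalar innovation variance
        Kv = (P @ C.T / S).ravel()          # 2-vector Kalman gain
        X = X + Kv * (y[k] - c * X[0])
        P = P - np.outer(Kv, C @ P)
        smoothed.append(X[1])               # second component is x_{j|k}
        X, P = A @ X, A @ P @ A.T + Qa      # predict to k + 1
    return np.array(smoothed)
```

The (1,1) element P[0, 0] used for the innovation variance is exactly the scalar Σ_{k|k−1}, as noted in the text.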