Optimal Control Approach to Image Registration

We discuss the orthogonality problem of moving grids, where we minimize a cost function with a regularization term and improve the orthogonality of grids by making the angle between grids close to 90 .The initial grid used is obtained by a well-established method known as the grid deformation method. We will then replace the cost function in the orthogonality problem with sum of squared differences (SSD) to discuss the image registration problem. We will discuss the non-uniqueness of solutions, existence of optimal solutions and prove the existence of Lagrange multipliers of the image registration problem using the Direct Method in Calculus of Variation and then derive an optimality system based on the construction of a Lagrangian functional from which optimal transformations can be calculated.


Introduction
In this paper we use the moving grid method. The moving grid method forms the foundation for the optimization of grid orthogonality as well as the image registration problem.

Grid Deformation Method
The grid deformation method is used for the construction of differentiable and invertible transformations to solve mesh adaptation problems.
To solve a differential equation numerically by finite difference, finite element or finite volume methods, a good grid is needed for discretization of the physical domain.
The idea of this method is to move nodes with correct velocities so that the nodal mapping has a desirable determinant.
There are numerous approaches of grid generation as discussed in [8].In this paper we use the moving grid method.
Harmonic maps are widely used to generate a body-fitted structured grid on a general domain. In 2D, due to the Rado's Theorem which informally states that any "nice looking" shape without holes can be smoothly deformed into a disk.
Theorem 1.1 [Rado's Theorem]: Suppose Ω is an open, connected and convex subset of the Euclidean space R 2 with smooth boundary ∂Ω and let D denote the unit disk. Then, given any homeomorphism μ: ∂ D → ∂ Ω, there exists a unique harmonic function u: D → Ω such that u = μ on ∂D and u is a diffeomorphism.
The 3D version of Rado's theorem is false .The grids may fold in theory (and in practice).
The not so stable version of the grid deformation method was developed in [1], [2], [3]; it was improved in [4] and used with a finite-volume solver in flow calculations in [5]. A 2D version of the method was proposed in [6] and used with a discontinuous Galerkin finite-element method in solving a convection diffusion problem in [7].
The grid deformation methods origin is based on a deformation scheme used by J. Moser [8] in differential geometry. The following result is taken from B. Dacorogna and J. Moser [9]:

Image Registration
An image is a 2 dimensional array of pixels with assigned brightness values. A pixel is the smallest item of information in an image. Each pixel is a portion of the original image. With more portions we get a better representation of the image.
Mathematically in [12], an image is defined as a d-dimensional function Further details can be found in [12]. Image registration is the process of overlaying two images of the same scene taken at different times, from different viewpoints; or images of different, but related scenes. It could also be from the same or different imaging modalities.
Some popular imaging modalities are x-ray, computed 136 Optimal Control Approach to Image Registration axial tomography (CAT), magnetic resonance imaging (MRI), positron emission tomography (PET), functional MRI (fMRI), ultrasound (US). Image registration is the process of finding an optimal geometric transformation between corresponding imaging data. In practice, the concrete type of the geometric transformation as well as the notions of optimal and corresponding depends on the specific application.
The image registration problem can be phrased in only a few words: given a Reference R and Template image T, , ( Notice that the reference image has undergone a slight rotation. Our goal is to get the Template image to be as similar as possible to the Reference image by means of a transformation. The two images in Figure  Image registration is a problem often encountered in many applications areas like, for example, astro-and geophysics, computer vision and medicine. For an overview, see, [13], [14], [15] and [16], Brown (1992), Maintz & Viergever (1998), Maurer & Fitzpatrick (1993), van den Elsen et al (1993).
Image registration is an ill-posed problem, small changes to the input images can lead to completely different registration results.
As a result of this solutions are not unique. For example, Suppose we want to register the reference and template images above for simplicity reasons allowing only rigid transformation i.e. rotations and translations.  We find several solutions namely a pure translation, a rotation of 180°, a rotation of 90° followed by a translation and so on [12].
We can divide the applications of image registration into four main groups according to the manner of the image acquisition as follows: (1) Different viewpoints (multiview analysis): The goal is to gain a larger 2D view or a 3D representation of the scanned scene (2) Different times (multitemporal analysis): Motion tracking, medical imaging monitoring of the healing therapy, monitoring of the tumor evolution (3) Computer vision shape (4) Different sensors (multimodal analysis) Details about Computer vision shape and Different sensors can be found in [17].
In regards to medical images, registration can be achieved as mentioned earlier by using different imaging modalities based on specific needs. Using the same modality for a patient, monitoring and qualifying disease progress over time can be done. If tissue analysis is required, deformation monitoring can be done. Now, if different modalities are used with a patient, correction for different patient position between scans and link between structural and functional images can be done.

Optimization of Grid Orthogonality
In this chapter we will discuss about optimizing grid orthogonality by minimizing a cost function involving the orthogonal functional and penalty term. The goal is to keep the size distribution of the grids.
We will discuss what has been done before and the significant improvement we have made in this thesis. Analysis of the problem will be shown and numerical results based on the analysis will be presented in this thesis.
Let us first discuss briefly about numerical generation of grids.

Numerical Grid Generation
Numerical grid generation has now become a fairly common tool for use in the numerical solution of partial differential equations on arbitrarily shaped regions.
Numerical grid generation can be thought of as a procedure for the orderly distribution of observers, or sampling stations over a physical field in such a way that efficient communication among the observers is possible and that all physical phenomena in the entire continuous field may be represented with sufficient accuracy by this finite collection of observations. Another way to think of the grid is as the structure on which the numerical solution is built.
The numerical solution of partial differential equations requires some discretization of the field into a collection of points or elemental volumes.
The differential equations are approximated by a set of algebraic equations on this collection, and this system of algebraic equations is then solved to produce a set of discrete values which approximates the solution of the partial differential system over the field. General methods of grid generation as well as what is discussed above can be found in [18].
Note: Algorithms have been produced lately to solve ODEs and PDEs by moving grids, the latest papers which show its effectiveness can be read in [19], [20], [21], [22], [23], [24], [32] and [33]. These references provide the technique that show the moving mesh algorithm and the discrete problem can be patched up to find the solution on adaptive mesh. In addition these papers also provide methods where the problems are nonlinear and Jacobian needs to be used to solve these problems. These adaptive procedures are efficient and do not lead to ill conditioned matrix as low order polynomial interpolations are sufficient 2.1.1. Variational Method of Improving Grid Generation Properties of grids such as length of grid lines or smoothness, area or volume of cells, and orthogonality of grid lines are controlled by the minimization of a functional [25].
Brackbill and Saltzman in [26] combined these functionals into one functional, which is given as: are non-negative numbers. It is shown by calculation in [26] that numerical solutions which minimize I is obtained for finite values of In [27], the functionals are: In [28], it was shown that the volume is the most important property to control, followed by smoothness and orthogonality.
When the volume control is given more weight than smoothness, with a little orthogonality control, this technique seems to work best. It was observed that none of the grids that were generated with a significant amount of volume folded. The concern of grid folding in We enforce the Jacobian determinant of the grid generated to be strictly positive through the control of a monitor function f. 138 Optimal Control Approach to Image Registration A simpler model was studied in [25]. This model involved only a weighted sum of the smoothness control functional, S I and the volume (area) control functional, The existence of a minimum of I in a sobolev space is proven by the direct method in calculus of variations. In [25], an explanation is given as to why V I is used to control cell volume.
Since in the theory of calculus of variation, convexity is required, a weak sensed convexity is proven i.e. poly convexity.
This guarantees that the Euler-Lagrange equation is elliptic in the sense of Agmon, Douglis and Nirenberg which is also referred to as the Legendre-Hadamard condition.
J.U. Brackbill et al in [26] quoted [29] in that their v The functional discussed is a linear combination of the length functional, the volume functional and a higher order term.
The inclusion of the latter guarantees the existence and regularity of a minimum via the direct method for calculus of variations.
In this paper, we minimize I with S I , the smoothness functional without the volume or area functional, V we minimize the orthogonal functional with a penalty term to make the problem more elliptic.
We then find the Euler-Lagrange equations and solve for u numerically, which gives us the orthogonal grids.  Let us find the Euler-Lagrange equations that the minimizing function must satisfy.
Let us write the functional as: From the divergence theorem, we can find the Euler-Lagrange equations.
We let: Can be rewritten as: δϕ is arbitrary then Apply the divergence theorem, we get Similarly, we have The Euler-Lagrange equations are: is generated by the deformation method version 3 discussed in chapter 1.
There is a control about the poisson solver: bdfactor. If bdfactor = 0, the computation is not conducted on the boundary but if the bdfactor = 1, the computation is conducted on the boundary.
Tolerance is needed to help satisfy (2. Two kinds of boundaries are used: the fixed boundary and the mirror boundary, because we don't know the correct boundary.

Dirichlet Boundary Condition
Complete computation on boundary is conducted, using the one-sided difference scheme.

Neumann Boundary Condition
Imposes the imaginary grid points o ϕ and the variation u near the boundary according to symmetric relation.

Numerical Experimentation
This chapter presents several numerical results showing the improvement in orthogonality of the grid by first producing an initial grid 0 ϕ by the deformation method and then composing with u, which is found by solving the Euler-Lagrange Equations, and then the composition u + 0 ϕ produces the orthogonal grid.
The initial grid 0 ϕ is a 25 25 × grid.   Orthogonal grid (b) for λ =1 CASE 2: The controlling parameters are as follows:  Orthogonal grid (b) for λ =0.9 CASE 3: The controlling parameters are as follows:    Orthogonal grid (b) for λ =0.7 CASE 5: The controlling parameters are as follows:          Orthogonal grid (b) for λ =100
In case 2 with λ =0.9, the computation converges after Larger λ such as λ = 600 and λ = 900 were run but the orthogonality result is no better than when λ = 50, so they are not included in this thesis.
Computation converges after 17 iterations for higher λ . The smaller λ is and the larger µ is the more orthogonal the grid. µ is between 0-1, if not distortion occurs.
The boundary type as evident from results is the Neumann boundary condition. This gave good results.
With this method we don't have to worry about grid folding as some techniques had because we have control of the cell size through the monitor function, f. We enforce the Jacobian determinant of the grid generated to be strictly positive through the control of a monitor function f. Unlike what has been done before we don't have to add the volume functional, v I to generate good results.

Transformation Models
Transformation models serve for two purposes. They control how image features can be moved relative to one another to improve the image similarity, secondly they interpolate between those features.

Rigid Image Registration
Rigid registration (Affine) is composed solely of a global rotation, translation, scaling and projection.
Translation: moving image from one position to another position Rotation: changing the angle of image Scaling: increasing the size of the actual image Projection: representing the image on a plane as it would look from a particular direction In Rigid Image registration angles and distances between points are preserved. Rigid transformations are global and linear; hence they can be represented by matrices.
As an example let us consider a 2D scaling from the origin. 146 Optimal Control Approach to Image Registration

Non-Rigid Image Registration
Non-rigid image registration refers to a class of methods where the images to be registered have non-linear geometric differences.
A non-rigid transformation does not preserve the straightness of lines and in general, maps a line into a curve.
In this paper, our problem concerns only the Non-Rigid Image registration. Some non-rigid image registration techniques are viscous fluid algorithm, optical flow methods, thin-plate spline and cubic B-spline as discussed in [17].

Similarity Measures
Similarity metric is divided into two major components as intensity-based and landmark (geometry) based similarity metric.

Intensity-based Registration
A straightforward approach is based on the minimization of the so-called sum of squared differences (SSD); cf., e.g., [13] or [34].
we also define: In this dissertation we will concentrate on minimizing SSD.

Mutual Information-based Registration (MI)
Since 1995, mutual information has been used in image registration. This measure has its roots in information theory and has demonstrated its power and robustness for use in multi-modality registration.
It was proposed independently by Viola in [35] and Collignon et al in [36] and has been used since by Kim et al in [37] and many others.
The basic idea is the maximization of the so-called mutual information of the images with respect to the transformation. Mutual information is an entropy-based measure; it measures a statistical dependence between the intensity of corresponding voxels as opposed to a functional dependency. This method and others are discussed further in [17].

Optimization Methods
Finding the minimum of dissimilarity measure or the maximum of similarity measure is a multidimensional optimization problem, where the number of dimensions corresponds to the degrees of freedom of the expected geometrical transformation.
The optimization problem for nonlinear registration is ill-posed for reasons which will be discussed in future chapters; we add regularization terms or penalty terms next to the dissimilarity measure term to be minimized, which interconnects the transformation and data to be transformed [38].
These two terms together form the cost function (energy) associated with the registration and the aim of the optimization methods is to minimize it.
Let us define the registration problem as: Find a transformation ( ) Here D is a distance measure; S is a regularizing term (smoother) for the displacement u and ( ) x Φ denotes the non-rigid transformation which equates to a translation of every pixel x in the template image by a certain displacement defined by the displacement field ( ) We use parameters to control the strength of the smoothness of the displacement versus the similarity of the images. Regularization terms are added often to handle folding, cracks or other unwanted deformations due to arbitrary transformations.
Typical regularizers are fluid, elastic, diffusive and curvature smoother. The shortcomings of adding these terms are discussed in [17].

Optimal Control Approach of an Image Registration Problem
In this chapter we will discuss the optimal control approach to non-rigid image registration with the intensity based method {Sum of Squared Differences (SSD)} using the grid deformation method and then define the optimal control problem and then discuss about proving the existence of optimal solutions, existence of Lagrange multipliers without the use of an ODE constraint, which was used in [39].
An ODE constraint is used in [39] to find Φ , the time-dependent mapping but Chih-yao hsieh in [40] showed that we don't need that extra constraint.
In [39], an existence of solution of the ODE constraint is proved using the Peano existence theorem and two well-known results in functional analysis. This helps in proving that an optimal solution exists.

Grid Deformation Method
In order to find an optimal transformation that minimizes the dissimilarity between the transformed image and the Reference image, we adopt the grid deformation method [41], which is essential in constructing differentiable and invertible transformations to solve mesh adaptation problems.
The deformation method has its origin in differential geometry [42]. It was reformulated for grid generation as discussed in chapter 1.
The grid deformation method gives direct control over the cell size of the adaptive grid and determines the node velocities directly. A great advantage of this method is that it avoids grid folding by enforcing the Jacobian determinant of the grid generated to be strictly positive through the control of a monitor function f. So, unwanted registration results are avoided.

Formulation
A transformation is defined in a two-step manner. Firstly, a given function is used to construct a vector field that satisfies a div-curl system and secondly, this vector field is used to generate a transformation that moves the grid. Now, let us simplify the grid deformation method by letting 1 , ([39]), for our image registration problem.

Ω → Ω
We assume that S~ Id. Define: T(x) = R(S(x)) i.e. we use S to define the template image.
We register T(x) to R(x) by According to the Helmholtz Decomposition Theorem, a vector field (u in our case) can be constructed with both a specified divergence and curl as above.
The theorem below will be used in the proof for the existence of optimal solutions.
We then seek controls f and g and state Φ such that According to [39] using a weaker 2 L -norm penalization, existence of optimal solutions has not yet been proven and computational studies indicate that 2 Lnorm penalization may not be sufficient to guarantee the existence of optimal solutions. Let us define the product Hilbert Spaces:  Then the optimal control problem is given by:

Existence of Optimal Solutions
We next show the existence of solutions of the optimal control problem (4. 2.1.5 Also, the div-curl system described in (4.2.1.1) is elliptic in the sense of Petrovski, with this we can also obtain a uniform bound for u.  The boundary condition satisfies the complementing conditions of Agmon, Douglis and Nirenberg in [45]. i.e.
( ) As a consequence of the div-curl system being regular elliptic, we have that there exists a constant C > 0 such that: 2 Ω ∈ H u that satisfies the above approximation.
As a consequence, we deduce a uniform bound for u.

Non-Uniqueness of Solutions
Since the image registration problem is ill-posed solutions are not unique. This was discussed in Chapter 1; also see Figure 1.2 for illustration.

Existence of Lagrange Multipliers
Now we want to use the Lagrange multipliers rule to turn the constrained minimization problem in (4.2.1.1) into an unconstrained problem and then to derive an optimality system. So, we show the existence of proper Lagrange multipliers for any ( ) (1).
The proof can be found in [43] We will fit our optimization problem into this abstract framework.
The following lemma is useful in proving the existence of Lagrange multipliers.
To show the existence of Lagrange multipliers, we first prove that the operator , , ' is onto a certain space.
In the following lemma, we prove ' M in a strong formulation is onto. Lemma 4.5.1: Let (u, f, g) be a solution to the optimal control problem given in (4.2.1.5).Then the operator , , We have found a ( ) g f u, , satisfying (4.2.1.7) Now, we will prove that a non-zero Lagrange multiplier exists by following Theorem 4.5. 1 We construct the Lagrangian functional as follows: And then taking the derivative we have: The above is used in Theorem 4.5.2 below to prove the existence of multipliers, still following Theorem 4.5.1 denote an optimal solution to the optimal control problem previously discussed. Then ∃ a non-zero Lagrange multiplier * ∈ Θ W such that: is an optimal solution to (4.2.1.5).
To conclude, we use the Hahn-Banach Theorem. Since, Therefore, . 0 ≠ a without loss of generality we can set 1 − = a and therefore the theorem holds. So, the penalized Lagrangian functional is:

Optimality System
Solution of the Lagrangian functional L above is called the optimality system which consists of state equations, costate equations and the optimality conditions.
First we include a Lemma that we will use.    Details of how to solve these numerically by multi-grid optimization were discussed in [17].It is shown how we can solve these equations.

Conclusions
In this paper we have shown that orthogonal grids can be obtained with a regularization term smoothness functional which has a unique solution. We do this without the volume or area functional. The smoothness functional is added because the orthogonality problem is ill-posed.
The only term we add is a penalty term {smoothness functional} that regularizes the functional.
We optimize the orthogonality of grid by minimizing an orthogonal functional, this procedure involves solving for a displacement u which minimizes the functional I and satisfies the Euler-Lagrange equations. 154 Optimal Control Approach to Image Registration The concern of grid folding is prevented by the grid deformation method as discussed previously and good results were obtained for small choices of λ and large µ .
The orthogonal grids should give more accurate solutions to Partial Differential equations.