Optimal Unit Commitment with Ant Colony Algorithm

This paper presents an improved ant colony search algorithm suitable for solving unit commitment (UC) problems. The ant colony algorithm (ACA) is a meta-heuristic technique for solving hard combinatorial optimization problems. It is a population-based approach that exploits positive feedback, distributed computation, and a constructive greedy heuristic. The ACA was inspired by the behavior of real ants, which are capable of finding the shortest path from food sources to the nest without using visual cues. The constraints used in the solution of the UC problem with this approach are: real power balance, real power operating limits of the generating units, spinning reserve, startup cost, and minimum up and down time constraints. The approach first determines the unit schedule and then considers the unit transition related constraints. The proposed approach is expected to yield a better operational cost for a UC problem involving 50 generating units.


Introduction
The unit commitment problem is a hard combinatorial mixed-integer optimization problem of determining the optimum schedule of generating units while satisfying a set of system and unit constraints. Finding a good solution to the unit commitment problem in a reasonable time is critical, since it can mean significant annual financial savings in power generating costs.
Several solution techniques have been applied to solve the problem. These include deterministic, meta-heuristic, and hybrid approaches. Deterministic approaches include the priority list method, dynamic programming, and Lagrangian Relaxation (LR). The priority list method is the simplest and fastest but yields poor final solutions. The LR method provides a fast solution, but it may suffer from numerical convergence and solution quality problems.
Due to the mixed binary and continuous variable nature of the UC problem, traditional optimization techniques may miss the optimal solution. In addition, the dimensionality of the UC problem limits the application of traditional optimization techniques to small systems. Recently, meta-heuristic approaches have become popular in the effort to overcome the shortcomings of traditional optimization techniques. Techniques such as genetic algorithms, evolutionary programming, simulated annealing, and tabu search have been widely investigated for the UC problem. These methods can accommodate more complicated constraints and are claimed to produce solutions of improved quality.
The ACA is a meta-heuristic technique proposed to find a near optimal solution to the UC problem. The ACA was inspired by the behavior of real ants which are capable of finding the shortest path from food sources to the nest without using visual cues. The technique has been tested successfully on diverse complex combinatorial optimization problems. Also, it has been shown that the ant colony method performs with little variability over problem diversity or random number seed. In addition, the ACA introduces an entirely new solution at each iteration, while other meta-heuristic optimization techniques are based on improvement of the solution or set of solutions obtained from the previous iteration. During the solution process of the UC problem using the ACA, the obtained solution is always feasible. Moreover, the ACA does not require major assumptions and approximations that limit the solution space.
The paper is organized as follows: Section 2 contains the Unit Commitment problem formulation; Section 3 introduces the paradigm of the ACA and includes a discussion on its application to solve the UC problem; Section 4 presents the performance and discussion of running the algorithms to solve the UC problem; and conclusions are presented in Section 5.

Problem Formulation
The objective of the unit commitment problem is to determine the output level of the units in each time stage that yields the minimum total cost while satisfying the problem constraints. The UC problem can be stated as

minimize Σt Σi [ Fi(Pit) uit + SUit uit (1 − uit−1) ]

subject to:
real power balance: Σi Pit uit = Dt, t = 1, ..., T;
spinning reserve: Σi Pimax uit ≥ Dt + Rt;
unit operating limits: Pimin ≤ Pit ≤ Pimax when uit = 1;
minimum up and down time constraints;

where uit ∈ {0, 1} is the on/off status of unit i at hour t, Pit is its real power output, Fi(·) is its production cost function, SUit is its startup cost, Dt is the demand, and Rt is the spinning reserve requirement. The system and unit capacity related constraints can be handled by solving the economic dispatch problem at each state. After the search space of the multi-stage schedule is determined, the unit transition related constraints and the startup cost are considered during the state transitions of the ant colony search algorithm.
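The cost evaluation at each state can be sketched as follows. This is a minimal illustration, assuming quadratic fuel costs Fi(P) = ai + bi·P + ci·P² and a simple merit-order economic dispatch; the paper's 50-unit data, exact cost model, and dispatch method are not reproduced here.

```python
def fuel_cost(a, b, c, p):
    """Assumed quadratic production cost F(P) = a + b*P + c*P^2."""
    return a + b * p + c * p * p

def dispatch_and_cost(units, committed, demand):
    """Greedy merit-order dispatch of the committed units to meet `demand`.

    units: list of dicts with keys a, b, c, pmin, pmax (hypothetical format).
    committed: list of 0/1 flags, one per unit (the uit variables for one hour).
    Returns (total fuel cost, per-unit outputs); raises if the commitment
    cannot satisfy the power balance constraint.
    """
    on = [u for u, s in zip(units, committed) if s]
    if sum(u["pmax"] for u in on) < demand or sum(u["pmin"] for u in on) > demand:
        raise ValueError("committed units cannot meet the demand")
    # Start every on-line unit at its minimum, then load the cheapest
    # (by linear cost coefficient b) first.
    out = {id(u): u["pmin"] for u in on}
    residual = demand - sum(out.values())
    for u in sorted(on, key=lambda u: u["b"]):
        take = min(residual, u["pmax"] - u["pmin"])
        out[id(u)] += take
        residual -= take
    cost = sum(fuel_cost(u["a"], u["b"], u["c"], out[id(u)]) for u in on)
    return cost, [out.get(id(u), 0.0) for u in units]
```

For example, with two committed units and a 150 MW demand, the cheaper unit is loaded to its maximum before the second unit picks up the remainder.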

Ant Colony Behavior
The ant colony algorithm (ACA) mimics the behavior of real ants. It has been found suitable for the UC problem, which belongs to a class of hard combinatorial optimization problems. The optimum paths followed by artificial ants (agents) are determined by their movements in a discrete time domain. While moving, agents lay some pheromone along their paths. Also, the agents are memoryless: their decision to move from the present state r to the next state s does not depend on the information acquired along their previous paths, but only on two measures. These are the length of the path connecting the present state to the next one, and its desirability (pheromone level). In brief, the paradigm of the ACA is that each agent generates a completed path (tour) by choosing the next state to move to according to a probabilistic state transition rule. This rule reflects the preference of agents for paths that are shorter and also endowed with a high level of pheromone.
In addition, while constructing its tour, an agent modifies the level of pheromone on the traversed paths by applying the local updating rule. The pheromone level on the traversed paths is lowered (evaporated) so that these paths become less attractive to other agents. This property gives agents a higher probability of exploring different paths and finding an improved solution. In other words, the role of the local updating rule is to make the desirability of paths change dynamically. This way, the agents can avoid being trapped in a local search based on the best previous tour.
Once all agents have reached the final state and have identified the best-tour-so-far based on the value of the objective function, they update the pheromone level on the paths that belong to the best tour by applying a global pheromone updating rule. This is intended to allocate a higher level of pheromone on the paths which belong to the best-tour-so-far. This rule is similar to a reinforcement learning scheme in which better solutions get a higher reinforcement. The rules that apply to agents while trying to find the best path solution are presented below.

State Transition Rule
Each agent builds a completed path to the end state, which occurs at the final stage. This takes place through the recurring application of the state transition rule. This rule calls for an agent located on state r at the current stage to move to state s at the next stage along a shorter path with a high amount of pheromone τrs. This is achieved by a state transition rule that utilizes both the inverse of the path length and the amount of pheromone τrs.
In the UC problem, the transition cost from the present to the next hour TCrs, simulates the length of the path taken by an agent to reach the next state s from the current state r.
An agent located at state r chooses the next state s according to

s = arg max over u of { τru (1/TCru)^θ }, if q ≤ q0 (exploitation) (1)

and otherwise samples s from the probability distribution

prs = τrs (1/TCrs)^θ / Σu τru (1/TCru)^θ (exploration) (2)

with θ being a parameter that determines the relative importance of pheromone versus path length (θ > 0). The parameter q0 (0 ≤ q0 ≤ 1) determines the relative importance of choosing the more preferable (exploitation) versus less desirable (exploration) paths. Every time an agent in a current state r has to choose a next state s, it samples a random number q uniformly in (0, 1). If q ≤ q0, the best path is chosen by exploitation according to (1); otherwise an exploration path is chosen according to (2).
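The pseudo-random-proportional rule above can be sketched in Python. This is an illustrative sketch, not the paper's implementation: the dictionary-based interface and the default parameter values are assumptions.

```python
import random

def choose_next_state(tau, trans_cost, q0=0.9, theta=2.0, rng=random):
    """ACS-style pseudo-random-proportional state transition rule (sketch).

    tau: dict state -> pheromone level tau_rs on the path r -> state.
    trans_cost: dict state -> transition cost TC_rs (> 0).
    With probability q0 the agent exploits the best path (rule (1));
    otherwise it explores, sampling a state with probability proportional
    to tau_rs * (1/TC_rs)**theta (rule (2)).
    """
    weights = {s: tau[s] * (1.0 / trans_cost[s]) ** theta for s in tau}
    if rng.random() <= q0:                      # exploitation
        return max(weights, key=weights.get)
    pick, acc = rng.random() * sum(weights.values()), 0.0
    for s, w in weights.items():                # roulette-wheel exploration
        acc += w
        if acc >= pick:
            return s
    return s                                    # guard against rounding
```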

Local Updating
During the establishment of its path, an agent changes the pheromone level on the traversed path (local updating) by applying the local updating rule (3). This rule has the effect of lowering the pheromone level (evaporation) on the traversed paths so that these become less attractive to other agents.
The rule is

τrs ← (1 − α) τrs + α τ0 (3)

where 0 < α < 1 is the local evaporation parameter and τ0 is the initial pheromone level, derived from an estimate J of the tour cost. Using the sum of the initial minimum transition costs to estimate J, and keeping it fixed throughout the solution process, was found to be a good approach.
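The local updating rule (3) reduces to a one-line function; this sketch assumes a dictionary of pheromone levels keyed by path (edge), which is an illustrative data layout.

```python
def local_update(tau, edge, alpha=0.1, tau0=0.01):
    """Local pheromone update (rule (3)): tau_rs <- (1-alpha)*tau_rs + alpha*tau0.

    Evaporates pheromone on a traversed edge so it becomes less attractive
    to the agents that follow. alpha and tau0 values here are placeholders.
    """
    tau[edge] = (1.0 - alpha) * tau[edge] + alpha * tau0
    return tau[edge]
```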

Global Updating
After all the agents have built their individual solutions, a global pheromone updating rule is applied only to the paths that belong to the best-tour-so-far, allocating a higher pheromone level to this tour. This is equivalent to a reinforcement learning scheme in which better solutions get a higher reinforcement. Global pheromone updating is performed by applying

τrs ← (1 − ρ) τrs + ρ / Jbest (4)

where 0 < ρ < 1 is the global evaporation parameter and Jbest is the total cost of the best-tour-so-far.
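The global updating rule can likewise be sketched as a short function; as with the local rule, the edge-keyed dictionary and the default ρ are illustrative assumptions.

```python
def global_update(tau, best_path, best_cost, rho=0.1):
    """Global pheromone update (rule (4)): tau_rs <- (1-rho)*tau_rs + rho/J_best.

    Applied only to the edges of the best-tour-so-far, so that cheaper
    tours receive a stronger reinforcement (1/J_best is larger).
    """
    for edge in best_path:
        tau[edge] = (1.0 - rho) * tau[edge] + rho / best_cost
```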
Figure 1 illustrates the computational flow of the ACA to solve the unit commitment problem. The main computational processes of the traditional ACA are as follows.

Ant Colony Algorithm
Step 1. Initialization
i. Set up the parameters (q0, θ, α, ρ, etc.) and capture the system data (load, reserve, minimum up and down times, and the different cost components).
ii. Build the unit status table.
Step 2. Search space definition and economic dispatch
i. Establish the multi-stage search space; all the possible unit status combinations form this search space.
ii. Solve the economic dispatch problem at each state.
Step 3. The ACA cycle
i. Place the agents at the initial state.
ii. Select the next state that each agent will move to by applying the state transition rules (1) and (2).
iii. Update the local pheromone levels using (3).
iv. Update the cost function of each path by adding the transition cost.
v. When all agents have reached the final stage (i.e., each has generated a completed path, or tour), apply the global updating rule (4) to the best-tour-so-far.
vi. One full cycle is now completed. The process ends if no or only minor improvement in the objective function value is observed; otherwise steps i-vi are repeated.
In the ACA as described above, all agents move together from the initial stage to the final stage, and the pheromone level is updated locally after all agents move from one stage to the next. In the proposed ACA, one agent at a time moves from the initial to the final stage, and the pheromone level is updated locally during the movement of each agent. Moving one agent at a time gives every agent the opportunity to exploit the experience of the previous agents. For example, in the first cycle of the solution of the unit commitment problem, all agents start with the same initial pheromone level (taken as 0.01 in this problem). The first agent moves from the initial stage to the final stage, its movement decisions being based on the initial pheromone level and the transition cost. During the movement of the first agent, the pheromone level on the path it selects is updated. When the second agent then moves from the initial to the final stage, it uses the information (pheromone level) left by the previous agent as part of its decision making process. The process is repeated; the third agent, for example, uses the information left by the previous two agents.
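The sequential-agent variant described above can be sketched end to end on an abstract stage graph. This is a compact illustration under stated assumptions: an abstract list of stages with transition costs, a single start state, and placeholder parameter values; it is not the paper's 50-unit implementation.

```python
import random

def sequential_aca(stages, trans_cost, n_agents=50, n_cycles=20,
                   q0=0.9, theta=2.0, alpha=0.1, tau0=0.01, rho=0.1, seed=0):
    """Sketch of the proposed ACA variant: agents move one at a time, and the
    local pheromone update happens during each agent's tour, so agent k+1
    already sees the trail left by agent k.

    stages: list of lists of states per stage (stages[0] holds the start state).
    trans_cost: dict (r, s) -> transition cost between consecutive stages.
    Returns (best_path, best_cost).
    """
    rng = random.Random(seed)
    tau = {edge: tau0 for edge in trans_cost}       # uniform initial level
    best_path, best_cost = None, float("inf")
    for _ in range(n_cycles):
        for _ in range(n_agents):                   # one agent at a time
            r, path, cost = stages[0][0], [], 0.0
            for stage in stages[1:]:
                # pseudo-random-proportional choice among next-stage states
                w = {s: tau[(r, s)] * (1.0 / trans_cost[(r, s)]) ** theta
                     for s in stage}
                if rng.random() <= q0:              # exploitation
                    s = max(w, key=w.get)
                else:                               # exploration
                    pick, acc = rng.random() * sum(w.values()), 0.0
                    for s, ws in w.items():
                        acc += ws
                        if acc >= pick:
                            break
                edge = (r, s)
                path.append(edge)
                cost += trans_cost[edge]
                # local update during the tour, visible to the next agent
                tau[edge] = (1 - alpha) * tau[edge] + alpha * tau0
                r = s
            if cost < best_cost:
                best_path, best_cost = path, cost
        for edge in best_path:                      # global update, best tour only
            tau[edge] = (1 - rho) * tau[edge] + rho / best_cost
    return best_path, best_cost
```

On a toy three-stage graph with one cheap and one expensive intermediate state, the sketch reliably settles on the cheap path.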

Results
In this section we present the results and compare them across different scenarios to obtain a better view of the solution behavior.

Production Cost
After running the program, production planning is carried out for a time period of 24 hours by the ant colony algorithm. In general, the hourly production level of each unit to be placed into the power system and the production cost are the expected outputs of the operation planning. Table 1 shows the hourly production and its cost. As shown, cost is directly related to production, so the cost increases as production increases. However, production increase is not the sole cost driver in the operation planning process; the operating constraints and power plant constraints also raise the cost. The total operation cost of the units over 24 hours equals 6,328,800 dollars. This cost includes the transition and fixed costs of the power plants and the extra cost imposed on the system by the constraints.

The Effect of Number of Agents (ants)
In the ant colony algorithm, each ant, as an agent, searches different states and finds the optimum results. There is no special method for determining the number of agents; this number is identified experimentally. It may seem that increasing the number of agents, and consequently the number of iterations, would give better results, but this section shows that this is not the case. The production cost for the first hour has been calculated under two scenarios: one with 25 agents, below the 50 agents used for the main problem solution, and one with 100 agents, above that number. With 25 agents searching the problem space, as shown in Table 2, the best answer is found in the 4th iteration and equals 141016.1, which is 2.97% higher than the best result obtained with 50 agents (136822.7). Thus, if the number of agents is below the usual level, the probability of error increases.

The Problem of Local Optimization and Lack of Convergence
The use of meta-heuristic methods increases the probability of non-convergence and of being trapped in local optima. As stated, one way to prevent this problem is to apply a perturbation (excitation) to the agents' search space, and this method is used in this work to address the local optima and non-convergence of the algorithm. In most cases, of course, convergence is achieved after many iterations; however, the answer is then either a local optimum or requires a long time because of the increased number of iterations. Figure 2 shows an example of the increase in the number of iterations caused by the limited search space. In this chart, the total production power of the units is plotted against the iterations. It can be observed that the optimum answer is obtained after 3081 iterations. As marked by the red circles in the figure, the algorithm is trapped in non-convergent iteration loops in which the number of iterations reaches 1000, lengthening the production planning process.
To solve this problem and to shorten the time required to obtain the optimum answer, a constraint is added to the ant colony algorithm. This constraint perturbs the search space whenever the number of iterations exceeds 70. As shown in Figure 3, from the beginning up to the seventieth iteration the algorithm is stuck in a non-convergent iteration loop (the part marked with a red circle). With the perturbation applied from the seventieth iteration onward, the algorithm escapes the non-convergent loops; the number of iterations is reduced to 143 compared with the previous case, the problem of extra iterations no longer exists, and a larger search space is covered. This reduces the computation time, which is essential for real-time planning.

Conclusions
This paper dealt with the unit commitment problem using the ant colony algorithm. To solve this problem, two scenarios were first considered: choosing 25 agents (ants) and choosing 100 agents. The results of the two scenarios were compared, and ultimately 50 agents was determined to be a more reasonable number, whose results cover a wide range of the search space. The results obtained with the ant colony algorithm indicate the efficiency of this algorithm in solving the unit commitment problem.