Multiobjective Tabu Search for Clustering and the Zoning Problem

Arnaldo P. Castaño

5.00/5 (2 votes)

18 Nov 2013CPOL10 min read

10.7K

This document describes a tabu Search algorithm embedded in a multiobjective framework capable of finding solutions to the zoning problem by allowing the optimization of multiple objectives.

Abstract: This document describes a tabu Search algorithm embedded in a multiobjective framework capable of finding solutions to the zoning problem by allowing the optimization of multiple objectives. The compactness meaning the cluster given to the set of objects (territorial units for a zoning problem) represents one of those objectives to be optimize, the other, homogeneity meaning a balance in the distribution of several variables that each object in the problem possesses and are selected as input for the algorithm, both objectives are being minimize.

I. Introduction

Zoning is a technique that belongs to the area of urban studies; it appeared for the first time in the XIX century to separate the residential areas from the industrials. The main idea with this technique, the most popular in urbanization, is to produce a partition of homogeneous regions according to some criteria. In our case each variable matches a criteria and each basic geostatical area possesses a collection of values each representing the value for some variable. These variables could be demographic, for instance people who are older than twenty.

A basic geostatical area (BGA) is the manner in which will refer to a basic or primitive region to be clustered. Any BGA consist of a pair (position, variables_values) where position marks the location of the area in space (usually two coordinates) and variables_values represent a list of values for every variable in the problem.

Tabu Search is a metaheuristic created by Fred Glover that inherits from local search a popular optimization method. It’s common to see it applied to combinatory optimization in problems such as TSP (Travelling Salesman Problem) or QAP (Quadratic Assignment Problem). The use of metaheuristics to solve the zoning problem, as well as the TSP is mandatory, because of their NP Complete nature. In fact most of the time, we don’t find optimal solutions, but approximations to these optimal solutions and sometimes if we are lucky, this approximations might equaled some optimal solution.

Our tabu search is embedded in a multiobjective framework; this implies that several objective functions are being optimized, all at once. Because this is a multiobjective algorithm its main goal is to find non dominated solutions. Any unknown definition we’ve seen so far will be covered in the next section.

II. Preliminary and Definitions

In this section we define theoretical aspects that we considered relevant and necessary for understanding the contents of this paper.

Many optimization problems often involve multiple objective functions such problems are known as Multiobjective Optimization Problem (MOP). It can be stated as follows:

Where represents the feasible space, that is the set of all valid solutions, the solutions that fulfill every constraint of the problem.

A vector is said to dominated another vector, denoted if and only if .

Having multiple objectives denotes an important issue. The improvement of one objective function could lead to the deterioration of another. Thus a solution that will optimize every objective rarely exists, so instead of looking for that solution a trade-off is searched. Pareto optimal solutions represent this trade-off. A feasible solution is said to be Pareto optimal iff such that. The set of all Pareto optimal solutions is known as the Pareto set and its image the Pareto frontier. The algorithm proposed in this paper finds an approximation to a set of Pareto optimal solutions forming a Pareto frontier. The next section describes the Tabu Search with all of its components.

Dissimilarity matrixes

In this work we considered using two dissimilarity matrixes, one for the coordinates of each region and another for the value of every variable in each region.

III. tabu Search

Tabu Search (TS) is a metaheuristic presented by Glover which uses adaptive memory and responsive exploration. Inherits from local search (LS); probably the oldest and simplest metaheuristic method ever. It could be said that Tabu Search is just a local search with some considerable improvements or upgrades. Its core functionality is the same as LS; starts at a given initial solution (usually generated randomly), it runs until a stopping rule is reached and each iteration replaces the current solution by another which improves the objective function, and is found in the neighborhood of the current solution. The stopping rule for LS is reached when no neighbor of the current solution improves the objective function meaning we have a local optimum. This the main disadvantage for LS; it only finds local optimum, a disadvantage Tabu Search does not shared as it includes mechanism for diversification which prevents it from getting stuck into a local optimum.

Encoding and neighborhood

A solution is encoded as a pair (elements, centers), first an array of length , where is the number of BGAs, the number of clusters and indicates that object (region in our case) is located at cluster. The other array centers of length, contains every center. The neighborhood of a given solution denoted is obtained by swapping all pairs of elements, where, is any center and any element, so having as a solution implies.

Adaptive Memory

This is probably the most important characteristic of Tabu Search: its capacity to remember the evolution of the search, accomplished through the use of data structures. Tabu list, represents one of these data structures, used to save pairs previously swapped, avoiding the possibility of cycling around the same solutions, at least for some time (the length of the list must be finite because memory is finite). The term intensification refers to a mechanism that many metaheuristic implement to favor the exploitation of the best solutions found so far, in this case the more promising regions are explored more thoroughly, diversification on the other hand refers to exploration of the search space, trying to visit unexplored solutions. In TS these mechanisms are expressed in the medium-term memory and in the long-term memory or freq memory. Intensification is managed by the medium-term memory, storing the best solutions ever to be found. Diversification is handled by the freq-memory showing how many times a given element has been included in a solution as center.

Our algorithm uses freq-memory to favor objects (regions) with the lowest frequency as centers in a diversification phase.

Solving the optimal zoning problem

Tabu Search is embedded into a multiobjective framework, so in order to solve the problem we have to take into account several objective functions. The first of these functions minimizes the intra-class distance that is, the distance of every object to the center of its class or cluster.

In this formula represents the Euclidean distance, represents a center and the set of elements that belong to a cluster with center.

The second objective considers the homogeneity of a solution. Homogeneity means that elements belonging to the same cluster shared some characteristics, in this case the value of some variables. When we partition under the criteria of homogeneity the idea is to find a balance or equilibrium in every cluster for each variable, so the actual function to be optimize is the balance of homogeneity. To find the balance of a group with respect to some variable, we count the number of elements having the desired value for that variable and subtract that amount from the ideal value; which occurs when all of the elements in the group have the desired value. The sum of every group balance represents the balance of homogeneity to be minimized. For each variable we’d like to homogeneity a balance of homogeneity function should be added to the optimization model, so if we want to homogeneity three variables, our model will have four objectives, the intra-class function and one balance of homogeneity function for every variable.

Where represents the number of elements with the correct value on the variable being homogeneity in group and represents the ideal value.

Now that we stated the optimization model is time to described TS’s strategy for finding non dominated solutions and building a Pareto Frontier.

The strategy is divided into three different stages.

A clustering of the regions is found (we optimize only the intra-class objective).
Once we have a clustering its homogeneity is calculated.
We check to see whether this solution is nondominated, if it’s, we added using PSET (we’ll see what this means shortly) otherwise we do nothing.

Seeing the strategy in these steps makes it look very simple. We separate the intra-class optimization from the homogeneity optimization and then verify the nondominated nature of the new solutions.

PSET

PSET (Pareto Set) is the component of the algorithm that takes care of building the Pareto Set. Once a solution has been found PSET will verify if this solution is "suitable" to be consider in the set of solutions. Suitable means that is not duplicated (already in PSET) and nondominated. If the solution happens to be nondominated then we might need to remove some old solutions in PSET that no longer classify as Pareto optimal because they are now dominated by the incoming solution. PSET has a tolerance rate (set to 0 by default) as a parameter that could allow the inclusion of solutions in PSET that are actually dominated, but very close to some nondominated solution (their difference is less than or equal to ).

Stopping Rule

The stopping rule, as the name suggests, defines the moment or iteration when the algorithm should stop. In our case we’ve fixed a number of iterations and if we reached that amount of iterations the algorithm will stop. However there’s another condition for stopping the algorithm and that is when new nondominated solutions stopped coming also during a fixed number of iterations.

Algorithm

This is a skeleton for the TS algorithm:

Randomly generate an initial solution.
Generate the neighborhood of the current solution.
If there is a solution that improves the current solution then select it as the new current solution. If not then enter into a diversification phase
Get the new solutions to PSET for verification.
Go to 3 until a stopping rule is reached.

IV. Results

In order to test the algorithm, a real-world problem has been used. It’s illustrated as follows: the BGAs of the metropolitan area of Toluca Valley are going to be clustered into five homogeneous groups that only include elements whose variables have values in the ranges indicated below.

Male Population under 6 years (X001).
Male population between 6 and 11 years (X003).
Male population between 15 and 17 (X007).

The homogeneity will be obtained in all three variables.

Tabu Search has run several iterations with this example getting the following results:

Table 1 - Toluca Valley territorial design

Toluca Valley results for homogeneity and compactness, iterations=15, regions=5.

We can appreciate from table 1 that the results were rather satisfactory. The Pareto Frontier of the previous solutions (Pareto Set) is represented in the next graphic.

V. Comparisons

The comparisons were made against a VNS (Variable Neighborhood Search) algorithm associated with the same problem. As the following results showed, Tabu Search is superior in a lot of aspects (homogeneity, compactness, number of solutions, etc.)

Conclusions

Multiobjective problems arise in many real world problems; this paper has focus on one of those problems namely the zoning problem. Tabu Search proved to be an efficient method for solving this problem by giving some interesting results in the experiments.

Among the different metaheuristics you might find today we selected Tabu Search because of its extraordinary condition as a method of intelligent search. We hope that future work on the zoning problem would result in new methods, ideas and comparisons with our TS.

References

[1] Bernábe Loranca B., Coello Coello A.C., Osorio Lama M.,“A multiobjective approach for the heuristic optimization of compactness and homogeneity in the optimal zoning”.

[2] Beausoleil P. Ricardo, “Multiobjective clustering using Tabu Search”, Institute of Cybernetics, Mathematics and Physics.

[3] Talbi El-Ghazali, Metaheuristics: from design to implementation. New Jersey: Wiley and Sons, 2009, ch. 2.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)