Introduction
This article describes how to solve a logic problem using a Genetic Algorithm. It assumes no prior knowledge of GAs. In fact, half of this article is dedicated to explaining the internal structure of a Genetic Algorithm.
So what is the problem domain we are trying to solve?
Well, GAs can be used to solve many problems. In fact, GAs have been used to grow new mathematical syntax trees, train multi-layer neural networks, to name but a few instances.
However, for this example, I have used a simple card splitting excercise, which is as detailed here:
- You have 10 cards numbered 1 to 10
- You have to divide them into two piles so that:
- The sum of the first pile is as close as possible to 36.
- And the product of all in the second pile is as close as possible to 360.
Now, I am not saying that this could not be done by hand, using old fashioned brain juice, it's just better suited to a GA, as it could take 100s or even 1000s of different combinations to get the correct result. Well, probably not that many for this simple problem, but it certainly could take a lot of combinations for a more difficult problem. Suffice to say, it is just good fun to do it with a GA. So, let's carry on.
So what is a Genetic Algorithm?
Well, Wikipedia says this:
A genetic algorithm is a search technique used in computing, to find true or approximate solutions to optimization and search problems, and is often abbreviated as GA. Genetic algorithms are categorized as global search heuristics. Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).
Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes or the genotype or the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves towards better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals, and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly mutated) to form a new population. The new population is then used in the next iteration of the algorithm.
Follow that?? If not, let's try a diagram. (Note that this is a Microbial GA, there are lots of GA types, but I just happen to like this one, and it's the one this article uses.)
I prefer to think of a GA as a way of really quickly (well, may be quite slow, depending on the problem) trying out some evolutionary programming techniques, that mother nature has always had.
So how does this translate into an algorithm (this article uses a Microbial GA, but there are many other varieties)?
The basic operation of the Microbial GA training is as follows:
- Pick two genotypes at random
- Compare Scores (Fitness) to come up with a Winner and Loser
- Go along genotype, at each locus (Point)
That is:
That's it. That is the complete algorithm.
But there are some essential issues to be aware of, when playing with GAs:
- The genotype will be different for a different problem domain
- The Fitness function will be different for a different problem domain
These two items must be developed again, whenever a new problem is specified.
For example, if we wanted to find a person's favourite pizza toppings, the genotype and fitness would be different from that which is used for this article's problem domain.
These two essential elements of a GA (for this article's problem domain) are specified below.
1. The Geneotype
private int[,] gene = new int[30, 10];
Well, for this article, the problem domain states that we have 10 cards. So, I created a two dimensional genes array, which is a 30*10 array. The 30 represents a population size of 30. I picked this. It could be any size, but should be big enough to allow some dominant genes to form.
2. The Fitness Function
Remembering that the problem domain description stated the following:
- You have 10 cards numbered 1 to 10
- You have to divide them into two piles so that:
- The sum of the first pile is as close as possible to 36.
- And the product of all in the second pile is as close as possible to 360.
Well, all that is being done is the following :
- Loop through the population member's genes
- If the current gene being looked at has a value of 0, the gene is for the sum pile (pile 0), so add to the running calculation
- If the current gene being looked at has a value of 1, the gene is for the product pile (pile 1), so add to the running calculation
- Calculate the overall error for this population member. If this member's geneotype has an overall error of 0.0, then the problem domain has been solved
private double evaluate(int n)
{
int sum = 0, prod = 1;
double scaled_sum_error, scaled_prod_error, combined_error;
for (int i = 0; i < LEN; i++)
{
if (gene[n,i] == 0)
{
sum += (1 + i);
}
else
{
prod *= (1 + i);
}
}
scaled_sum_error = (sum - SUMTARG) / SUMTARG;
scaled_prod_error = (prod - PRODTARG) / PRODTARG;
combined_error = Math.Abs(scaled_sum_error) + Math.Abs(scaled_prod_error);
return combined_error;
}
Using the code
The demo project attached actually contains a Visual Studio 2005 solution, with the following two classes.
Program class
Is the main entry point into the Simple_GeneticAlgorithm application. All this class does is create a new Simple_GeneticAlgorithm
object and call its run()
method.
using System;
using System.Collections.Generic;
using System.Text;
namespace Simple_GeneticAlgorithm
{
class Program
{
static void Main(string[] args)
{
Simple_GeneticAlgorithm GA = new Simple_GeneticAlgorithm();
GA.run();
Console.ReadLine();
}
}
}
Simple_GeneticAlgorithm class
Runs the GA to solve the problem domain.
using System;
using System.Collections.Generic;
using System.Text;
namespace Simple_GeneticAlgorithm
{
public class Simple_GeneticAlgorithm
{
private int POP = 30;
private int LEN = 10;
private double MUT = 0.1;
private double REC = 0.5;
private double END = 1000;
private double SUMTARG = 36;
private double PRODTARG = 360;
private int[,] gene = new int[30, 10];
Random rnd = new Random();
public Simple_GeneticAlgorithm()
{
}
public void run()
{
int a, b, Winner, Loser;
init_pop();
for (int tournamentNo = 0; tournamentNo < END; tournamentNo++)
{
a = (int)(POP * rnd.NextDouble());
b = (int)(POP * rnd.NextDouble());
if (evaluate(a) < evaluate(b))
{
Winner = a;
Loser = b;
}
else
{
Winner = b;
Loser = a;
}
for (int i = 0; i < LEN; i++)
{
if (rnd.NextDouble() < REC)
gene[Loser, i] = gene[Winner, i];
if (rnd.NextDouble() < MUT)
gene[Loser, i] = 1 - gene[Loser, i];
if (evaluate(Loser) == 0.0)
display(tournamentNo, Loser);
}
}
}
private void display(int tournaments, int n)
{
Console.WriteLine("\r\n==============================\r\n");
Console.WriteLine("After " + tournaments +
" tournaments, Solution sum pile " +
"(should be 36) cards are : ");
for (int i = 0; i < LEN; i++) {
if (gene[n,i] == 0) {
Console.WriteLine(i + 1);
}
}
Console.WriteLine("\r\nAnd Product pile " +
"(should be 360) cards are : ");
for (int i = 0; i < LEN; i++) {
if (gene[n,i] == 1) {
Console.WriteLine(i + 1);
}
}
}
private double evaluate(int n)
{
int sum = 0, prod = 1;
double scaled_sum_error, scaled_prod_error, combined_error;
for (int i = 0; i < LEN; i++)
{
if (gene[n,i] == 0)
{
sum += (1 + i);
}
else
{
prod *= (1 + i);
}
}
scaled_sum_error = (sum - SUMTARG) / SUMTARG;
scaled_prod_error = (prod - PRODTARG) / PRODTARG;
combined_error = Math.Abs(scaled_sum_error) +
Math.Abs(scaled_prod_error);
return combined_error;
}
private void init_pop()
{
for (int i = 0; i < POP; i++)
{
for (int j = 0; j < LEN; j++)
{
if (rnd.NextDouble() < 0.5)
{
gene[i,j] = 0;
}
else
{
gene[i,j] = 1;
}
}
}
}
}
}
The results
Taking the last good population member results found, let's test it out.
2 + 7 + 8 + 9 + 10 = 36 in Pile 0, this is all good
1 * 3 * 4 * 5 * 6 = 360 in Pile 1, this is all good
Points of Interest
I hope this article has demonstrated how to write a simple GA to solve a problem that we as humans would probably find hard to do manually. Remember this is a simple problem. what would happen if we upped the problem domain? A GA really is the way to go.
I will very shortly publish an article on a GA training a multi layer neural network to solve some logic problems. So if your'e into this sort of stuff, watch this space.
History