Introduction
Nature provides an inherent method for solving optimization problems, and genetic algorithms imitate it. Living organisms evolve, adapt to changing environments, mate, and produce individuals even fitter than their predecessors; the fitness of an individual denotes its ability to survive, or its suitability for a particular purpose. A genetic algorithm (GA) is a method for solving optimization problems that is based on natural selection, the process that drives biological evolution. A genetic algorithm repeatedly modifies a population of individual solutions. At each step, it selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. Over successive generations, the population 'evolves' towards an optimal solution. You can apply a genetic algorithm to a variety of optimization problems that are not well suited to standard optimization algorithms, including problems in which the objective function is discontinuous, non-differentiable, stochastic, or highly nonlinear. In this article, I will demonstrate how GAs can be applied to train artificial neural networks for classification purposes. The advantage of a GA is that it searches the entire space for the best solution, though at the expense of speed compared to numerical methods; the latter, however, often tend to converge to local minima. Both GAs and numerical methods have their benefits and disadvantages, and can be used interchangeably for specific problems.
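To make the loop above concrete, here is a minimal, self-contained sketch of a generic GA on a toy problem of my own choosing (it is an illustration only, not this article's project): the population is a set of real numbers, the fitter half is selected each generation, and children are produced by arithmetic crossover with rare mutation.

// Toy GA sketch (illustration only): evolve a population of real numbers x
// to maximize the fitness f(x) = -(x - 3)^2, whose optimum is x = 3.
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>

static float fitness(float x) { return -(x - 3.0f) * (x - 3.0f); }
static float frand() { return (float)rand() / (float)RAND_MAX; }
static bool fitter(float a, float b) { return fitness(a) > fitness(b); }

int main()
{
    srand((unsigned int)time(0));
    std::vector<float> pop(50);
    for (size_t i = 0; i < pop.size(); i++)
        pop[i] = 10.0f * frand() - 5.0f;               // random initial population
    for (int gen = 0; gen < 100; gen++) {
        std::sort(pop.begin(), pop.end(), fitter);     // selection: keep the fitter half
        pop.resize(25);
        while ((int)pop.size() < 50) {                 // refill with offspring
            float x = pop[rand() % 25], y = pop[rand() % 25];
            float child = 0.5f * (x + y);              // arithmetic crossover
            if (rand() % 1000 == 0)
                child += frand() - 0.5f;               // rare mutation
            pop.push_back(child);
        }
    }
    std::sort(pop.begin(), pop.end(), fitter);
    printf("best solution: %f\n", pop[0]);             // approaches 3.0
    return 0;
}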
Background
You need to be familiar with genetic algorithms. Have a look at the sites I used when studying the topic: Genetic Algorithms, Genetic Programming, Genetic Algorithms in Plain English. You will have to understand the crossover, mutation, and selection processes to be able to use my code intelligently. The console interface to the neural network and the file structure description can be found in my previous article: Backpropagation Artificial Neural Network in C++.
Using the code
The help line to the console application allows you to use these parameters:
argv[1] t-train, r-run
argv[2] network conf file
argv[3] cls1 files [0.9]
argv[4] cls2 files [0.1]
argv[5] epochs num
argv[6] [population size 50]
argv[7] [crossover chance 2]
argv[8] [mutation chance 1000]
argv[9] [validation class]
argv[10] [test class]
argv[11] [validation TH 0.5]
argv[12] [validation type [mse]]
argv[13] [normalization]: [0-no], 1-minmax, 2-zscore, 3-softmax, 4-energy
argv[14] [error tolerance cls] +- 0.05 default
argv[1] r-run
argv[2] network conf file
argv[3] cls files
argv[4] [validation TH 0.5]
argv[5] [normalization]: [0-no], 1-minmax, 2-zscore, 3-softmax, 4-energy
ann1dn.exe t net.nn cls1 cls2 3000 [50]
[crossover rate] [mutation rate]
[tst.txt][val.txt][TH [0.5]][vtype [mse]]
[normalization [0]] [err [0.05]]
ann1dn.exe r net.nn testcls [TH [0.5]] [normalization [0]]
The only difference from the Backpropagation Artificial Neural Network in C++ project is the addition of the GA-specific parameters 6, 7, and 8; otherwise, everything is alike. They denote the initial population size, the crossover chance, and the mutation chance, with default values of 50, 2, and 1000. The crossover and mutation rates mean that, for every chromosome of the two parents, there is 1 chance in 2 of crossing over the genes, and 1 chance in 1000 that particular gene bits are mutated during the mating process. So, you may just start training right now:
>ann1dg.exe t iris.nn setosa.txt virgi.txt 200
In the attached run.bat file, you may find this command line to execute:
>ann1dg.exe t iris.nn setosa.txt virgi.txt 200 50 2 100 void void 0.5 0 2
It uses a population size of 50 organisms, a crossover chance of 2, and a mutation chance of 100. The next parameters request a random selection of the validation and test sets from the training set, the mean squared error as the cross-validation metric, and z-score normalization of the input data. Normalization is necessary if the data range differs significantly from the preferred one; about -1.0 to 1.0 is suitable for neural network classification.
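For reference, z-score normalization shifts each input feature to zero mean and scales it to unit standard deviation, which brings typical values into roughly that preferred range. Below is a minimal sketch of the idea (an illustration only, not the project's actual normalization code).

// Z-score normalization sketch (illustration only, not the project's code):
// shift one feature column to zero mean and scale it to unit standard deviation.
#include <cmath>
#include <vector>

void zscore(std::vector<float> &feature)
{
    float mean = 0.0f, sd = 0.0f;
    for (size_t i = 0; i < feature.size(); i++)
        mean += feature[i];
    mean /= (float)feature.size();
    for (size_t i = 0; i < feature.size(); i++)
        sd += (feature[i] - mean) * (feature[i] - mean);
    sd = std::sqrt(sd / (float)feature.size());
    if (sd == 0.0f)
        return;                                        // constant feature, leave unchanged
    for (size_t i = 0; i < feature.size(); i++)
        feature[i] = (feature[i] - mean) / sd;
}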
A typical GA training process is presented below, classifying the IRIS data: the setosa and versicolor species against virginica:
loading data...
cls1: 100 cls2: 50 files loaded. size: 4 samples
validaton size: 25 12
validaton size: 26 13
normalizing zscore...
training...
epoch: 1 0.454 0.450 pop age:0 mean[0.00] fit:0.00
bOrg: 0.715 0.546 fit 9.68 age 0
epoch: 2 0.642 0.631 pop age:1 mean[1.00] fit:238.01
bOrg: 0.715 0.546 fit 9.68 age 1
epoch: 3 0.630 0.617 pop age:2 mean[1.64] fit:334.49
bOrg: 0.665 0.341 fit 11.63 age 0
epoch: 4 0.638 0.617 pop age:3 mean[2.18] fit:288.08
bOrg: 0.665 0.341 fit 11.63 age 1
epoch: 5 0.607 0.577 pop age:4 mean[2.48] fit:395.51
bOrg: 0.665 0.341 fit 11.63 age 2
epoch: 6 0.636 0.602 pop age:5 mean[2.90] fit:456.78
bOrg: 0.685 0.341 fit 11.80 age 0
epoch: 7 0.638 0.583 pop age:6 mean[3.34] fit:390.31
bOrg: 0.677 0.452 fit 12.09 age 0
epoch: 8 0.651 0.562 pop age:7 mean[3.52] fit:425.81
bOrg: 0.764 0.513 fit 12.71 age 0
epoch: 9 0.649 0.535 pop age:8 mean[3.45] fit:415.64
bOrg: 0.826 0.455 fit 17.96 age 0
epoch: 10 0.608 0.445 pop age:9 mean[3.02] fit:408.99
bOrg: 0.826 0.455 fit 17.96 age 1 vld: 9.71 (epoch 10) se:68.00 sp:100.00 ac:78.38
epoch: 11 0.650 0.445 pop age:10 mean[2.98] fit:463.34
bOrg: 0.826 0.455 fit 17.96 age 2 vld: 9.71 (epoch 10) se:68.00 sp:100.00 ac:78.38
epoch: 12 0.615 0.405 pop age:11 mean[3.24] fit:467.19
bOrg: 0.826 0.455 fit 17.96 age 3 vld: 9.71 (epoch 10) se:68.00 sp:100.00 ac:78.38
epoch: 13 0.666 0.454 pop age:12 mean[2.98] fit:532.92
bOrg: 0.826 0.455 fit 17.96 age 4 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 14 0.662 0.401 pop age:13 mean[3.19] fit:627.14
bOrg: 0.826 0.455 fit 17.96 age 5 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 15 0.667 0.390 pop age:14 mean[3.35] fit:517.49
bOrg: 0.711 0.216 fit 22.49 age 0 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 16 0.662 0.372 pop age:15 mean[3.08] fit:508.93
bOrg: 0.711 0.216 fit 22.49 age 1 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 17 0.638 0.339 pop age:16 mean[2.65] fit:732.92
bOrg: 0.711 0.216 fit 22.49 age 2 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 18 0.676 0.366 pop age:17 mean[3.27] fit:793.93
bOrg: 0.711 0.216 fit 22.49 age 3 vld: 33.02 (epoch 18) se:100.00 sp:100.00 ac:100.00
epoch: 19 0.648 0.333 pop age:18 mean[3.35] fit:741.40
bOrg: 0.711 0.216 fit 22.49 age 4 vld: 33.02 (epoch 18) se:100.00 sp:100.00 ac:100.00
...
...
epoch: 187 0.747 0.213 pop age:186 mean[1.85] fit:2343.14
bOrg: 0.778 0.243 fit 42.80 age 0 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 188 0.747 0.213 pop age:187 mean[2.23] fit:1630.09
bOrg: 0.778 0.243 fit 42.80 age 1 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 189 0.747 0.213 pop age:188 mean[2.17] fit:2050.81
bOrg: 0.778 0.243 fit 42.80 age 2 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 190 0.748 0.214 pop age:189 mean[2.36] fit:1899.64
bOrg: 0.778 0.243 fit 42.81 age 0 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 191 0.751 0.217 pop age:190 mean[2.41] fit:1692.95
bOrg: 0.778 0.243 fit 42.81 age 1 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 192 0.752 0.223 pop age:191 mean[2.00] fit:1869.26
bOrg: 0.778 0.243 fit 42.81 age 0 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 193 0.755 0.221 pop age:192 mean[2.00] fit:1965.02
bOrg: 0.778 0.243 fit 42.82 age 0 vld: 81.72 (epoch 193) se:100.00 sp:100.00 ac:100.00
epoch: 194 0.762 0.227 pop age:193 mean[2.42] fit:2118.48
bOrg: 0.778 0.243 fit 42.82 age 1 vld: 81.72 (epoch 193) se:100.00 sp:100.00 ac:100.00
epoch: 195 0.766 0.232 pop age:194 mean[2.63] fit:2538.75
bOrg: 0.778 0.243 fit 42.82 age 2 vld: 81.72 (epoch 193) se:100.00 sp:100.00 ac:100.00
epoch: 196 0.778 0.244 pop age:195 mean[2.66] fit:2140.18
bOrg: 0.778 0.243 fit 42.82 age 3 vld: 81.77 (epoch 196) se:100.00 sp:100.00 ac:100.00
epoch: 197 0.778 0.243 pop age:196 mean[2.86] fit:2140.25
bOrg: 0.778 0.243 fit 42.82 age 4 vld: 81.77 (epoch 196) se:100.00 sp:100.00 ac:100.00
epoch: 198 0.778 0.243 pop age:197 mean[2.76] fit:2311.59
bOrg: 0.778 0.243 fit 42.82 age 0 vld: 81.81 (epoch 198) se:100.00 sp:100.00 ac:100.00
epoch: 199 0.778 0.243 pop age:198 mean[2.52] fit:2226.04
bOrg: 0.778 0.243 fit 42.82 age 1 vld: 81.81 (epoch 198) se:100.00 sp:100.00 ac:100.00
epoch: 200 0.777 0.242 pop age:199 mean[2.49] fit:1583.92
bOrg: 0.778 0.243 fit 42.82 age 2 vld: 81.81 (epoch 198) se:100.00 sp:100.00 ac:100.00
training time: 00:00:00:781
classification results: maxacur.nn
train set: 49 25
sensitivity: 97.96
specificity: 96.00
+predictive: 97.96
-predictive: 96.00
accuracy: 97.30
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 88.46
specificity: 100.00
+predictive: 100.00
-predictive: 81.25
accuracy: 92.31
Taking the last epoch (200) as an example, the per-epoch output means: the population's mean output on the train set is 0.777 for the positive class and 0.242 for the negative class; the population age is 199; the mean age of the organisms is 2.49; and the mean fitness is 1583.92 (the inverse of the MSE). bOrg is the fittest organism, with a mean output on the train set of 0.778 for the positive class and 0.243 for the negative class, an age of 2, and its classification results on the validation set.
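Since the fitness is just the inverse of the MSE, the figures above translate directly into error values; as a tiny illustration (not the project's code):

// Relation between the printed fitness and the error (illustration only):
// a mean fitness of 1583.92 corresponds to an MSE of about 1 / 1583.92 = 0.00063.
float fitness_from_mse(float mse)
{
    return 1.0f / mse;
}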
For comparison purposes, I ran GA training over several iterations with random subdivisions of the IRIS training data and took the configuration with the best classification rate. I repeated the same for a neural network trained with the backpropagation learning algorithm from my article: Backpropagation Artificial Neural Network in C++. Below, I present the networks' weights and classification rates. We can see that the GA achieved performance very close to that of the backpropagation algorithm:
GA results:
4
4 3 2 1
0
1
-44.000000 0.115305
-22.000000 0.257352
-10.000000 0.057181
-1.000000 0.136029
-1.018195
1.106349
-0.252558
-3.816144
0.060086
2.665148
0.106655
-0.719681
1.192948
-2.096924
-0.999024
0.572130
-1.355700
0.017045
1.615307
-2.008180
-8154.750000
-0.141530
7.478271
-1.202083
0.817744
1.999939
-2.613441
-0.727891
-1.999817
14.607610
train set: 49 25
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 96.15
specificity: 100.00
+predictive: 100.00
-predictive: 92.86
accuracy: 97.44
The backpropagation algorithm results:
4
4 3 2 1
0
1
-43.000000 0.109332
-23.000000 0.251434
-11.000000 0.055665
-1.000000 0.133365
7.644542
1.920880
0.046209
-4.151631
-2.031714
-8.133764
1.372707
-1.903027
2.587456
2.505832
1.232619
-0.139417
0.577583
1.139807
1.212673
-1.549961
-4.962042
4.220874
-1.322594
1.052067
2.433295
-2.301689
0.902922
0.755153
-4.881694
2.221259
train set: 49 25
sensitivity: 97.96
specificity: 100.00
+predictive: 100.00
-predictive: 96.15
accuracy: 98.65
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
The genetic algorithm
The GA-specific classes in the project are:
Gene
Chromosome
Organism
Population
I'll describe them in that order.
Usually, for classification problems where floating point numbers are involved, an individual gene is represented by that number, and the mutation operator just adds a random value to it. But in natural processes, a gene consists of a sequence of individual bits, 1 or 0, and mutation simply inverts a random bit in the gene. You cannot use bit inversion directly with floating point numbers, as you may get either a NaN or a very large number approaching the floating point limits, while for neural network classification the typical range of the weights is about -1.0 to 1.0. To keep an integer representation of the gene while still obtaining a floating point value, I use a 32-bit integer as an individual gene and compute its floating point value by dividing the HI signed word by the LO signed word. The only necessary condition is to avoid zero words, since that could lead to a division by zero or to zero coefficients in the neural network, which in turn may produce, during selection and mutation, organisms with a majority of zero genes carrying no actual information.
The Gene class is presented below; its constructor either initializes the gene randomly or copies it from another gene.
Gene::Gene()
{
unsigned short lo = 0xFFFF & (2 * rand());
while (!lo)
lo = 0xFFFF & (2 * rand());
unsigned short hi = 0xFFFF & (2 * rand());
while (!hi)
hi = 0xFFFF & (2 * rand());
m_gene = lo | hi << 16;
}
Gene::Gene(unsigned int gene): m_gene(gene)
{
}
class Gene
{
friend class Population;
friend class Chromosome;
public:
Gene();
Gene(unsigned int gene);
inline float gene() const;
private:
Gene(const Gene& gene);
const Gene& operator=(const Gene& gene);
unsigned int m_gene;
};
inline float Gene::gene() const
{
short lo = 0xFFFF & m_gene;
short hi = 0xFFFF & (m_gene >> 16);
return (float)hi / (float)lo;
}
I'm using friend classes, as a protected interface is not necessary in this library.
The next dilemma was how to intelligently compose the structure of the organism from chromosomes of genes, that is, of a neural network consisting of floating point weights, so that mating two parent neural network configurations produces improved offspring for a given classification task. Randomly interchanging weights between different layers does not work at all. I solved that problem by treating every individual neuron in the network as a chromosome whose genes represent the input connections to that neuron, that is, its floating point weights. In this way, the corresponding chromosomes (neurons) are matched during the crossover process: the first chromosome (neuron) of the first selected parent (neural network) interchanges its weights with the first chromosome of the second selected parent, the second chromosome with the corresponding second one, and so on. The crossover is intelligent in the sense that weights are never interchanged between neurons of the last layer and neurons of the first layer.
As you can see, the whole neural network acts as the organism, represented by the Organism class, and each individual neuron is represented by a Chromosome object whose number of genes equals the number of input weight connections to that neuron. For example, a neural network with 3 layers of 10, 5, and 1 neurons is represented by an organism consisting of 5 + 1 = 6 chromosomes with 5 * (10 + 1) + 1 * (5 + 1) = 55 + 6 = 61 genes in total (the added 1 is the bias connection). The chromosomes of the first hidden layer consist of 11 genes each, and the last layer chromosome has 6 genes. That way, the chromosomes of the first hidden layer of one parent interchange their genes with the corresponding chromosomes of the first hidden layer of the second parent.
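As a sanity check of that arithmetic, here is a small standalone snippet of my own (not part of the library) that counts the chromosomes and genes for such a 10-5-1 configuration, following the mapping just described:

// Count chromosomes and genes for a layer configuration (illustration only):
// one chromosome per neuron, one gene per input connection plus the bias.
#include <cstdio>

int main()
{
    const int neurons_per_layer[] = {10, 5, 1};
    const int layers_number = 3;
    int chromosomes = 0, genes = 0;
    for (int l = 0; l < layers_number - 1; l++) {
        chromosomes += neurons_per_layer[l + 1];
        genes += neurons_per_layer[l + 1] * (neurons_per_layer[l] + 1);
    }
    printf("chromosomes: %d genes: %d\n", chromosomes, genes);   // prints 6 and 61
    return 0;
}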
The Chromosome class is shown below:
Chromosome::Chromosome(int genes_number)
{
for (int g = 0; g < genes_number; g++)
m_genes.push_back(new Gene());
}
Chromosome::Chromosome(vector<Gene *> *genes)
{
for (int g = 0; g < (int)genes->size(); g++)
m_genes.push_back(new Gene(genes->at(g)->m_gene));
}
Chromosome::~Chromosome()
{
for (int g = 0; g < get_genes_number(); g++)
delete m_genes[g];
}
class Chromosome
{
friend class Population;
friend class Organism;
public:
Chromosome(int genes_number);
Chromosome(vector<Gene *> *pgenes);
~Chromosome();
inline int get_genes_number() const;
inline const Gene *get_gene(int g) const;
private:
Chromosome(const Chromosome& chrom);
const Chromosome& operator=(const Chromosome& chrom);
vector<Gene *> m_genes;
};
inline int Chromosome::get_genes_number() const
{
return (int) m_genes.size();
}
inline const Gene *Chromosome::get_gene(int g) const
{
if (g > get_genes_number() - 1 || g < 0)
return 0;
return m_genes[g];
}
It contains the vector of genes, and you may either create random genes or copy them from another chromosome.
The Organism object is like a living organism with a number of chromosomes, each consisting of a different number of genes. The organism has an age, the number of populations it has lived through, and a fitness for solving the classification task, typically the inverse of the mean squared error (MSE) on the train, validation, or test set. You may create it from a specific neural network configuration (the number of layers and the number of neurons per layer), or copy it from an existing organism, duplicating its gene values.
Organism::Organism(int layers_number, const int *neurons_per_layer):
m_chromosomes_number(0), m_genes_number(0), m_age(0), m_fitness(-1.0f)
{
for (int i = 0; i < USERDATA_SIZE; i++)
user_data[i] = 0.0f;
for (int l = 0; l < layers_number - 1; l++) {
for (int c = 0; c < neurons_per_layer[l+1]; c++) {
Chromosome *pchr = new Chromosome(neurons_per_layer[l] + 1);
m_chromosomes.push_back(pchr);
for (int g = 0; g < neurons_per_layer[l] + 1; g++) {
m_genes_number++;
CHROM_GENE_INDEX pnt;
pnt.c = m_chromosomes_number;
pnt.g = g;
m_gene_index.push_back(pnt);
}
m_chromosomes_number++;
}
}
}
Organism::Organism(const Organism& org) : m_chromosomes_number(org.m_chromosomes_number),
m_genes_number(org.m_genes_number),
m_age(0), m_fitness(-1.0f)
{
for (int i = 0; i < USERDATA_SIZE; i++)
user_data[i] = org.user_data[i];
for (int i = 0; i < m_genes_number; i++)
m_gene_index.push_back(org.m_gene_index[i]);
for (int c = 0; c < m_chromosomes_number; c++)
m_chromosomes.push_back(new Chromosome(&org.m_chromosomes[c]->m_genes));
}
typedef struct _chrom_gene_index {
int c;
int g;
} CHROM_GENE_INDEX, *PCHROM_GENE_INDEX;
#define USERDATA_SIZE 2
class Organism
{
friend class Population;
public:
Organism(int layers_number, const int *neurons_per_layer);
Organism(const Organism& org);
~Organism();
inline int get_chromosomes_number() const;
inline const Chromosome *get_chromosome(int c) const;
inline CHROM_GENE_INDEX get_gene_index(int i) const;
inline float fitness() const;
inline void fitness(float fitness);
inline int age() const;
inline void age(int age);
float user_data[USERDATA_SIZE];
private:
const Organism& operator=(const Organism& org);
vector<Chromosome *> m_chromosomes;
vector<CHROM_GENE_INDEX> m_gene_index;
int m_chromosomes_number;
int m_genes_number;
float m_fitness;
int m_age;
};
The m_gene_index member is an array of CHROM_GENE_INDEX structures. It allows a single gene to be picked at random from the chromosomes during the mutation process; it simply stores the coordinates of each individual gene within its particular chromosome.
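To make the mapping from an organism back to network weights concrete, here is a hypothetical helper of my own (not part of the library) that flattens an organism's decoded gene values into a plain weight array, using only the accessors declared above:

// Hypothetical helper (not part of the library): flatten an organism's genes
// into floating point weights, neuron by neuron. Assumes the Gene, Chromosome,
// and Organism declarations shown above are visible.
#include <vector>
using namespace std;

vector<float> organism_to_weights(const Organism &org)
{
    vector<float> weights;
    for (int c = 0; c < org.get_chromosomes_number(); c++) {
        const Chromosome *chrom = org.get_chromosome(c);
        for (int g = 0; g < chrom->get_genes_number(); g++)
            weights.push_back(chrom->get_gene(g)->gene());    // HI word / LO word
    }
    return weights;
}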
The Population class is the container of the organisms.
#define PSZVAR 30+1 //variation of population size 30% from m_meansize
class Population
{
enum CROSOVER_TYPE {ONEPOINT, TWOPOINT, UNIFORM, ARITHMETIC};
public:
Population(int mean_size, int layers_number, const int *neurons_per_layer);
~Population();
void populate();
inline int crossover_rate() const;
inline void crossover_rate(int c_rate);
inline int mutation_rate() const;
inline void mutation_rate(int m_rate);
inline int population_size() const;
inline Organism *get_organism(int i);
inline const Organism *get_organism(int i) const;
inline int get_age() const;
inline float get_fitness() const;
inline float get_fittest_meanage() const;
private:
Population(const Population& pop);
const Population& operator=(const Population& pop);
vector<Organism *> m_organisms;
int m_crossover_rate;
int m_mutation_rate;
int m_mean_size;
int m_age;
float m_fittest_meanage;
float m_fitness;
int m_parents_limit;
int m_offsprings_limit;
int m_fittest_limit;
const Organism *run_roulette_wheel(float population_fitness) const;
Organism *mate_parents(const Organism *X, const Organism *Y) const;
};
Population::Population(int mean_size, int layers_number,
const int *neurons_per_layer): m_mean_size(mean_size),
m_age(0), m_fittest_meanage(0.0f), m_fitness(0.0f),
m_crossover_rate(2), m_mutation_rate(1000)
{
srand((unsigned int)time(0));
float a = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_parents_limit = m_mean_size + int(a * (float)m_mean_size);
float b = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_offsprings_limit = m_mean_size + int(b * (float)m_mean_size);
float f = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_fittest_limit = m_mean_size + int(f * (float)m_mean_size);
for (int i = 0; i < (m_parents_limit + m_offsprings_limit); i++) {
Organism *porg = new Organism(layers_number, neurons_per_layer);
m_organisms.push_back(porg);
}
}
You need to initialize it this way:
ann = new ANNetwork(argv[2]);
if (ann->status() < 0) {
wprintf(L"failed to load network: %s", argv[2]);
exit(1);
}
int *cnf = new int[ann->get_layers_number()];
for (int i = 0; i < ann->get_layers_number(); i++)
cnf[i] = ann->get_layer(i)->get_neurons_number();
gpop = new Population(parents_number, ann->get_layers_number(), cnf);
gpop->crossover_rate(crossover_rate);
gpop->mutation_rate(mutation_rate);
delete[] cnf;
During construction, it will create the initial population of parents and offspring with random genes, and will define the limit of fittest organisms that can pass to the next population; the rest simply die out. For a population size of 50, it will create 50±15 parents and 50±15 offspring, and set the limit of the fittest population size to 50±15. For example, 60 parents and 48 offspring give 60 + 48 = 108 organisms in total. If the fittest limit is set at 55, then after the fitness of every organism is estimated on the training data, the 108 - 55 = 53 least fit organisms are removed from the population, and the 55 fittest survive to the next generation. The process is then repeated: producing new offspring, estimating fitness on the training data, and removing unfit individuals.
void Population::populate()
{
m_age++;
random_shuffle(m_organisms.begin(), m_organisms.end());
while (population_size() > m_fittest_limit) {
int ind = 0;
Organism *unfit = m_organisms[ind];
for (int i = 1; i < population_size(); i++) {
if (m_organisms[i]->m_fitness <= unfit->m_fitness) {
ind = i;
unfit = m_organisms[i];
}
}
delete unfit;
m_organisms.erase(m_organisms.begin() + ind);
}
m_parents_limit = m_fittest_limit;
m_fittest_meanage = 0.0f;
m_fitness = 0.0f;
for (int i = 0; i < population_size(); i++) {
m_organisms[i]->m_age++;
m_fitness += m_organisms[i]->m_fitness;
m_fittest_meanage += m_organisms[i]->m_age;
}
m_fittest_meanage /= (float)population_size();
vector<Organism *> siblings;
float b = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_offsprings_limit = m_mean_size + int(b * (float)m_mean_size);
float f = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_fittest_limit = m_mean_size + int(f * (float)m_mean_size);
for (int s = 0; s < m_offsprings_limit; s++) {
const Organism *X = run_roulette_wheel(m_fitness);
const Organism *Y = run_roulette_wheel(m_fitness);
while (X == Y)
Y = m_organisms[rand()%population_size()];
int cnum = rand() % 5;
cnum++;
for (int i = 0; i < cnum; i++) {
siblings.push_back(mate_parents(X, Y));
s++;
if (!(s < m_offsprings_limit))
break;
}
}
for (int s = 0; s < (int)siblings.size(); s++)
m_organisms.push_back(siblings[s]);
}
For the selection, I use the roulette wheel method.
inline const Organism *Population::run_roulette_wheel(float population_fitness) const
{
float psum = 0.0f;
float ind = float(rand() % 100);
for (int i = 0; i < population_size(); i++) {
psum += 100.0f * (m_organisms[i]->m_fitness / population_fitness);
if (psum > ind)
return m_organisms[i];
}
return m_organisms[population_size()-1];
}
And the mating, followed by mutation, is defined by this function:
inline Organism *Population::mate_parents(const Organism *X,
const Organism *Y) const
{
Organism *child = new Organism(*X);
int type = rand() % 2;
switch (type) {
case ONEPOINT:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (m_crossover_rate > 1 && rand() % m_crossover_rate) continue;
int gnum = child->m_chromosomes[c]->get_genes_number();
if (gnum == 1) continue;
int p = (rand() % (gnum - 1)) + 1;
for (int g = 0; g < p; g++)
child->m_chromosomes[c]->m_genes[g]->m_gene =
Y->m_chromosomes[c]->m_genes[g]->m_gene;
}
break;
case TWOPOINT:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (m_crossover_rate > 1 && rand() % m_crossover_rate) continue;
int gnum = child->m_chromosomes[c]->get_genes_number();
if (gnum <= 2) continue;
int p1 = (rand() % (gnum - 1)) + 1;
int p2 = (rand() % (gnum - 1)) + 1;
while (p1 == p2)
p2 = (rand() % (gnum - 1)) + 1;
if (p1 > p2) swap(p1, p2);
for (int g = 0; g < p1; g++)
child->m_chromosomes[c]->m_genes[g]->m_gene =
Y->m_chromosomes[c]->m_genes[g]->m_gene;
for (int g = p2; g < gnum; g++)
child->m_chromosomes[c]->m_genes[g]->m_gene =
Y->m_chromosomes[c]->m_genes[g]->m_gene;
}
break;
case UNIFORM:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (rand() % m_crossover_rate) continue;
for (int g = 0; g < child->m_chromosomes[c]->get_genes_number(); g++) {
unsigned int res = 0;
unsigned int gX = child->m_chromosomes[c]->m_genes[g]->m_gene;
unsigned int gY = Y->m_chromosomes[c]->m_genes[g]->m_gene;
for (int i = 0; i < 32; i++) {
if (rand() % 2)
res |= (1 << i) & gY;
else
res |= (1 << i) & gX;
}
if ((0xFFFF&res) && (res >> 16))
child->m_chromosomes[c]->m_genes[g]->m_gene = res;
else
child->m_chromosomes[c]->m_genes[g]->m_gene = gY;
}
}
break;
case ARITHMETIC:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (m_crossover_rate > 1 && rand() % m_crossover_rate) continue;
for (int g = 0; g < child->m_chromosomes[c]->get_genes_number(); g++) {
unsigned int gX = child->m_chromosomes[c]->m_genes[g]->m_gene;
unsigned int gY = Y->m_chromosomes[c]->m_genes[g]->m_gene;
if ((0xFFFF&(gX&gY)) && ((gX&gY) >> 16))
child->m_chromosomes[c]->m_genes[g]->m_gene = gX & gY;
else
child->m_chromosomes[c]->m_genes[g]->m_gene = gY;
}
}
break;
}
if (rand() % m_mutation_rate == 0) {
CHROM_GENE_INDEX *pnt;
unsigned int gene, mask;
int N = rand() % (child->m_chromosomes_number);
for (int n = 0; n < N + 1; n++) {
pnt = &child->m_gene_index[rand()%child->m_genes_number];
gene = child->m_chromosomes[pnt->c]->m_genes[pnt->g]->m_gene;
mask = 1 << (rand() % 32);
if ((0xFFFF&(gene ^ mask)) && ((gene ^ mask) >> 16))
child->m_chromosomes[pnt->c]->m_genes[pnt->g]->m_gene = gene ^ mask;
}
}
return child;
}
Though I use only the one- and two-point crossovers, as the others tend to give poor results on classification tasks, you may experiment yourself: just uncomment the line with 4 and use it instead of the one with 2:
int type = rand() % 4;