***********************************
Selection and Population Management
***********************************

* One of the main driving forces of evolutionary computation algorithms is the variation operators
* Another major driving force is the selection strategy

    * This determines an individual's ability to survive and have offspring based on their fitness values


* Selection is based on the *relative* fitness values of the individual
* It works independently from the actual problem and representation


Generational vs. Steady State
=============================

* All the algorithms provided so far have been implemented as a generational algorithm

    * A whole new population is created based on the current population
    * The new population becomes the current population
    * The population size remains the same
    * There are very discrete transitions between generations


* With the generational algorithm, depending on how the genetic operators are applied, the new population is filled with

    * Copies of unchanged selected chromosomes
    * Mutated selected chromosomes
    * Offspring resulting from crossover on selected chromosomes


* On the other hand, steady state algorithms provide a more continuous type of evolution

    * Not as clear discrete transitions between generations


* With steady state

    * Given a single population of size :math:`\mu`
    * Select some number of chromosomes :math:`\lambda` and apply the variation operators to produce the offspring
    * Replace :math:`\lambda` chromosomes in the population with the offspring


* The *generational gap* is the proportion of the the population that was replaced

    * :math:`\frac{\lambda}{\mu}`


* If the number of chromosomes selected is equal to the population size, it is equivalent to a generational algorithm

    * If :math:`\lambda = \mu`
    * When the generational gap is :math:`\frac{\lambda}{\mu} = 1`


* With a generational algorithm, the whole population is replaced
* But with a generational gap less than one, there needs to be another selection mechanism for replacement

    * Determining which candidate solutions of the population are to be replaced with the :math:`\lambda` offspring


Tournament Selection
====================

* Two very basic selections *could* be implemented

#. Uniform selection

    * Each chromosome has an equal chance at being selected

        * :math:`p(i) = \frac{1}{\mu}`
        * Where :math:`p(i)` is the probability of an individual chromosome being selected


    * Some forms of evolutionary computation use this selection exclusively
    * However, in general, it's not going to perform well as there is nothing guiding the search

        * The algorithm will not converge
        * There is low *selection pressure*


#. Select the top chromosomes only and apply the genetic operators

    * As already seen, this will not perform well as the population will converge too quickly on some local optimum
    * There is high *selection pressure*


* An alternative highly effective selection is to go somewhere in between these ideas
* This is where tournament selection comes in

    * Randomly pick a subset of :math:`k` chromosomes from the population
    * Select the best of these :math:`k` chromosomes


.. figure:: tournament_selection.png
    :width: 600 px
    :align: center
    :target: https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm

    Tournament selection being performed on a population of size 13. Here, :math:`k = 3`, meaning three chromosomes were
    selected at random. Of the three, the candidate solution with the highest fitness is then returned as the selected
    chromosome.


* The value of :math:`k` is typically kept low, but can be adjusted as needed

    * If :math:`k = 1`, this would be the same as a uniform selection
    * If :math:`k = \mu`, this would be the same as always selecting the best
    * As :math:`k` increases, so does the selection pressure


Fitness Proportional Selection
==============================

* There are a collection of selection strategies that are fitness proportional

    * The probability of selecting an individual depends on its fitness value compared to the whole population's fitness


* The simplest of these is to have the probability be the individual's proportion of the total population fitness

    * :math:`p(i) = \frac{f(i)}{\sum^{\mu}_{j}f(j)}`
    * Where :math:`p(i)` is the probability of selecting individual :math:`i`
    * And :math:`f(i)` is the fitness of individual :math:`i`
    * Note that the sum of the probabilities must be one :math:`\sum^{\mu}_{j=1}p(j) = 1`


* The benefit here is that by selecting highly fit individuals, the search may produce high quality solutions quickly
* However, this may cause the population to get stuck in a local maximum

    * Premature convergence
    * It is important to ensure the population stays diverse


* Further, when the population begins to converge, this strategy becomes similar to a uniform selection

    * Learning will stagnate when the fitness values of the chromosomes within the population are similar


* Modifications to this strategy include adding some constant or *windowing*

    * Windowing is subtracting the minimum fitness within the population from all individuals within the population


* Several selection probabilities are shown in the below table for some maximization problem

    * One regular, one with adding some constant, and one and windowing


* Here, adding a constant made the lowest fit individual more likely to be selected
* While also making the selection probabilities more similar

* With windowing, the lowest fit individual had zero probability of being selected
* While also making the selection probabilities more different


.. list-table:: Modifications to Fitness Proportional Selection
    :widths: 50 50 50 50 50 50
    :header-rows: 1

    * - :math:`i`
      - :math:`f(i)`
      - :math:`p(i)` for :math:`f(i)`
      - :math:`f(i) + 10`
      - :math:`p(i)` for :math:`f(i) + 10`
      -  Windowing :math:`p(i)` for :math:`f(i)`
    * - A
      - :math:`1`
      - :math:`0.1`
      - :math:`11`
      - :math:`0.257`
      - :math:`0`
    * - B
      - :math:`4`
      - :math:`0.4`
      - :math:`14`
      - :math:`0.350`
      - :math:`0.333`
    * - C
      - :math:`5`
      - :math:`0.5`
      - :math:`15`
      - :math:`0.375`
      - :math:`0.666`
    * - **Total**
      - **10**
      - **1.0**
      - **40**
      - **1.0**
      - **1.0**


Rank Based Selection
^^^^^^^^^^^^^^^^^^^^

* An alternative is to take the same idea, but rank the candidate solutions
* Then, have the probability of being selected related to their rank, not their fitness value

.. list-table:: Rank Based Proportional Selection
    :widths: 50 50 50 50 50
    :header-rows: 1

    * - :math:`i`
      - :math:`f(i)`
      - :math:`p(i)` for :math:`f(i)`
      - Ranking
      - :math:`p(i)` for Ranking
    * - A
      - :math:`1`
      - :math:`0.1`
      - :math:`0`
      - :math:`0`
    * - B
      - :math:`4`
      - :math:`0.4`
      - :math:`1`
      - :math:`0.333`
    * - C
      - :math:`5`
      - :math:`0.5`
      - :math:`2`
      - :math:`0.666`
    * - **Total**
      - **10**
      - **1.0**
      -
      - **1.0**


* The above table shows a simple linear ranking being used, but other more complex forms may be used
* The benefit of this strategy is that, as the population converges, selection pressure will not become lower
* The downside is the computational overhead of ranking each candidate solution


Roulette Wheel Selection
^^^^^^^^^^^^^^^^^^^^^^^^

* Fitness proportional selection is commonly referred to as a *roulette wheel* selection


.. figure:: roulette_wheel_selection.png
    :width: 600 px
    :align: center
    :target: https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm

    Visualization of roulette wheel selection. Each individual is assigned a piece of the wheel proportional to that
    individual's proportion of fitness of the population's total fitness.


Stochastic Universal Sampling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Typically, the roulette wheel has one arm and is *spun* :math:`\lambda` times
* An alternative is to spin a roulette wheel with :math:`\lambda` arms once

    * This means the wheel is spun only once
    * This produces a diverse set of selected individuals


.. figure:: stochastic_universal_sampling_selection.png
    :width: 600 px
    :align: center
    :target: https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm

    Example of a roulette wheel where :math:`\lambda=2`.


.. warning::

    Although possible, fitness proportional selections often requires some alterations to work with algorithms
    minimizing fitness or negative fitness values.


Survivor Selection
==================

* With a steady state algorithm, survivor selection is needed

    * Which individuals survive and which are replaced


* Any rules could be used, but common ones include

    * Random

        * Randomly select :math:`\lambda` to replace


    * Age based

        * Replace the :math:`\lambda` oldest individuals
        * Ignores fitness
        * Might kill off the best individuals


    * Fitness based

        * Replace the :math:`\lambda` worst individuals based on fitness
        * Will kill off the worst candidate solutions
        * May cause the population to converge prematurely


    * Diversity based

        * Replace :math:`\lambda` candidate solutions that are the most similar
        * The idea is, replace those that add little diversity to the population


    * Similarity based

        * Replace the :math:`\lambda` candidate solutions with the most similar fitness to the offspring
        * Similar to diversity and easy to implement with an assumption
        * Those with a similar fitness may be similar in the genotype space


Elitism
^^^^^^^

* Regardless of using a generational or steady state algorithm, elitism is commonly used in evolutionary computation
* The idea is, always have a copy of the best :math:`x` chromosomes in the population

    * Typically :math:`x=1`
    * Having :math:`x` too high will cause premature convergence


Diversity
=========

* Diversity is how much the members of the populations differ from one another
* Diversity is important within a population

    * Balances exploration and exploitation
    * It helps to prevent premature convergence


* As with anything with evolutionary computation, there are no rules on what should be done
* Below are some high-level ideas, but is in no way exhaustive

    * Explicitly add a diversity measure to the fitness calculation
    * Only have similar individuals compete with one another
    * *The Island Model*

        * Distribute the population into sub-populations that evolve independently with periodic migrations


    * *Ring Species*

        * Only allow individuals to mate if they are *close* to one another within the population


.. figure:: island_model.png
    :width: 250 px
    :align: center

    Island model layout with three sub-populations. Each of the three sub-populations evolve independently. This allows
    each sub-population to explore the search space along its own path, thereby preserving diversity between the
    sub-populations. Periodically, members of the sub-populations migrate to other sub-populations to introduce
    diversity to the individual sub-populations.


.. figure:: ring_species.png
    :width: 250 px
    :align: center

    Ring species treats the population as a ring/circle and mating can only occur between chromosomes if they exist
    within some distance (number of indices) of one another. In this example, the distance :math:`d=3`, therefore only
    the candidate solutions at indices 22, 23, 24, 1, 2, 3, and 4 would be eligible for mating.


* Further, there are no rules on where the diversity should be measured

    * Genotype space

        * How similar the chromosomes are
        * Depends on the encoding
        * Hamming distance?
        * Levenshtein distance?


    * Phenotype space

        * How different are the phenotypes (what the chromosomes represent)
        * May be very different from the genotype space's distance


    * Algorithm space

        * Some *distance* based on the algorithm's framework
        * For example, ring species and the island model


For Next Class
==============

* TBD