|
1 |
| -# todo |
2 |
| -refactoring https://gist.github.com/blainerothrock/efda6e12fe10792c99c990f8ff3daeba for swift 4. Creating a tutorial and writing in more playground friendly format |
# Genetic Algorithm

## What is it?

A genetic algorithm (GA) is a process inspired by natural selection for finding high-quality solutions, most commonly used for optimization. GAs rely on three bio-inspired processes: selection (fitness), mutation, and crossover. To understand them better, let's walk through these processes in terms of biology:

### Selection
>**Selection**, in biology, the preferential survival and reproduction or preferential elimination of individuals with certain genotypes (genetic compositions), by means of natural or artificial controlling factors. [Britannica](britannica)

In other words, survival of the fittest. Organisms that survive in their environment tend to reproduce more. With GAs we define a fitness function that ranks individuals and gives the fitter ones a better chance of reproducing.

### Mutation
>**Mutation**, an alteration in the genetic material (the genome) of a cell of a living organism or of a virus that is more or less permanent and that can be transmitted to the cell’s or the virus’s descendants. [Britannica](https://www.britannica.com/science/mutation-genetics)

Mutation is the randomization that allows organisms to change over time. In GAs we build a randomization process that mutates offspring in a population in order to randomly introduce fitness variance.

### Crossover
>**Chromosomal crossover** (or crossing over) is the exchange of genetic material between homologous chromosomes that results in recombinant chromosomes during sexual reproduction [Wikipedia](https://en.wikipedia.org/wiki/Chromosomal_crossover)

Simply put, reproduction. Each generation is a mixed representation of the previous generation, with offspring taking data (DNA) from both parents. GAs do this by mating individuals at random, weighted by fitness, to create new generations.

### Resources:
* [Wikipedia]()


## The Code

### Problem
For this quick and dirty example, we are going to produce an optimized string using a simple genetic algorithm. More specifically, we take a randomly generated origin string of a fixed length and evolve it toward an optimal string of our choosing.

We will be creating a bio-inspired world where the pinnacle of existence is the string `Hello, World`. Nothing in this universe is better, and it's our goal to get as close to it as possible.

### Define the Universe

Before we dive into the core processes, we need to set up our "universe". First, let's define a lexicon: the set of everything that exists in our universe.

```swift
let lex: [UInt8] = " !\"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~".asciiArray
```

To make things easier, we are actually going to work in ASCII values, so let's define a String extension to help with that.

```swift
extension String {
    var asciiArray: [UInt8] {
        return [UInt8](self.utf8)
    }
}
```
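
As a quick check of the extension, the sketch below converts a short string to its byte values and back again. The reverse helper `text(from:)` is not part of the original code; it is a hypothetical convenience for reading individuals in a playground.

```swift
extension String {
    /// The string's UTF-8 bytes; for plain ASCII text these are the ASCII codes.
    var asciiArray: [UInt8] {
        return [UInt8](self.utf8)
    }
}

/// Hypothetical helper (not in the original): turns a byte array back into text.
func text(from dna: [UInt8]) -> String {
    return String(decoding: dna, as: UTF8.self)
}

let bytes = "Hi!".asciiArray        // [72, 105, 33]
let roundTrip = text(from: bytes)   // "Hi!"
print(bytes, roundTrip)
```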

Now, let's define a few global variables for the universe:

```swift
// The end goal, which we use to rate fitness. In the real world this would not exist.
let OPTIMAL: [UInt8] = "Hello, World".asciiArray

// The length of each string in our population. Organisms need to be similar.
let DNA_SIZE = OPTIMAL.count

// Size of each generation
let POP_SIZE = 50

// Max number of generations; the script stops at 5000 if the optimal value is not found
let GENERATIONS = 5000

// The chance that any given nucleotide mutates (1/n)
let MUTATION_CHANCE = 100
```

The last piece we need for setup is a function that gives us a random ASCII value from our lexicon:

```swift
import Foundation // provides arc4random_uniform

func randomChar(from lexicon: [UInt8]) -> UInt8 {
    // arc4random_uniform(n) returns a value in 0..<n, so pass the full count
    // to make every character in the lexicon reachable
    let len = UInt32(lexicon.count)
    let rand = Int(arc4random_uniform(len))
    return lexicon[rand]
}
```

### Population Zero

Before selecting, mutating, and reproducing, we need a population to start with. Now that we have the universe defined, we can write that function:

```swift
func randomPopulation(from lexicon: [UInt8], populationSize: Int, dnaSize: Int) -> [[UInt8]] {
    var pop = [[UInt8]]()

    for _ in 0..<populationSize {
        // Build one individual from dnaSize random characters
        var dna = [UInt8]()
        for _ in 0..<dnaSize {
            let char = randomChar(from: lexicon)
            dna.append(char)
        }
        pop.append(dna)
    }
    return pop
}
```

### Selection

There are two parts to the selection process. The first is calculating fitness, which assigns a rating to an individual. We do this by simply calculating how close the individual is to the optimal string using ASCII values:

```swift
func calculateFitness(dna: [UInt8], optimal: [UInt8]) -> Int {
    var fitness = 0
    // Sum the per-character distance from the optimal string; 0 is a perfect match
    for c in 0..<dna.count {
        fitness += abs(Int(dna[c]) - Int(optimal[c]))
    }
    return fitness
}
```

The above is a very simple fitness calculation, but it'll work for our example. In a real-world problem, the optimal solution is usually unknown or impractical to compute. [Here](https://iccl.inf.tu-dresden.de/w/images/b/b7/GA_for_TSP.pdf) is a paper about optimizing a solution for the famous [traveling salesman problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem) using GA. In that problem, finding the guaranteed best route becomes computationally infeasible as the number of cities grows, but you can still rate an individual solution by distance traveled. The optimal fitness here is an impossible 0. The closer the solution is to 0, the better its chance for survival.
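
The same "closer to 0 is better" reading applies to our string fitness. As a small worked example, the sketch below scores three candidates against `Hello, World`. The candidate strings are made up for illustration, and the function is repeated (with a half-open range) only so the snippet runs on its own.

```swift
func calculateFitness(dna: [UInt8], optimal: [UInt8]) -> Int {
    var fitness = 0
    for c in 0..<dna.count {
        fitness += abs(Int(dna[c]) - Int(optimal[c]))
    }
    return fitness
}

let optimal = [UInt8]("Hello, World".utf8)

// Identical strings: every per-character distance is 0
let exact = calculateFitness(dna: optimal, optimal: optimal)                       // 0

// 'p' (112) instead of 'o' (111): distance 1
let close = calculateFitness(dna: [UInt8]("Hellp, World".utf8), optimal: optimal)  // 1

// 'J' (74) instead of 'H' (72): distance 2
let further = calculateFitness(dna: [UInt8]("Jello, World".utf8), optimal: optimal) // 2

print(exact, close, further)
```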

The second part of selection is weighted choice, also called roulette wheel selection. This defines how individuals are selected for the reproduction process out of the current population. Just because you are the best choice for natural selection doesn't mean the environment will select you. The individual could fall off a cliff, get dysentery, or not be able to reproduce.

Let's take a second and ask why. Why would you not always want to select the fittest from a population? It's hard to see from this simple example, but think about dog breeding, where breeders remove this randomness and hand-select dogs for the next generation. As a result you get the desired characteristics, but the individuals also continue to carry the genetic disorders that come along with those traits. This essentially leads evolution down a single linear path, whereas a different "branch" of evolution may beat out the current fittest solution at a later time.

OK, back to code. Here is our weighted choice function:

```swift
func weightedChoice(items: [(item: [UInt8], weight: Double)]) -> (item: [UInt8], weight: Double) {
    // Sum of all weights: the size of the roulette wheel
    let total = items.reduce(0.0) { $0 + $1.weight }

    // Random point on the wheel, in 0..<total, with six decimal places of resolution
    var n = Double(arc4random_uniform(UInt32(total * 1000000.0))) / 1000000.0

    // Walk the wheel until we reach the slice that contains n
    for itemTuple in items {
        if n < itemTuple.weight {
            return itemTuple
        }
        n = n - itemTuple.weight
    }
    // Reached only in edge cases (e.g. floating-point rounding); fall back to the last item
    return items[items.count - 1]
}
```

The above function takes a list of individuals paired with their weights, then selects one at random, with each individual's odds proportional to its weight. Because a lower score from `calculateFitness` is better, the weight passed in should grow as that score shrinks.
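
One possible way to turn fitness scores into weights is sketched below. The formula `1 / (fitness + 1)` and the helper name `weightedPopulation` are assumptions made for illustration, not something the text above specifies; any mapping that gives fitter individuals a larger weight would serve the same purpose.

```swift
func calculateFitness(dna: [UInt8], optimal: [UInt8]) -> Int {
    var fitness = 0
    for c in 0..<dna.count {
        fitness += abs(Int(dna[c]) - Int(optimal[c]))
    }
    return fitness
}

/// Hypothetical helper: pairs each individual with a weight for roulette wheel selection.
/// Assumed weighting: 1 / (fitness + 1), so a perfect match gets 1.0 and
/// worse individuals get progressively smaller (but never zero) weights.
func weightedPopulation(_ population: [[UInt8]], optimal: [UInt8]) -> [(item: [UInt8], weight: Double)] {
    return population.map { (dna: [UInt8]) -> (item: [UInt8], weight: Double) in
        let fitness = calculateFitness(dna: dna, optimal: optimal)
        return (item: dna, weight: 1.0 / Double(fitness + 1))
    }
}

let target = [UInt8]("Hello, World".utf8)
let sample = [target, [UInt8]("Hellp, World".utf8)]
let weighted = weightedPopulation(sample, optimal: target)
print(weighted.map { $0.weight })   // [1.0, 0.5]
```

The resulting array has the shape that `weightedChoice(items:)` expects.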

### Mutation

The all-powerful mutation. The great randomization that turns bacteria into humans; just add time. So powerful, yet so simple:

```swift
func mutate(dna: [UInt8], mutationChance: Int) -> [UInt8] {
    var outputDna = dna

    for i in 0..<dna.count {
        // Each nucleotide mutates with probability 1/mutationChance
        let rand = Int(arc4random_uniform(UInt32(mutationChance)))
        if rand == 0 {
            outputDna[i] = randomChar(from: lex)
        }
    }

    return outputDna
}
```

This takes a mutation chance and an individual, and returns that individual with mutations, if any.

This allows a population to explore all the possibilities of its building blocks and randomly stumble on a better solution. If there is too much mutation, the evolution process will get nowhere. If there is too little, the population will become too similar and never be able to branch out of a defect to meet its changing environment.
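
To get a feel for how much mutation the settings used earlier produce, the arithmetic can be sketched as follows. The local constant names are illustrative stand-ins for `DNA_SIZE`, `MUTATION_CHANCE`, and `POP_SIZE`, and the figures count mutation events (a mutated position can occasionally receive the same character it already had).

```swift
import Foundation // for pow

let dnaSize = 12          // length of "Hello, World"
let mutationChance = 100  // each nucleotide mutates with probability 1/100
let populationSize = 50

let p = 1.0 / Double(mutationChance)

// Expected mutation events in one individual: dnaSize * p, about 0.12
let expectedPerIndividual = Double(dnaSize) * p

// Expected mutation events across one generation: about 6
let expectedPerGeneration = expectedPerIndividual * Double(populationSize)

// Chance that an individual has at least one mutation event: 1 - (1 - p)^dnaSize, roughly 0.11
let chanceOfAnyMutation = 1.0 - pow(1.0 - p, Double(dnaSize))

print(expectedPerIndividual, expectedPerGeneration, chanceOfAnyMutation)
```

With these numbers most individuals pass through a generation unchanged, which keeps good solutions intact while still introducing a handful of new variants each round.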

### Crossover
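
The biology section describes crossover as offspring taking DNA from both parents. A minimal sketch of one common way to do that, single-point crossover, is shown below. This is an assumption rather than the author's implementation: the function name, its signature, and the explicit `point` parameter (which would normally be chosen at random) are all illustrative.

```swift
/// Hypothetical single-point crossover: each child takes the head of one parent
/// and the tail of the other, split at `point`.
func crossover(_ parentA: [UInt8], _ parentB: [UInt8], at point: Int) -> (childA: [UInt8], childB: [UInt8]) {
    precondition(parentA.count == parentB.count, "parents must have equal length")
    precondition(point >= 0 && point <= parentA.count, "point out of range")

    let childA = Array(parentA[..<point] + parentB[point...])
    let childB = Array(parentB[..<point] + parentA[point...])
    return (childA, childB)
}

let a = [UInt8]("AAAAAA".utf8)
let b = [UInt8]("BBBBBB".utf8)
let children = crossover(a, b, at: 2)

print(String(decoding: children.childA, as: UTF8.self)) // AABBBB
print(String(decoding: children.childB, as: UTF8.self)) // BBAAAA
```

Passing the split point in as a parameter keeps the sketch deterministic; in a full run it could be drawn at random for each pair of parents picked by weighted choice.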