
Genetic algorithm w/ neural network playing snake is not improving


I am attempting to create a genetic algorithm to train a neural network, with the goal of playing the game snake.

The problem I am having is that the fitness of the generations isn't improving: it either sits at the fitness one could expect from giving the game no input at all, or gets worse after the first generation. I suspect an issue with the neural network, but I am at a loss as to what it is.

Neural Network setup

24 Input Nodes

2 Hidden layers

8 Nodes per layer

4 Output Nodes (One for each direction the snake can take)

The input is an array covering every direction the snake can see. For each of the 8 directions it records the distance to a wall, to a fruit, and to itself, so the end result is an array of length 3 * 8 = 24.

The weights and biases are random floats between -1 and 1, generated when the network is created.
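A minimal sketch of that initialization with NumPy (layer sizes taken from the setup above; the variable names are my own, not from my actual code):

```python
import numpy as np

rng = np.random.default_rng()

layer_sizes = [24, 8, 8, 4]  # input, hidden 1, hidden 2, output

# Uniform floats in [-1, 1) for every weight matrix and bias vector.
weights = [rng.uniform(-1, 1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
biases = [rng.uniform(-1, 1, size=n_out) for n_out in layer_sizes[1:]]
```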

Genetic Algorithm setup

Population size: 50000

Parents chosen per generation: 1000

Keep top per generation: 25000 (New variable, seeing better results)

Mutation chance per child: 5%

(I have tried many different size ratios, although I'm still not sure what a typical ratio is.)

I am using single point crossover. Every array of weights and biases is crossed over between the parents, and passed on to the children (one child for each "version" of the crossover).
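For one flat array of weights, the single-point crossover looks roughly like this (the function name and the use of a NumPy RNG are my own; the same idea is applied to each weight/bias array in turn):

```python
import numpy as np

def single_point_crossover(a, b, rng=np.random.default_rng()):
    """Cut both parent arrays at one random point and swap the tails,
    returning one child per 'version' of the crossover."""
    point = rng.integers(1, a.size)  # cut strictly inside the array
    child1 = np.concatenate([a[:point], b[point:]])
    child2 = np.concatenate([b[:point], a[point:]])
    return child1, child2
```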

I am using what I think is roulette selection to select parents, I will post the exact method below.

The fitness of a snake is calculated as age * 2**score (not anymore; see the update below), where age is how many turns the snake survived, and score is the number of fruits it collected.

Details

Here is some pseudo-code to try to summarise how my genetic algorithm (should) work:

pop = Population(size=1000)

while True:  # Have yet to implement a 'converged' check
    pop.calc_fitness()

    new_pop = []

    for i in range(n_parents):
        parent1 = pop.fitness_based_selection()
        parent2 = pop.fitness_based_selection()

        child_snake1, child_snake2 = parent1.crossover(parent2)

        # Each child independently has a chance to mutate
        for child_snake in (child_snake1, child_snake2):
            if rand() <= mutate_chance:
                child_snake.mutate()
            new_pop.append(child_snake)

    pop.population = new_pop

    print(generation_statistics)
    gen += 1

Here is the method I use to select a parent:

def fitness_based_selection(self):
    """
    A selection process that chooses a snake, where a snake with a higher
    fitness has a higher chance of being selected
    :return: The chosen snake's brain
    """
    sum_fitnesses = sum(snake[1] for snake in self.population)

    # A random cutoff digit.
    r = randint(0, sum_fitnesses)

    current_sum = 0

    for snake in self.population:
        current_sum += snake[1]
        if current_sum > r:
            # Return brain of chosen snake
            return snake[0]

It is worth noting that self.population is a list of snakes, where each snake is a list containing the NeuralNet controlling it, and the fitness the network achieved.

And here is the method for getting an output from a network from a game output, as I suspect there may be something I'm doing wrong here:

def get_output(self, input_array: np.ndarray):
    """
    Get output from input by feed forwarding it through the network

    :param input_array: The input to get an output from, should be an array of the inputs
    :return: an output array with 4 values of the shape 1x4
    """

    # Add biases then multiply by weights, input => h_layer_1,
    # this is done opposite because the input can be zero
    h_layer_1_b = input_array + self.biases_input_hidden1
    h_layer_1_w = np.dot(h_layer_1_b, self.weights_input_hidden1)
    h_layer_1 = self.sigmoid(h_layer_1_w)  # Run the output through a sigmoid function

    # Multiply by weights then add biases, h_layer_1 => h_layer_2
    h_layer_2_w = np.dot(h_layer_1, self.weights_hidden1_hidden2)
    h_layer_2_b = h_layer_2_w + self.biases_hidden1_hidden2
    h_layer_2 = self.sigmoid(h_layer_2_b)

    # Multiply by weights then add biases, h_layer_2 => output
    output_w = np.dot(h_layer_2, self.weights_hidden2_output)
    output_b = output_w + self.biases_hidden2_output

    output = self.sigmoid(output_b)
    return output

When running the neural network manually, with a graphical version of the game enabled, it is clear the network almost never changes direction more than once. This confuses me: I was under the impression that if all the weights and biases are generated randomly, the input would be processed randomly and give a random output. Instead, the output seems to change once on the first turn of the game and then never significantly change again.

When running the genetic algorithm, the highest fitness of each generation barely ever exceeds the fitness one would expect from a snake without input (in this case 16), which I suppose is correlated with the issue in the neural network. When it does exceed 16, the next generations revert back to 16 again.

Any help with this issue would be appreciated immensely; I'm still new to this area, and I'm finding it really interesting. I'll gladly provide more details if need be. My complete code can be found here, if anybody dares delve into that.

Update:

I changed a couple of things:

  • Fixed the weight/bias generation; previously they were only generated between 0 and 1.
  • Edited my crossover method to return two children per set of parents instead of just one.
  • Changed the fitness function to only be equal to the snake's age (For testing purposes)
  • Changed the population variables

Now the algorithm performs better: the first generation usually finds a snake with a fitness of 14-16, meaning the snake does make turns to avoid death. However, it almost always goes downhill from there. The first snake will actually have learned a tactic of turning when near the east and north/south edges, but never the west edge. After the first generation the fitness tends to only get worse, eventually dropping back down to the lowest possible fitness. I'm at a loss as to what is going wrong, but I have a feeling it might be something large that I've overlooked.

Update #2:

Figured I might as well mention some things I tried that didn't work:

  • Changed the nodes per hidden layer from 8 to 16, this made the snakes perform significantly worse.
  • Allowed the snake to turn back into itself, this also made the snakes perform worse.
  • Large population sizes of ~1 000 000 (I think they're large; I'm not sure what a standard pop size is) with ~1000 parents: no positive change.
  • 16 or 32 nodes per hidden layer, had seemingly little to no impact.
  • Fixed the mutate function to properly assign values between -1 and 1, no noticeable impact.

Update #3:

I changed a couple of things and started seeing better results. Firstly I stopped fruits from spawning to simplify the learning process, and instead gave the snakes a fitness that was equal to their age (how many turns/frames they survived), and after turning off normalization of the input array I got a snake with a fitness of 300! 300 is the maximum age a snake can have before dying of old age.

However, the problem still exists: after the first couple of generations the fitness plummets. The first 1-5 generations may have a fitness of 300 (sometimes they don't and have a low fitness instead, but I assume this is down to population size), but after that the fitness of the generations plummets to ~20-30 and stays there.

Also, if I turn fruits back on, the snakes get abysmal fitnesses again. Sometimes the first generation will produce a snake capable of moving in loops, and therefore getting a fitness of 300 without picking up any fruits, but this is almost never transferred to the next generation.

Ben asked May 12 '18 19:05


2 Answers

I noticed that in your pseudocode, upon creating each new generation, the parent generation is completely wiped out and only the child generation is retained. This can naturally lead to lowering fitness levels, since there is nothing guaranteeing that the offspring will have fitness levels comparable to those in the parent generation. In order to ensure that the fitness levels are non-decreasing, you must either merge the parent and child generations and prune the weakest members (which I would recommend), or you can require that the offspring-generating function produces offspring at least as fit as the parents (by repeated trial and error).
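The merge-and-prune step could look like this, assuming the population is stored as (brain, fitness) pairs as described in the question (the function name is my own):

```python
def next_generation(parents, children, pop_size):
    """Merge parent and child generations and keep only the fittest
    pop_size members. Because the parents compete too, the best
    fitness in the population can never decrease."""
    combined = parents + children
    combined.sort(key=lambda snake: snake[1], reverse=True)  # fitness descending
    return combined[:pop_size]
```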


If you decide to focus on the offspring-generator, one way to (somewhat) guarantee improved offspring is to implement asexual reproduction by way of simply adding a small amount of noise to each weight vector. If the noise level is small enough, you can generate improved offspring with a success rate of up to 50%. Bigger noise levels, however, allow faster improvement and they help to jump out of local optima, even if they have success rates lower than 50%.
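A sketch of that kind of asexual reproduction for one flat weight array (`sigma` is an assumed noise level, not a value from the question):

```python
import numpy as np

def asexual_offspring(weights, sigma=0.01, rng=np.random.default_rng()):
    """Return a copy of the parent weights with small Gaussian noise
    added to every entry."""
    return weights + rng.normal(0.0, sigma, size=weights.shape)
```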

answered Oct 11 '22 13:10

You are only mutating 5% of the population, not 5% of the genome. This means your population will become fixed incredibly quickly - https://en.wikipedia.org/wiki/Fixation_(population_genetics) .

This explains why the population isn't doing very well: you are only exploring a very small area of the fitness landscape ( https://en.wikipedia.org/wiki/Fitness_landscape ).

You should change the mutate function to mutate 5% of the genome, i.e. the weights between nodes. Feel free to play around with the mutation rate as well; different problems perform better at different mutation rates.
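Per-gene mutation along these lines could look as follows: each individual weight independently has a `rate` chance of being redrawn from [-1, 1), matching the question's initialization range (the function name is my own):

```python
import numpy as np

def mutate_genome(weights, rate=0.05, rng=np.random.default_rng()):
    """Mutate ~5% of the genes (individual weights), not 5% of the snakes."""
    mask = rng.random(weights.shape) < rate   # True for genes to mutate
    mutated = weights.copy()
    mutated[mask] = rng.uniform(-1, 1, size=int(mask.sum()))
    return mutated
```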

If you are worried about losing the current best genome, a typical approach in evolutionary computation is elitism: copy the individual with the highest fitness over to the next generation without mutation.

(Sorry, this probably should have been a comment but I don't have enough reputation).

dzs757 answered Oct 11 '22 14:10