Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Genetic Algorithm, large population vs small one

Im wondering if there is a general rule of thumb for population sizing. Ive read in a book that 2x the chromosome length is a good starting point. Am i correct in assuming then that if i had an equation with 5 variables, i should have a population of 10?

Im also wondering if the following is correct:

Larger Population Size.

Pros: Larger diversity so more likely to pick up on traits which return a good fitness.

Cons: Requires longer to process.

vs

Smaller Population Size.

Pros: Larger number of generations experienced per unit time.

Cons: Mutation will have to be more prominent in order to compensate for smaller population??

EDIT

A little additional info, say i have an equation which has 5 unknown parameters. For each parameter i have anywhere between 10-50 values i would like to try assign to each of these variables. So for example

variable1 = 20 different values variable2 = 15 different values ...

I thought a GA would be a decent approach to such a problem as the search space is quite large, ie worst case for the above would be 312,500,000 permutations (unless i have screwed up?) n!/(n-k)! where n = 50 and k = 1 => 50 * 50 * 50 * 50 * 50

unfortunately the number of parameters/range of values to check can vary alot so i was looking for some sort of rule of thumb as to how large i should set the population.

Thanks for ur help + if there is any more info you need/prefer to discuss in one of the chatrooms, just give me a shout.

like image 259
Hans Rudel Avatar asked Aug 07 '13 20:08

Hans Rudel


People also ask

What is a good population size for genetic algorithm?

A population size of 300 was found to be optimal for the convergence of the Genetic Algorithm-based program to the optimal solution.

Why population is necessary in genetic algorithm?

In Genetic Algorithm, the population size is an important parameter which directly influences the ability to search an optimum solution in the search space. Many researchers have revealed that having a large number of population leads to the accuracy of getting an optimal solution.

How many generations should a genetic algorithm have?

The number of generations is usually in the 100 to 10,000 range.

Are multiple runs of genetic algorithms better than one?

Mul- tiple runs would perform better than a single one if the quality degradation is not too pronounced. In fact, the tradeoff suggests that there is an optimal number of runs and population size that maximize the expected quality.


1 Answers

I'm not sure where you read that 2x the chromosome length is a good starting point, but I'm guessing it's a book that concentrated on larger problems.

If you only have five variables, a genetic algorithm is probably not the right choice for converging upon a solution. With a chromosome length of five you're probably going to find that you very quickly reach a non-deterministic(this will change in subsequent runs) local minimum and slowly iterate around that space until you find the true local minimum.

However, if you are insistent on using a GA I would suggest abandoning that rule of thumb for this problem and really think about starting population as a measure of how far from the final solution you expect a random solution to be.

The reason that many rule of thumbs is dependent on chromosome length is because that's a decent proxy for this, if I have a hundred variables, and given randomly generating dna sequence is going to be further from ideal than if I had only one variable.

Additionally, if you're worried about computation intensity I'm going to go ahead and say that it shouldn't be an issue since you're dealing with such a small solution set. I think a better rule of thumb for smaller sets like this would be along the lines of:

(ln(chromosome_length*(solution_space/granularity)/mutation_rate))^2

Probably with a constant thrown in to scale for the particular problem.

It's definitely not a great rule of thumb (no rule is) but here's my logic for it:

  • Chromosome length is just a proxy for size of solution space, so taking into account the size of the solution space will necessarily increase the accuracy of this proxy
  • A smaller mutation rate necessitates a larger population size to compensate for the fact that you are more prone to get caught in local minima
  • Any rule of thumb should scale logarithmically since a genetic algorithm is akin to a tree search of your solution space.
  • The squared term was mostly the result of trying this out, but it looks like the logarithmic scaling was a little aggressive, though the general shape seemed right.

However I think a better choice would be to start at a reasonable number (100) and try iterating up and down until you find a population size that seems to balance accuracy with execution speed.

like image 67
Slater Victoroff Avatar answered Sep 22 '22 12:09

Slater Victoroff