Genetic Algorithm, large population vs small one

Tags:

Im wondering if there is a general rule of thumb for population sizing. Ive read in a book that 2x the chromosome length is a good starting point. Am i correct in assuming then that if i had an equation with 5 variables, i should have a population of 10?

Im also wondering if the following is correct:

Larger Population Size.

Pros: Larger diversity so more likely to pick up on traits which return a good fitness.

Cons: Requires longer to process.

Smaller Population Size.

Pros: Larger number of generations experienced per unit time.

Cons: Mutation will have to be more prominent in order to compensate for smaller population??

EDIT

A little additional info, say i have an equation which has 5 unknown parameters. For each parameter i have anywhere between 10-50 values i would like to try assign to each of these variables. So for example

variable1 = 20 different values variable2 = 15 different values ...

I thought a GA would be a decent approach to such a problem as the search space is quite large, ie worst case for the above would be 312,500,000 permutations (unless i have screwed up?) n!/(n-k)! where n = 50 and k = 1 => 50 * 50 * 50 * 50 * 50

unfortunately the number of parameters/range of values to check can vary alot so i was looking for some sort of rule of thumb as to how large i should set the population.

Thanks for ur help + if there is any more info you need/prefer to discuss in one of the chatrooms, just give me a shout.

259

asked Aug 07 '13 20:08

Hans Rudel

1 Answers

I'm not sure where you read that 2x the chromosome length is a good starting point, but I'm guessing it's a book that concentrated on larger problems.

If you only have five variables, a genetic algorithm is probably not the right choice for converging upon a solution. With a chromosome length of five you're probably going to find that you very quickly reach a non-deterministic(this will change in subsequent runs) local minimum and slowly iterate around that space until you find the true local minimum.

However, if you are insistent on using a GA I would suggest abandoning that rule of thumb for this problem and really think about starting population as a measure of how far from the final solution you expect a random solution to be.

The reason that many rule of thumbs is dependent on chromosome length is because that's a decent proxy for this, if I have a hundred variables, and given randomly generating dna sequence is going to be further from ideal than if I had only one variable.

Additionally, if you're worried about computation intensity I'm going to go ahead and say that it shouldn't be an issue since you're dealing with such a small solution set. I think a better rule of thumb for smaller sets like this would be along the lines of:

(ln(chromosome_length*(solution_space/granularity)/mutation_rate))^2

Probably with a constant thrown in to scale for the particular problem.

It's definitely not a great rule of thumb (no rule is) but here's my logic for it:

Chromosome length is just a proxy for size of solution space, so taking into account the size of the solution space will necessarily increase the accuracy of this proxy
A smaller mutation rate necessitates a larger population size to compensate for the fact that you are more prone to get caught in local minima
Any rule of thumb should scale logarithmically since a genetic algorithm is akin to a tree search of your solution space.
The squared term was mostly the result of trying this out, but it looks like the logarithmic scaling was a little aggressive, though the general shape seemed right.

However I think a better choice would be to start at a reasonable number (100) and try iterating up and down until you find a population size that seems to balance accuracy with execution speed.

answered Sep 22 '22 12:09

Slater Victoroff

Related questions
                            
                                how to run a pre-trained model in AWS sagemaker?
                            
                                What is the default batch size of pytorch SGD?
                            
                                Google colab pro GPU running extremely slow
                            
                                How can a genetic algorithm optimize a neural network's weights without knowing the search volume?
                            
                                When to use @tf.function decorator and when not? I know tf.function builds graph. But how to know when to build graphs?
                            
                                Can we make the ML model (pickle file) more robust, by accepting (or ignoring) new features?
                            
                                Python NEAT not learning further after a certain point
                            
                                How to rotate an image to align the text for extraction?
                            
                                What is the meaning of 'for _ in range() [duplicate]
                            
                                Calculating Nearest Match to Mean/Stddev Pair With LibSVM
                            
                                U-matrix and self organizing maps
                            
                                Is there a viable handwriting recognition library / program? [closed]
                            
                                Interpreting coefficient names in glmnet in R
                            
                                WEKA classification likelihood of the classes
                            
                                Combining Weak Learners into a Strong Classifier
                            
                                Classification with naiveBayes (e1071) does not work ($levels returns NULL)
                            
                                How to perform 10 fold cross validation with LibSVM in R?
                            
                                Python Scikit Random Forest Regressor Error
                            
                                How to implement a better sliding window algorithm?
                            
                                Correct ratio of positive to negative training examples for training a random forest-based binary classifier

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Genetic Algorithm, large population vs small one

Tags:

artificial-intelligence

machine-learning

genetic-algorithm

Hans Rudel

People also ask

1 Answers

Slater Victoroff

Recent Activity

Donate For Us