A few of my lab mates have been playing around with cross-entropy reinforcement learning. From everything I can gather from them and from quick internet searches, the cross-entropy method seems nearly identical to genetic algorithms. Can someone explain what the real difference between these two techniques is, if one actually exists?
In this context, cross-entropy is one particular form of a genetic algorithm. It's a much more specific term than "genetic algorithms", which covers a huge number of different algorithms.
Put simply:
Genetic Algorithms are a family of algorithms / one type of approach to optimization.
Cross-entropy is a specific genetic algorithm.
Both methods work with a population that is improved over many generations. The key difference is how the population is represented.
Genetic Algorithms (GAs) work on individuals of the population, e.g. through mutation. You can enumerate the ancestors of each individual.
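As a toy illustration, here is a minimal sketch of a mutation-only GA (my own assumptions: numpy is available, a made-up quadratic objective stands in for the real fitness, and real GAs would add crossover and more elaborate selection):

```python
# Minimal mutation-only GA sketch (illustrative assumptions, not a full GA).
import numpy as np

def mutation_ga(objective, dim, pop_size=100, generations=200, sigma=0.1):
    pop = np.random.randn(pop_size, dim)  # initial population of individuals
    for _ in range(generations):
        scores = np.array([objective(x) for x in pop])
        # Keep the better half as parents (minimization).
        parents = pop[np.argsort(scores)[: pop_size // 2]]
        # Each child is one parent plus Gaussian mutation noise,
        # so every individual's ancestry is traceable.
        children = parents + sigma * np.random.randn(*parents.shape)
        pop = np.vstack([parents, children])
    return pop[np.argmin([objective(x) for x in pop])]

# Hypothetical usage: minimize a shifted quadratic.
best = mutation_ga(lambda x: np.sum((x - 3.0) ** 2), dim=5)
print(best)  # should be close to [3, 3, 3, 3, 3]
```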
The Cross-Entropy Method (CEM) represents the population as a probability distribution. Individuals are drawn from this distribution. The distribution parameters are re-estimated from the best e.g. 2% and the other 98% are discarded.
Technically, "the best 2%" is also a probability distribution. You could represent it by drawing a very large sample from it, but that is expensive. So you try to approximate "the 2% distribution" with your simple distribution. The cross-entropy measures the difference between two distributions, and that is what you want to minimize. Often this is simpler than it sounds: if your distribution is a Gaussian, you can just estimate a new mean and (co-)variance from the best 2%.
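In the Gaussian case the whole method fits in a few lines. Here is a minimal sketch (my own assumptions: numpy, a diagonal Gaussian, and a made-up quadratic objective in place of the real fitness):

```python
# Minimal cross-entropy method sketch with a diagonal Gaussian.
import numpy as np

def cem(objective, dim, pop_size=1000, elite_frac=0.02, generations=50):
    mean, std = np.zeros(dim), np.ones(dim)  # distribution over individuals
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(generations):
        # Draw the whole population from the current distribution.
        pop = np.random.randn(pop_size, dim) * std + mean
        scores = np.array([objective(x) for x in pop])
        # Keep the best 2% (the "elite"), discard the other 98%.
        elite = pop[np.argsort(scores)[:n_elite]]  # minimization
        # Re-estimate mean and variance from the elite alone; for a
        # Gaussian this is the cross-entropy-minimizing fit to the elite.
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-8
    return mean

# Hypothetical usage: minimize a shifted quadratic.
best = cem(lambda x: np.sum((x - 3.0) ** 2), dim=5)
print(best)  # should be close to [3, 3, 3, 3, 3]
```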
Practical considerations:
The CEM requires you to come up with a probability distribution over individuals. Most GAs require such a distribution too, but only to generate the initial population; on top of that they need parameters like mutation strength.
The CEM is simple to implement and has few parameters. It is a great baseline algorithm, and occasionally it beats methods that are much more sophisticated. However, CMA-ES is a better baseline for continuous problems with up to a few hundred parameters, because of its strong track record.
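For comparison, a CMA-ES baseline is only a few lines if you use the third-party pycma package (pip install cma; using it here is my assumption, any CMA-ES implementation works):

```python
# Sketch of a CMA-ES baseline via pycma; cma.ff.rosen is the
# Rosenbrock test function that ships with the package.
import cma

es = cma.CMAEvolutionStrategy(8 * [0.0], 0.5)  # start point, initial step size
es.optimize(cma.ff.rosen)                      # run until default stop criteria
print(es.result.xbest)                         # best solution found
```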
Estimating parameters from just 2% of the population requires a very large population size: to keep even 20 elite samples at a 2% elite fraction, you have to evaluate 1,000 individuals per generation. Throwing away 98% of the information is wasteful. On the other hand, it prevents the CEM from stepping "sideways" and getting distracted by sub-optimal solutions.
GAs can be much fancier and exist in many problem-specific variations. The CEM can be adapted to a problem by choosing a clever distribution, which works great for some discrete problems. Generally I'd say using a GA is one step up from the CEM, both in complexity (harder to get it working right) and in terms of potential performance (more opportunities to adapt its operators to the problem).
References:
The Cross-Entropy Method for Fast Policy Search (Mannor et al, 2003)
Learning Tetris Using the Noisy Cross-Entropy Method (Szita and Lőrincz, 2006)
The Cross-Entropy Method for Optimization (Botev et al., 2013)