Something pretty annoying in evolutionary computing is that mildly different and overlapping concepts tend to pick dramatically different names. My latest confusion because of this is that gene-expression-programming seems very similar to cartesian-genetic-programming.
Well, it seems that there is some difference between gene expression programming (GEP) and cartesian genetic programming (CGP or what I view as classic genetic programming), but the difference might be more hyped up than it really ought to be. Please note that I have never used GEP, so all of my comments are based on my experience with CGP.
In CGP there is no distinction between genotype and a phenotype, in other words- if you're looking at the "genes" of a CGP you're also looking at their expression. There is no encoding here, i.e. the expression tree is the gene itself.
In GEP the genotype is expressed into a phenotype, so if you're looking at the genes you will not readily know what the expression is going to look like. The "inventor" of GP, Cândida Ferreira, has written a really good paper and there are some other resources which try to give a shorter overview of the whole concept.
Ferriera says that the benefits are "obvious," but I really don't see anything that would necessarily make GEP better than CGP. Apparently GEP is multigenic, which means that multiple genes are involved in the expression of a trait (i.e. an expression tree). In any case, the fitness is calculated on the expressed tree, so it doesn't seem like GEP is doing anything to increase the fitness. What the author claims is that GEP increases the speed at which the fitness is reached (i.e. in fewer generations), but frankly speaking you can see dramatic performance shifts from a CGP just by having a different selection algorithm, a different tournament structure, splitting the population into tribes, migrating specimens between tribes, including diversity into the fitness, etc.
Selection:
Tournament Frequency:
Tournament Structure:
Tribes
A population can be split into tribes that evolve independently of each-other:
Diversity Fitness
Incorporate diversity into the fitness, where you count how many individuals have the same fitness value (thus are likely to have the same phenotype) and you penalize their fitness by a proportionate value: the more individuals with the same fitness value, the more penalty for those individuals. This way specimens with unique phenotypes will be encouraged, therefore there will be much less stagnation of the population.
Those are just some of the things that can greatly affect the performance of a CGP, and when I say greatly I mean that it's in the same order or greater than Ferriera's performance. So if Ferriera didn't tinker with those ideas too much, then she could have seen much slower performance of the CGPs... especially if she didn't do anything to combat stagnation. So I would be careful when reading performance statistics on GEP, because sometimes people fail to account for all of the "optimizations" available out there.
There seems to be some confusion in these answers that must be clarified. Cartesian GP is different from classic GP (aka tree-based GP), and GEP. Even though they share many concepts and take inspiration from the same biological mechanisms, the representation of the individuals (the solutions) varies.
In CGPthe representation (mapping between genotype and phenotype) is indirect, in other words, not all of the genes in a CGP genome will be expressed in the phenome (a concept also found in GEP and many others). The genotypes can be coded in a grid or array of nodes, and the resulting program graph is the expression of active nodes only.
In GEP the representation is also indirect, and similarly not all genes will be expressed in the phenotype. The representation in this case is much different from treeGP or CGP, but the genotypes are also expressed into a program tree. In my opinion GEP is a more elegant representation, easier to implement, but also suffers from some defects like: you have to find the appropriate tail and head size which is problem specific, the mnltigenic version is a bit of a forced glue between expression trees, and finally it has too much bloat.
Independently of which representation may be better than the other in some specific problem domain, they are general purpose, can be applied to any domain as long as you can encode it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With