Gene Expression Programming was invented by myself in 1999 (Ferreira
2001), and incorporates both the simple, linear chromosomes of
fixed length similar to the ones used in Genetic Algorithms and the
ramified structures of different sizes and shapes similar to the
parse trees of Genetic Programming. And since the ramified
structures of different sizes and shapes are totally encoded in the
linear chromosomes of fixed length, this is equivalent to say that,
in GEP, the genotype and phenotype are finally separated from one
another and the system can now benefit from all the evolutionary
advantages this brings about.
Thus, the phenotype of GEP consists of the same kind of ramified
structure used in GP. But the ramified structures evolved by GEP
(called expression trees) are the expression of a totally autonomous
genome. Therefore, with GEP, a remarkable thing happened: the second
evolutionary threshold – the phenotype threshold – was crossed (Dawkins 1995).
And this means that only the genome (slightly modified) is passed on
to the next generation. Consequently, one no longer needs to
replicate and mutate rather cumbersome structures as all the
modifications take place in a simple linear structure which only
later will grow into an expression tree.
The fundamental steps of Gene Expression Programming are
schematically represented in
Figure 5. The process begins
with the random generation of the chromosomes of a certain number of
individuals (the initial population). Then these chromosomes are
expressed and the fitness of each individual is evaluated against a
set of fitness cases (also called selection environment). The
individuals are then selected according to their fitness (their
performance in that particular environment) to reproduce with
modification, leaving progeny with new traits. These new individuals
are, in their turn, subjected to the same developmental process:
expression of the genomes, confrontation of the selection
environment, selection according to fitness, and reproduction with
modification. The process is repeated for a certain number of
generations or until a good solution has been found.
Figure 5. The flowchart of Gene Expression Programming.
So, the pivotal insight of Gene Expression Programming consisted
in the invention of chromosomes capable of representing any parse
tree. For that purpose a new language – Karva language – was created
in order to read and express the information encoded in the
chromosomes. The details of this new language are given in the
next section.
Furthermore, the structure of the chromosomes was designed in order
to allow the creation of multiple genes, each coding for a smaller
program or sub-expression tree. It is worth emphasizing that Gene
Expression Programming is the only genetic algorithm with multiple
genes. Indeed, in truly functional genotype/phenotype systems, the
creation of more complex individuals composed of multiple genes is a
child’s play, and illustrates quite well the great versatility of
the GEP system. In fact, after their inception, these systems seem
to catapult themselves into higher levels of complexity such as the
uni- and multicellular systems, where different cells put together
different combinations of genes (Ferreira 2002a).
We will see later in this chapter how the cellular system of GEP is
an extremely elegant way of implementing Automatically Defined
Functions that may be reused by the created programs.
The basis for all this novelty resides on the simple, yet
revolutionary structure of GEP genes. This structure not only allows
the encoding of any conceivable program but also allows an efficient
evolution. This versatile structural organization also allows the
implementation of a very powerful set of genetic operators which can
then very efficiently search the solution space. As in nature, the
search operators of GEP always generate valid structures and
therefore are remarkably suited to creating genetic diversity.
|