GEP Book

  Home
  News
  Author
  Q&A
  Tutorials
  Downloads
  GEP Biblio
  Contacts

  Visit Gepsoft

 

C. FERREIRA In J. M. Benitez, O. Cordon, F. Hoffmann, and R. Roy, eds., Advances in Soft Computing: Engineering Design and Manufacturing, pages 257-266, Springer-Verlag, 2003.

Function Finding and the Creation of Numerical Constants in Gene Expression Programming

Introduction
 
Genetic programming (GP) evolves computer programs by genetically modifying nonlinear entities with different sizes and shapes (Koza 1992). These nonlinear entities can be represented as diagrams or trees. Gene expression programming (GEP) is an extension to GP that also evolves computer programs of different sizes and shapes, but the programs are encoded in a linear chromosome of fixed length (Ferreira 2001). One strength of the GEP approach is that the creation of genetic diversity is extremely simplified as genetic operators work at the chromosome level. Indeed, due to the structural organization of GEP chromosomes, the implementation of high-performing search operators is extremely simplified, as any modification made in the genome always results in valid programs. Another strength of GEP consists of its unique, multigenic nature which allows the evolution of complex programs composed of several simpler sub-programs.

It is assumed that the creation of floating-point constants is necessary to do symbolic regression in general (see, e.g., Banzhaf 1994 and Koza 1992). Genetic programming solved the problem of constant creation by using a special terminal named “ephemeral random constant” (Koza 1992). For each ephemeral random constant used in the trees of the initial population, a random number of a special data type in a specified range is generated. Then these random constants are moved around from tree to tree by the crossover operator.

Gene expression programming solves the problem of constant creation differently (Ferreira 2001). GEP uses an extra terminal “?” and an extra domain Dc composed of the symbols chosen to represent the random constants. For each gene, the random constants are generated during the inception of the initial population and kept in an array. The values of each random constant are only assigned during gene expression. Furthermore, a special operator is used to introduce genetic variation in the available pool of random constants by mutating the random constants directly. In addition, the usual operators of GEP plus a Dc specific transposition guarantee the effective circulation of the numerical constants in the population. Indeed, with this scheme of constants manipulation, the appropriate diversity of numerical constants can be generated at the beginning of a run and maintained easily afterwards by the genetic operators.

Notwithstanding, in this work it is shown that evolutionary algorithms do symbolic regression more efficiently if the problem of constant creation is handled by the algorithm itself. In other words, the special facilities for manipulating random constants are indeed unnecessary to solve problems of symbolic regression.

Home | Contents | Previous | Next