In this section, we analyze and compare two different approaches to the problem of constant creation in symbolic regression. The first manipulates random constants explicitly; the second solves the problem by creating numerical constants from scratch or by inventing new ways of representing them.
It is assumed that the creation of floating-point constants is necessary to do symbolic regression in evolutionary computation (see, e.g., Koza 1992 and Banzhaf 1994). Genetic programming solves the problem of constant creation by using a special terminal named "ephemeral random constant" (Koza 1992). For each ephemeral random constant used in the trees of the initial population, a random number of a special data type in a specified range is generated. Then these random constants are moved around from tree to tree by the crossover operator.
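The ephemeral random constant mechanism can be illustrated with a minimal sketch (the function names, constant range, and tree representation below are illustrative assumptions, not Koza's original implementation): the special terminal is replaced by a concrete random value once, at tree creation, and that value then remains fixed.

```python
import random

# Hedged sketch of ephemeral random constants (ERCs).
# CONST_RANGE and the terminal symbol "R" are assumptions for illustration.
CONST_RANGE = (-1.0, 1.0)

def instantiate_terminal(symbol):
    """Replace the ephemeral terminal 'R' with a concrete constant,
    generated once when the tree is built."""
    if symbol == "R":
        return round(random.uniform(*CONST_RANGE), 3)
    return symbol  # ordinary terminals (e.g., the variable 'x') pass through

def random_tree(depth, functions=("+", "*"), terminals=("x", "R")):
    """Grow a small expression tree; any ERCs it contains are fixed
    at creation and would later circulate via crossover, not mutation."""
    if depth == 0:
        return instantiate_terminal(random.choice(terminals))
    return (random.choice(functions),
            random_tree(depth - 1, functions, terminals),
            random_tree(depth - 1, functions, terminals))
```

Once instantiated, a constant's value never changes; new constant values enter the population only through the constants already present in other trees, recombined by crossover.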
Gene expression programming solves the problem of constant creation differently (Ferreira 2001). GEP uses an extra terminal "?" and an extra domain Dc composed of the symbols chosen to represent the random constants. For each gene, the random constants are generated during the creation of the initial population and kept in an array; the values of the random constants are assigned only during gene expression. Furthermore, a special operator introduces genetic variation into the available pool of random constants by mutating them directly. In addition, the usual operators of GEP, plus a Dc-specific transposition, guarantee the effective circulation of the random constants in the population. Indeed, with this scheme of constant manipulation, an appropriate diversity of random constants can be generated at the beginning of a run and easily maintained afterwards by the genetic operators.
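The Dc scheme can be sketched as follows (a hedged illustration, not the reference GEP implementation; the gene layout, symbol set, and function names are assumptions): each gene carries a Dc region whose symbols index a per-gene array of random constants, and the "?" terminals are bound to values only when the gene is expressed.

```python
import random

# Hedged sketch of GEP's Dc domain for random constants.
# N_CONSTANTS and CONST_RANGE are illustrative assumptions.
N_CONSTANTS = 10
CONST_RANGE = (0.0, 10.0)

def make_gene(coding, dc_length):
    """Create a gene: a coding region (may contain '?'), a Dc region of
    index symbols, and an array of random constants kept with the gene."""
    dc = [random.randrange(N_CONSTANTS) for _ in range(dc_length)]
    constants = [random.uniform(*CONST_RANGE) for _ in range(N_CONSTANTS)]
    return {"coding": coding, "Dc": dc, "constants": constants}

def express(gene):
    """During gene expression, each '?' terminal is replaced by the
    constant indexed by the next Dc symbol, read left to right."""
    dc_iter = iter(gene["Dc"])
    out = []
    for sym in gene["coding"]:
        out.append(gene["constants"][next(dc_iter)] if sym == "?" else sym)
    return out

def mutate_constant(gene):
    """The special operator: mutate one random constant directly,
    injecting fresh values into the gene's constant pool."""
    i = random.randrange(N_CONSTANTS)
    gene["constants"][i] = random.uniform(*CONST_RANGE)
```

Note the separation of concerns: ordinary GEP operators and Dc-specific transposition shuffle the index symbols, while direct constant mutation refreshes the values themselves.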
Nonetheless, we will see that evolutionary algorithms perform symbolic regression more efficiently when the problem of constant creation is handled by the algorithm itself; in other words, special facilities for manipulating random constants are unnecessary for solving most symbolic regression problems. Random constants remain fundamental, however, for parameter optimization and for evolving more complex structures such as GEP-designed neural networks and decision trees with numerical attributes.