GEP Book

  Home
  News
  Author
  Q&A
  Tutorials
  Downloads
  GEP Biblio
  Contacts

  Visit Gepsoft

 

C. FERREIRA Invited Tutorial Presented at WSC6, 2001

Gene Expression Programming in Problem Solving

Genetic Programming
 

Genetic programming, invented by Cramer in 1985 (Cramer 1985) and further developed by Koza (1992), solved the problem of fixed length solutions by creating non-linear entities with different sizes and shapes. The alphabet used to create these entities was also more varied, creating a richer, more versatile system of representation. However, the created individuals lacked a simple, autonomous genome, functioning simultaneously both as genome and phenome. Again, in the jargon of evolutionary theory, the entities of GP are simple replicators that survive by virtue of their own properties. The non-linear entities (parse trees) of GP resemble protein molecules in their use of a richer alphabet and in their complex, hierarchical representation. Thus, GP entities are capable of exhibiting a great variety of functionalities. But these entities are very difficult to reproduce with modification because the genetic modifications are done directly on the parse tree itself. Consequently, most modifications generate structural impossibilities. As a comparison, it is worth noticing that, in nature, the expression of any protein gene results always in a valid protein structure (in nature, there is no such thing as a structurally incorrect protein).

So, in GP, the genetic operators act directly on the parse tree and, although at first sight this might appear advantageous, it greatly limits this technique (it is impossible to make an orange tree produce mangos only by grafting and pruning). Furthermore, the pallet of genetic operators available to GP is very limited, because most of them would result in invalid parse trees. Consequently, GP uses almost exclusively a special kind of recombination that operates at the level of parse trees. In this GP-specific crossover, selected branches are exchanged between two parent parse trees to create offspring. The idea behind its implementation was to exchange smaller, mathematically concise blocks in order to evolve more complex, hierarchical solutions composed of smaller building blocks.

The mutation operator in GP also differs from point mutations in nature in order to guarantee the creation of syntactically correct programs. The mutation operator selects a node in the parse tree and replaces the branch beneath that node by a randomly generated branch. Again, the overall shape of the tree is not greatly changed by this kind of mutation.

Permutation is the third operator used in GP and, like recombination and mutation, is greatly constrained: two structurally equivalent nodes (two terminals or two functions with the same number of arguments) are chosen and their positions are exchanged. In this case the overall shape of the tree remains unchanged.

Although J. Koza described these three operators as the basic GP operators, crossover is practically the only genetic operator used in most GP implementations. Not surprisingly, in GP, huge populations of parse trees are used with the aim of creating all the necessary building blocks with the inception of the initial population in order to guarantee the discovery of a solution only by moving these initial building blocks around.

Finally, due to the dual function of the parse trees (genome and phenome), and like GAs, GP is incapable of a simple, rudimentary expression: in all cases, the entire parse tree is the solution.

Home | Contents | Previous | Next