
C. FERREIRA Invited Tutorial Presented at WSC6, 2001

Gene Expression Programming in Problem Solving

GEP Genes
 

GEP genes are composed of a head and a tail. The head contains symbols that represent both functions and terminals, whereas the tail contains only terminals. For each problem, the length of the head h is chosen, whereas the length of the tail t is a function of h and of n, the number of arguments of the function with the most arguments, and is given by the equation:

t = h (n-1) + 1

(2.4)
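
As a quick check of equation 2.4, the tail length can be computed directly. The function name `tail_length` and the sample values below are illustrative assumptions, not part of the original tutorial.

```python
def tail_length(head_length: int, max_arity: int) -> int:
    """Tail length t = h (n - 1) + 1, following equation 2.4."""
    return head_length * (max_arity - 1) + 1


# Sample values chosen for illustration: h = 15, n = 2.
h, n = 15, 2
t = tail_length(h, n)
print("tail length:", t)
print("gene length:", h + t)
```

With this tail length, the gene always holds enough terminals to complete the tree even when every head position is a function taking n arguments.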

Consider a gene for which the set of functions is F = {Q, *, /, -, +} and the set of terminals is T = {a, b}. In this case, n = 2; and if we choose h = 15, then t = 15 (2 - 1) + 1 = 16. Thus, the length of the gene g is 15 + 16 = 31. One such gene is shown below (the tail occupies positions 15 to 30):

0123456789012345678901234567890

/aQ/b*ab/Qa*b*-ababaababbabbbba

(2.5)

It codes for the following ET:

In this case, the ORF ends at position 7, whereas the gene ends at position 30.
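
The position at which the ORF ends can be found by counting how many symbols are still needed to complete the tree. The following is a minimal sketch; the name `orf_end` is hypothetical, and the arity table assumes that Q takes one argument and the other four functions take two, which is consistent with n = 2 and with the ORF of gene 2.5 ending at position 7.

```python
# Assumed arities: Q is unary, the arithmetic operators are binary.
# Terminals (a, b) are absent from the table and count as arity 0.
ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}


def orf_end(gene: str) -> int:
    """Return the position of the last symbol that belongs to the ORF."""
    needed = 1  # the root is the only symbol required at the start
    for index, symbol in enumerate(gene):
        # Each symbol fills one open slot and opens as many as its arity.
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return index
    raise ValueError("expression tree not closed within the gene")


gene_2_5 = "/aQ/b*ab/Qa*b*-ababaababbabbbba"
print(orf_end(gene_2_5))       # 7
print(len(gene_2_5) - 1)       # 30, the last position of the gene
```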

Suppose now a mutation occurred at position 2, changing the ‘Q’ into ‘+’. Then the following gene is obtained:

0123456789012345678901234567890

/a+/b*ab/Qa*b*-ababaababbabbbba

(2.6)

And its expression gives:

In this case, the termination point shifts 10 positions to the right (from position 7 to position 17).
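
The shape of an ET can be recovered by reading the gene level by level: the root occupies the first level, and each following level holds as many symbols as the previous level requires as arguments. The sketch below assumes that breadth-first reading and the same arities as in the worked examples (Q unary, the other functions binary); `et_levels` is a hypothetical helper name.

```python
# Assumed arities: Q is unary, the arithmetic operators are binary.
ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}


def et_levels(gene: str) -> list[str]:
    """Split the ORF of a gene into the successive levels of its ET."""
    levels = []
    position, width = 0, 1  # the first level holds only the root
    while width > 0:
        level = gene[position:position + width]
        levels.append(level)
        position += width
        # The next level holds one symbol per argument required here.
        width = sum(ARITY.get(symbol, 0) for symbol in level)
    return levels


gene_2_6 = "/a+/b*ab/Qa*b*-ababaababbabbbba"
for depth, level in enumerate(et_levels(gene_2_6)):
    print(depth, " ".join(level))
```

The total number of symbols over all levels equals the ORF length, so for gene 2.6 the levels together cover positions 0 to 17.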

The opposite can also happen, shortening the ORF. For example, consider again gene 2.5 above, and suppose a mutation occurred at position 5, changing the ‘*’ into ‘b’:

0123456789012345678901234567890

/aQ/bbab/Qa*b*-ababaababbabbbba

(2.7)

Its expression results in the following ET:

In this case, the ORF ends at position 5, shortening the parental ET by two nodes.
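
The effect of the two point mutations on the ORF can be reproduced with a few lines of code. This is a sketch under the same assumptions as the worked examples (Q unary, the other functions binary); `point_mutate` and `orf_end` are hypothetical helper names.

```python
# Assumed arities: Q is unary, the arithmetic operators are binary.
ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}


def orf_end(gene: str) -> int:
    """Return the position of the last symbol that belongs to the ORF."""
    needed = 1
    for index, symbol in enumerate(gene):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return index
    raise ValueError("expression tree not closed within the gene")


def point_mutate(gene: str, position: int, symbol: str) -> str:
    """Return a copy of the gene with one position replaced."""
    return gene[:position] + symbol + gene[position + 1:]


gene_2_5 = "/aQ/b*ab/Qa*b*-ababaababbabbbba"
gene_2_6 = point_mutate(gene_2_5, 2, "+")  # 'Q' becomes '+'
gene_2_7 = point_mutate(gene_2_5, 5, "b")  # '*' becomes 'b'

for name, gene in [("2.5", gene_2_5), ("2.6", gene_2_6), ("2.7", gene_2_7)]:
    print(name, gene, "ORF ends at", orf_end(gene))
```

The gene length stays at 31 in all three cases; only the portion that is expressed changes.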

So, despite its fixed length, each gene has the potential to code for ETs of different sizes and shapes: the simplest is composed of only one node (when the first element of the gene is a terminal), and the largest is composed of as many nodes as the length of the gene (when all the elements of the head are functions with the maximum number of arguments).
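
The two extremes can be checked numerically. The genes below are made up for illustration (h = 15, t = 16, functions assumed binary except the unary Q), and `et_size` is a hypothetical helper that counts the nodes of the expressed tree.

```python
# Assumed arities: Q is unary, the arithmetic operators are binary.
ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}


def et_size(gene: str) -> int:
    """Number of nodes in the ET, i.e. the length of the ORF."""
    needed = 1
    for index, symbol in enumerate(gene):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return index + 1
    raise ValueError("expression tree not closed within the gene")


# Illustrative genes with a head of 15 symbols and a tail of 16 terminals.
smallest = "a" + "+" * 14 + "a" * 16  # first element is a terminal
largest = "+" * 15 + "a" * 16         # head filled with binary functions

print(et_size(smallest), "node for the smallest ET")
print(et_size(largest), "nodes for the largest ET, gene length", len(largest))
```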

It is evident from the examples above that any modification made in the genome, no matter how profound, always results in a structurally correct ET. The only thing we must be careful about is not to disrupt the structural organization of genes: the boundary between head and tail must always be maintained, and symbols representing functions must not be allowed in the tail. We will pursue these matters further in section 2.3, where the mechanisms and effects of different genetic operators are thoroughly analyzed.
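
One way to respect the head/tail boundary is to let a mutation draw from the full alphabet in the head and from the terminals only in the tail. The sketch below is an illustrative operator written under that assumption, not the operator analyzed in section 2.3. The names `mutate` and `orf_end`, the fixed seed, and the number of trials are all arbitrary choices.

```python
import random

FUNCTIONS = "Q*/-+"
TERMINALS = "ab"
# Assumed arities: Q is unary, the arithmetic operators are binary.
ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}
HEAD_LENGTH = 15


def orf_end(gene: str) -> int:
    """Return the position of the last symbol that belongs to the ORF."""
    needed = 1
    for index, symbol in enumerate(gene):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return index
    raise ValueError("expression tree not closed within the gene")


def mutate(gene: str, rng: random.Random, head_length: int = HEAD_LENGTH) -> str:
    """Redraw one random position (the same symbol may be drawn again)."""
    position = rng.randrange(len(gene))
    # Head positions may receive any symbol; tail positions only terminals.
    alphabet = FUNCTIONS + TERMINALS if position < head_length else TERMINALS
    return gene[:position] + rng.choice(alphabet) + gene[position + 1:]


rng = random.Random(0)  # fixed seed, only to make the run repeatable
gene = "/aQ/b*ab/Qa*b*-ababaababbabbbba"
for _ in range(1000):
    gene = mutate(gene, rng)
    assert set(gene[HEAD_LENGTH:]) <= set(TERMINALS)  # tail stays terminal-only
    orf_end(gene)  # raises ValueError if the ET could not be completed
print("1000 successive mutations, every gene still decodes to a complete ET")
```

Because the tail length follows equation 2.4, the loop never raises: even a head made entirely of binary functions finds enough terminals in the tail to close the tree.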
