GEP Book

  Home
  News
  Author
  Q&A
  Tutorials
  Downloads
  GEP Biblio
  Contacts

  Visit Gepsoft

 

C. FERREIRA In N. Nedjah, L. de M. Mourelle, A. Abraham, eds., Genetic Systems Programming: Theory and Experiences, Studies in Computational Intelligence, Vol. 13, pp. 21-56, Springer-Verlag, 2006.

Automatically Defined Functions in Gene Expression Programming

The Unicellular System
 

For the unicellular system without random numerical constants, both the parameters and performance are shown in Table 3. It’s worth pointing out, that, in these experiments, we are dealing with Automatically Defined Functions that can be reused again and again, and therefore it makes little sense to talk about maximum program length. However, in these series of experiments the same head length of six was used to encode all the ADFs and a system with four ADFs was also analyzed, thus allowing the comparison of these cellular systems with the simpler acellular ones (recall that we used four genes with h = 6 in the multigenic system and one gene with h = 24 in the unigenic system).

Table 3
Settings and performance for the sextic polynomial problem using a unicellular system encoding 1, 2, 3, and 4 ADFs.

  1 ADF 2 ADFs 3 ADFs 4 ADFs
Number of runs 100 100 100 100
Number of generations 200 200 200 200
Population size 50 50 50 50
Chromosome length 22 35 48 61
Number of genes/ADFs 1 2 3 4
Head length 6 6 6 6
Gene length 13 13 13 13
Function set of ADFs + - * / + - * / + - * / + - * /
Terminal set a a a a
Number of homeotic genes/cells 1 1 1 1
Head length of homeotic genes 4 4 4 4
Length of homeotic genes 9 9 9 9
Function set of homeotic genes + - * / + - * / + - * / + - * /
Terminal set of homeotic genes ADF 0 ADFs 0-1 ADFs 0-2 ADFs 0-3
Mutation rate 0.044 0.044 0.044 0.044
Inversion rate 0.1 0.1 0.1 0.1
RIS transposition rate 0.1 0.1 0.1 0.1
IS transposition rate 0.1 0.1 0.1 0.1
Two-point recombination rate 0.3 0.3 0.3 0.3
One-point recombination rate 0.3 0.3 0.3 0.3
Gene recombination rate 0.3 0.3 0.3 0.3
Gene transposition rate -- 0.1 0.1 0.1
Mutation rate in homeotic genes 0.044 0.044 0.044 0.044
Inversion rate in homeotic genes 0.1 0.1 0.1 0.1
RIS transposition rate in homeotic genes 0.1 0.1 0.1 0.1
IS transposition rate in homeotic genes 0.1 0.1 0.1 0.1
Number of fitness cases 50 50 50 50
Selection range 100 100 100 100
Precision 0.01 0.01 0.01 0.01
Success rate 82% 78% 69% 63%


As you can see in Table 3, these cellular systems with Automatically Defined Functions perform extremely well, especially if we compare them with the unigenic system (see Table 1), which, as you recall, is the closer we can get to the Genetic Programming system. So, we can also conclude that the use of ADFs can bring considerable benefits to systems limited to just one gene or parse tree, especially if there is some modularity in the problem at hand. Note, however, that the unicellular system slightly pales in comparison to the multigenic system with static linking (see Table 2), showing that the multigenic system of GEP is already a highly efficient system that can be used to solve virtually any kind of problem.

It is worth pointing out that, in his analysis of the role of ADFs to solve this problem, Koza uses just one ADF, which, as you can see in Table 3, is the most successful organization to solve this problem, with a success rate of 82%. And, curiously enough, this same pattern appears in all the experiments conducted here, in which the highest success rates were obtained when just one ADF was used (see Tables 3, 4, 5, and 6).

Let’s take a look at the structure of the first perfect solution found using the unicellular system encoding just one ADF:

0123456789012012345678  

-a**aaaaaaaaa*/0*00000

(20)

As its expression reveals, the main program is far from parsimonious and could be simplified to (ADF)2. But, nevertheless, it illustrates perfectly how a useful building block created by the evolutionary process itself can be reused several times by the main program encoded in the homeotic gene.

Let’s also analyze the structure of a program with more than one ADF, the individual below with four ADFs (genes are shown separately):

0123456789012

 

*-a--+aaaaaaa

 

/-a---aaaaaaa

 

-a*+*-aaaaaaa

 

a+*-+aaaaaaaa

 

*22121133

(21)

As its expression shows, the main program is fairly compact and invokes only ADF2. All the remaining ADFs are not used at all, and, therefore, are free to evolve without much pressure. We know already that neutral regions play an important role both in natural evolution (Kimura 1983) and in GEP (Ferreira 2002b), and that their use in good measure is responsible for a considerable increase in performance. And the same phenomenon is observed in these cellular systems, where the simplest one with only one ADF and a single cell seems to have the right amount of neutral regions as it evolves very efficiently with a success rate of 82%. So, in this particular problem, by increasing the number of ADFs, we are basically increasing the number of neutral regions, and the performance for this simple modular problem decreases proportionately, dropping down to 63% in the system with four ADFs (see Table 3).

Let’s now analyze the behavior of the unicellular system when random numerical constants are also incorporated in the Automatically Defined Functions.

For that purpose a similar set of experiments were done, using also 1, 2, 3, and 4 ADFs (Table 4). And as expected, a considerable decrease in performance was observed comparatively to the performance observed in the unicellular system without random numerical constants (see Table 3).

Table 4
Settings and performance for the sextic polynomial problem using a unicellular system encoding 1, 2, 3, and 4 ADFs with random numerical constants.

  1 ADF 2 ADFs 3 ADFs 4 ADFs
Number of runs 100 100 100 100
Number of generations 200 200 200 200
Population size 50 50 50 50
Chromosome length 29 49 69 89
Number of genes/ADFs 1 2 3 4
Head length 6 6 6 6
Gene length 20 20 20 20
Function set of ADFs + - * / + - * / + - * / + - * /
Terminal set a ? a ? a ? a ?
Number of homeotic genes/cells 1 1 1 1
Head length of homeotic genes 4 4 4 4
Length of homeotic genes 9 9 9 9
Function set of homeotic genes + - * / + - * / + - * / + - * /
Terminal set of homeotic genes ADF 0 ADFs 0-1 ADFs 0-2 ADFs 0-3
Mutation rate 0.044 0.044 0.044 0.044
Inversion rate 0.1 0.1 0.1 0.1
RIS transposition rate 0.1 0.1 0.1 0.1
IS transposition rate 0.1 0.1 0.1 0.1
Two-point recombination rate 0.3 0.3 0.3 0.3
One-point recombination rate 0.3 0.3 0.3 0.3
Gene recombination rate 0.3 0.3 0.3 0.3
Gene transposition rate -- 0.1 0.1 0.1
Random constants per gene 5 5 5 5
Random constants data type Integer Integer Integer Integer
Random constants range 0-3 0-3 0-3 0-3
Dc-specific mutation rate 0.044 0.044 0.044 0.044
Dc-specific inversion rate 0.1 0.1 0.1 0.1
Dc-specific IS transposition rate 0.1 0.1 0.1 0.1
Random constants mutation rate 0.01 0.01 0.01 0.01
Mutation rate in homeotic genes 0.044 0.044 0.044 0.044
Inversion rate in homeotic genes 0.1 0.1 0.1 0.1
RIS transposition rate in homeotic genes 0.1 0.1 0.1 0.1
IS transposition rate in homeotic genes 0.1 0.1 0.1 0.1
Number of fitness cases 50 50 50 50
Selection range 100 100 100 100
Precision 0.01 0.01 0.01 0.01
Success rate 57% 42% 33% 26%


Let’s take a look at the structure of the first perfect solution found using the unicellular system encoding just one ADF with random numerical constants:

01234567890123456789012345678

 

---*a-?aa?a??0412201*+0*00000

 
   

A = {1, 0, 3, 1, 2}

(22)

As its expression shows, the simple module discovered in the structure encoding the ADF (a2-1) is called four times in the main program, creating a perfect solution with just one kind of building block.

Let’s now analyze the structure of a program with more than one ADF, the individual below with four ADFs with random numerical constants (the genes are shown separately):

01234567890123456789

 

*-+*a*?aaa?a?3324010

 

-aa?*-aa?a???3123440

 

*a/a-a?aa?a??2234201

 

--*+*+??aa?aa0141122

 

*00003233

 
   

A0 = {1, 1, 2, 1, 1}

 

A1 = {2, 3, 0, 2, 0}

 

A2 = {3, 1, 0, 0, 3}

 

A3 = {2, 0, 3, 2, 2}

(23)

As its expression reveals, the main program is fairly compact and invokes only ADF0. Indeed, for this simple modular problem, most perfect solutions involve just one ADF, suggesting that this problem is better solved using just one kind of building block that can then be used as many times as necessary by the main program. And the fact that the system evolves more efficiently with just one ADF is just another indication of this (57% success rate in the system with just one ADF versus 26% in the system with four ADFs).

Home | Contents | Previous | Next