For this analysis we will again use both the basic gene expression
algorithm without random constants and GEP with random numerical
constants. The parameters and the performance of both experiments
are shown in Table 2.
Table 2
Settings for the sextic polynomial problem using a multigenic system
with (mgGEP-RNC) and without random numerical constants (mgGEP).
| | mgGEP | mgGEP-RNC |
|---|---|---|
| Number of runs | 100 | 100 |
| Number of generations | 200 | 200 |
| Population size | 50 | 50 |
| Chromosome length | 52 | 80 |
| Number of genes | 4 | 4 |
| Head length | 6 | 6 |
| Gene length | 13 | 20 |
| Linking function | * | * |
| Terminal set | a | a ? |
| Function set | + - * / | + - * / |
| Mutation rate | 0.044 | 0.044 |
| Inversion rate | 0.1 | 0.1 |
| RIS transposition rate | 0.1 | 0.1 |
| IS transposition rate | 0.1 | 0.1 |
| Two-point recombination rate | 0.3 | 0.3 |
| One-point recombination rate | 0.3 | 0.3 |
| Gene recombination rate | 0.3 | 0.3 |
| Gene transposition rate | 0.1 | 0.1 |
| Random constants per gene | -- | 5 |
| Random constants data type | -- | Integer |
| Random constants range | -- | 0-3 |
| Dc-specific mutation rate | -- | 0.044 |
| Dc-specific inversion rate | -- | 0.1 |
| Dc-specific IS transposition rate | -- | 0.1 |
| Random constants mutation rate | -- | 0.01 |
| Number of fitness cases | 50 | 50 |
| Selection range | 100 | 100 |
| Precision | 0.01 | 0.01 |
| Success rate | 93% | 49% |
It’s worth pointing out that the maximum program length in these
experiments is similar to the one used in the unigenic systems of
the previous section. Here, a head length h = 6 and four genes
per chromosome were used, giving a maximum program length of 52 points
(note again that the chromosome length in the systems with random
numerical constants is larger on account of the Dc domain, but the
maximum program length remains the same).
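The lengths quoted above and in Table 2 can be reproduced with a short calculation. The sketch below assumes the usual GEP relation between tail and head lengths, t = h(n - 1) + 1, with n = 2 because every function in the function set is binary, and assumes that the Dc domain has the same length as the tail; neither relation is restated in this section, and the helper name `gene_length` is purely illustrative.

```python
def gene_length(head, max_arity, with_rnc=False):
    """Return (tail length, gene length) for one GEP gene.

    Assumes t = h(n - 1) + 1 and, for GEP-RNC, a Dc domain
    as long as the tail.
    """
    tail = head * (max_arity - 1) + 1
    dc = tail if with_rnc else 0
    return tail, head + tail + dc


# Values taken from Table 2; all four functions (+ - * /) take two arguments.
HEAD, MAX_ARITY, GENES = 6, 2, 4

tail, plain = gene_length(HEAD, MAX_ARITY)
_, rnc = gene_length(HEAD, MAX_ARITY, with_rnc=True)

# Only the head and the tail can be expressed as nodes of a sub-ET,
# so the Dc domain adds to the chromosome length but not to program size.
max_program = (HEAD + tail) * GENES

print("gene length:", plain, "(mgGEP),", rnc, "(mgGEP-RNC)")
print("chromosome length:", plain * GENES, "(mgGEP),", rnc * GENES, "(mgGEP-RNC)")
print("maximum program length:", max_program)
```

Under these assumptions the gene lengths come out as 13 and 20 and the chromosome lengths as 52 and 80, in agreement with Table 2, while the maximum program length is 52 in both systems.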
As you can see by comparing Tables 1
and 2, the use of multiple genes resulted in a
considerable increase in performance for both systems. In the
systems without random constants, by partitioning the genome into
four autonomous genes, the performance increased from 26% to 93%,
whereas in the systems with random numerical constants, the
performance increased from 4% to 49%. Note also that, in this
analysis, the already familiar pattern is observed when random
numerical constants are introduced: the success rate decreases
considerably from 93% to 49% (in the unigenic systems it decreased
from 26% to 4%).
Let’s also take a look at the structure of the first perfect
solution found using the multigenic system without the facility for
the manipulation of random numerical constants (the sub-ETs are
linked by multiplication):
0123456789012012345678901201234567890120123456789012
+/aaa/aaaaaaa+//a/aaaaaaaa-**a/aaaaaaaa-**a/aaaaaaaa    (18)
As its expression shows, it contains three small neutral
regions involving a total of nine nodes, all encoding the numerical
constant 1. Note also that, on two occasions (in sub-ETs 0 and 1),
the numerical constant 1 plays an important role in the overall
making of the perfect solution. Also interesting about this perfect
solution is that genes 2 and 3 are exactly the same, suggesting a
major event of gene duplication (it’s worth pointing out that the
duplication of genes can only be achieved by the concerted action
of gene recombination and gene transposition, as a gene duplication
operator is not part of the genetic modification arsenal of Gene
Expression Programming).
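To make the expression of chromosome (18) concrete, here is a minimal sketch of a decoder that reads each 13-symbol gene breadth-first, evaluates the four sub-ETs and links them by multiplication. The function names are illustrative, and the target function a^6 - 2a^4 + a^2 is an assumption (the usual form of the sextic polynomial problem), since the target is not restated in this section. Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction
import operator

# Function set from Table 2; every function takes two arguments.
FUNCS = {'+': operator.add, '-': operator.sub,
         '*': operator.mul, '/': operator.truediv}


def eval_gene(gene, a):
    """Evaluate one gene at the point a by decoding it breadth-first."""
    children, nxt, i = {}, 1, 0
    # Node i takes the next two unused positions as its arguments;
    # the loop stops at the end of the coding region.
    while i < nxt:
        if gene[i] in FUNCS:
            children[i] = (nxt, nxt + 1)
            nxt += 2
        i += 1

    def value(i):
        if gene[i] == 'a':
            return a
        left, right = children[i]
        return FUNCS[gene[i]](value(left), value(right))

    return value(0)


def eval_chromosome(chromosome, gene_len, a):
    """Multiply the values of all sub-ETs (linking function *)."""
    result = Fraction(1)
    for start in range(0, len(chromosome), gene_len):
        result *= eval_gene(chromosome[start:start + gene_len], a)
    return result


CHROMOSOME_18 = "+/aaa/aaaaaaa+//a/aaaaaaaa-**a/aaaaaaaa-**a/aaaaaaaa"


def target(a):
    # Assumed target function of the sextic polynomial problem.
    return a**6 - 2 * a**4 + a**2


# a = 0 is skipped because the neutral regions compute a/a.
points = [Fraction(k, 4) for k in range(-8, 9) if k != 0]
matches = all(eval_chromosome(CHROMOSOME_18, 13, a) == target(a) for a in points)
print("chromosome (18) matches the assumed target:", matches)
```

Decoded this way, sub-ETs 0 and 1 both reduce to a + 1 and sub-ETs 2 and 3 to a - a^2, which is where the constant 1 built from a/a does its work.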
It is also interesting to take a look at the structure of the first
perfect solution found using the multigenic system with the facility
for the manipulation of random numerical constants (the sub-ETs are
linked by multiplication):
01234567890123456789
+--+*aa??aa?a0444212
+--+*aa??aa?a0244422
a?a??a?aaa?a?2212021
aa-a*/?aa????3202123

A0 = {0, 3, 1, 2, 1}
A1 = {0, 3, 1, 2, 1}
A2 = {0, 3, 1, 2, 1}
A3 = {3, 3, 2, 0, 2}    (19)
As its expression reveals, it is a fairly compact solution
with two small neutral motifs plus a couple of neutral nodes, all
representing the numerical constant zero. Note that genes 0 and 1
are almost exact copies of one another (there is only variation at
positions 17 and 18, but they are of no consequence as they are part
of a noncoding region of the gene), suggesting a recent event of
gene duplication. Note also that although genes 2 and 3 encode
exactly the same sub-ET (a simple sub-ET with just one node), they
most certainly followed different evolutionary paths as the homology
between their sequences suggests.
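The expression of solution (19) can be sketched in the same spirit. The code below assumes the usual GEP-RNC convention: each gene consists of a head of 6, a tail of 7 and a Dc domain of 7 symbols; the "?" nodes of a sub-ET are filled in order of their position in the gene (that is, top to bottom and left to right in the tree); and each Dc symbol is an index into that gene's array of random constants. The function names are illustrative and the target a^6 - 2a^4 + a^2 is again an assumption, not something stated in this section.

```python
from fractions import Fraction
import operator

FUNCS = {'+': operator.add, '-': operator.sub,
         '*': operator.mul, '/': operator.truediv}


def eval_rnc_gene(gene, constants, a, head=6, tail=7):
    """Evaluate one GEP-RNC gene at the point a."""
    coding, dc = gene[:head + tail], gene[head + tail:]
    children, const_at = {}, {}
    nxt, i, used = 1, 0, 0
    while i < nxt:                      # walk the coding region only
        if coding[i] in FUNCS:
            children[i] = (nxt, nxt + 1)
            nxt += 2
        elif coding[i] == '?':
            # The next unused Dc symbol selects a constant from the array.
            const_at[i] = Fraction(constants[int(dc[used])])
            used += 1
        i += 1

    def value(i):
        if coding[i] == 'a':
            return a
        if coding[i] == '?':
            return const_at[i]
        left, right = children[i]
        return FUNCS[coding[i]](value(left), value(right))

    return value(0)


GENES_19 = ["+--+*aa??aa?a0444212",
            "+--+*aa??aa?a0244422",
            "a?a??a?aaa?a?2212021",
            "aa-a*/?aa????3202123"]
ARRAYS_19 = [[0, 3, 1, 2, 1],
             [0, 3, 1, 2, 1],
             [0, 3, 1, 2, 1],
             [3, 3, 2, 0, 2]]


def eval_solution(a):
    """Link the four sub-ETs by multiplication."""
    result = Fraction(1)
    for gene, constants in zip(GENES_19, ARRAYS_19):
        result *= eval_rnc_gene(gene, constants, a)
    return result


def target(a):
    # Assumed target function of the sextic polynomial problem.
    return a**6 - 2 * a**4 + a**2


points = [Fraction(k, 4) for k in range(-8, 9)]
matches = all(eval_solution(a) == target(a) for a in points)
print("solution (19) matches the assumed target:", matches)
```

Under these assumptions sub-ETs 0 and 1 both evaluate to 1 - a^2 (the zero-valued constant and the a - a motif contribute nothing), while sub-ETs 2 and 3 are the single node a.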