The comparison between the two approaches (with and without the facility to manipulate random constants) was made on three different problems. The first is a problem of sequence induction requiring integer constants. In this case the following test sequence was chosen:
$$a_n = 5n^4 + 4n^3 + 3n^2 + 2n + 1 \tag{3.1}$$
where n ranges over the nonnegative integers. This sequence was chosen because it can be solved exactly and therefore provides an accurate measure of performance in terms of success rate.
The second is a problem of function finding requiring floating-point constants. In this case, the following “V” shaped function was chosen:
$$y = 4.251a^2 + \ln(a^2) + 7.243e^a \tag{3.2}$$
where a is the independent variable and e is Euler's number (approximately 2.71828183). Problems of this kind cannot be exactly solved by evolutionary algorithms and, therefore, the performance of both approaches is compared in terms of average best-of-run fitness and average best-of-run R-square.
The third is the well-studied benchmark problem of predicting sunspots (Weigend et al. 1992). In this case, 100 observations of the Wolfer sunspots series were used (Table 1) with an embedding dimension of 10 and a delay time of one. Again, the performance of both approaches is compared in terms of average best-of-run fitness and R-square.
Table 1
Wolfer sunspots series (read by rows).
| 101 | 82 | 66 | 35 | 31 | 7 | 20 | 92 | 154 | 125 |
| 85 | 68 | 38 | 23 | 10 | 24 | 83 | 132 | 131 | 118 |
| 90 | 67 | 60 | 47 | 41 | 21 | 16 | 6 | 4 | 7 |
| 14 | 34 | 45 | 43 | 48 | 42 | 28 | 10 | 8 | 2 |
| 0 | 1 | 5 | 12 | 14 | 35 | 46 | 41 | 30 | 24 |
| 16 | 7 | 4 | 2 | 8 | 17 | 36 | 50 | 62 | 67 |
| 71 | 48 | 28 | 8 | 13 | 57 | 122 | 138 | 103 | 86 |
| 63 | 37 | 24 | 11 | 15 | 40 | 62 | 98 | 124 | 96 |
| 66 | 64 | 54 | 39 | 21 | 7 | 4 | 23 | 55 | 94 |
| 96 | 77 | 59 | 44 | 47 | 30 | 16 | 7 | 37 | 74 |
For the sequence induction problem, the first 10 positive integers n and their corresponding terms $a_n$ were used as fitness cases. The fitness function was based on the relative error with a selection range of 20% and maximum precision (0% error), giving a maximum fitness $f_{max} = 200$ (Ferreira 2001).
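The reported maximum is consistent with a fitness that sums, over all fitness cases, the selection range minus the absolute relative error in percent, so that the maximum equals the selection range times the number of cases (20 × 10 = 200). The following is a minimal sketch under that assumption. The function names, and the choice to clamp each term at zero, are illustrative and not taken from the source.

```python
def sequence_term(n: int) -> int:
    """Term a_n of the test sequence (3.1)."""
    return 5 * n**4 + 4 * n**3 + 3 * n**2 + 2 * n + 1


def relative_error_fitness(predicted, targets, selection_range=20.0, precision=0.0):
    """Sum over fitness cases of (selection_range - relative error in percent).

    Errors at or below `precision` count as zero. Each term is clamped at
    zero, which is an assumption made here to keep fitness nonnegative.
    """
    total = 0.0
    for p, t in zip(predicted, targets):
        error = abs((p - t) / t) * 100.0  # targets of (3.1) are never zero for n >= 1
        if error <= precision:
            error = 0.0
        total += max(selection_range - error, 0.0)
    return total


# Fitness cases: the first 10 positive integers and their terms.
cases = [(n, sequence_term(n)) for n in range(1, 11)]
targets = [a_n for _, a_n in cases]

# A model that reproduces every target exactly reaches the maximum fitness.
f_max = relative_error_fitness(targets, targets)
```

Under these assumptions a candidate that is off by 10% on every case would score half the maximum, since each case then contributes 20 − 10 = 10.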
For the “V” shaped function problem, a set of 20 random fitness cases chosen from the interval [-1, 1] was used. The fitness function was again based on the relative error, but with a selection range of 100%, giving $f_{max} = 2,000$.
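As an illustration of how such a set of fitness cases might be built, the sketch below samples 20 values of a uniformly from [-1, 1] and evaluates function (3.2) at each. The uniform distribution, the fixed seed, and the helper names are assumptions made for reproducibility. The source only states that the cases were random. The point a = 0 is skipped because ln(a²) is undefined there.

```python
import math
import random


def v_function(a: float) -> float:
    """Target function (3.2): y = 4.251 a^2 + ln(a^2) + 7.243 e^a."""
    return 4.251 * a**2 + math.log(a**2) + 7.243 * math.exp(a)


def sample_fitness_cases(count=20, low=-1.0, high=1.0, seed=0):
    """Draw `count` (a, y) pairs with a sampled uniformly from [low, high]."""
    rng = random.Random(seed)  # hypothetical seed, for repeatable sampling only
    cases = []
    while len(cases) < count:
        a = rng.uniform(low, high)
        if a * a == 0.0:  # ln(a^2) is undefined at a = 0
            continue
        cases.append((a, v_function(a)))
    return cases


v_cases = sample_fitness_cases()

# With a selection range of 100% and 20 cases, the maximum fitness is 100 * 20.
v_f_max = 100 * len(v_cases)
```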
For the time series prediction problem, using an embedding dimension of 10 and a delay time of one, the sunspots series presented in Table 1 results in 90 fitness cases. In this case, a wider selection range of 1,000% was chosen, giving $f_{max} = 90,000$.
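The count of 90 fitness cases follows from the embedding: each case uses 10 consecutive observations as inputs and the next observation as the target, so a series of 100 values yields 100 − 10 = 90 cases. A minimal sketch of this construction, using the values of Table 1, is given below. The function name `embed` and the ordering of the inputs from oldest to most recent are assumptions.

```python
# Wolfer sunspots series from Table 1, read by rows.
sunspots = [
    101, 82, 66, 35, 31, 7, 20, 92, 154, 125,
    85, 68, 38, 23, 10, 24, 83, 132, 131, 118,
    90, 67, 60, 47, 41, 21, 16, 6, 4, 7,
    14, 34, 45, 43, 48, 42, 28, 10, 8, 2,
    0, 1, 5, 12, 14, 35, 46, 41, 30, 24,
    16, 7, 4, 2, 8, 17, 36, 50, 62, 67,
    71, 48, 28, 8, 13, 57, 122, 138, 103, 86,
    63, 37, 24, 11, 15, 40, 62, 98, 124, 96,
    66, 64, 54, 39, 21, 7, 4, 23, 55, 94,
    96, 77, 59, 44, 47, 30, 16, 7, 37, 74,
]


def embed(series, dimension=10, delay=1):
    """Build (inputs, target) pairs by time-delay embedding.

    Each target series[t] is paired with the `dimension` earlier values
    spaced `delay` steps apart, ordered from oldest to most recent.
    """
    cases = []
    for t in range(dimension * delay, len(series)):
        inputs = [series[t - (dimension - k) * delay] for k in range(dimension)]
        cases.append((inputs, series[t]))
    return cases


ss_cases = embed(sunspots)

# With a selection range of 1,000% the maximum fitness is 1000 * 90.
ss_f_max = 1000 * len(ss_cases)
```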
In all the experiments, selection was made by roulette-wheel sampling coupled with simple elitism, and performance was evaluated over 100 independent runs. The six experiments are summarized in Table 2.
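The selection scheme can be sketched as follows. This is an illustrative version only. It assumes that "simple elitism" means copying the best individual of the current generation unchanged into the next one, and that the remaining slots are filled by fitness-proportionate sampling with replacement. The function name and the uniform fallback for an all-zero population are assumptions as well.

```python
import random


def select_next_generation(population, fitnesses, rng=None):
    """Roulette-wheel sampling with simple elitism (illustrative sketch)."""
    rng = rng or random.Random()

    # Elitism (assumed form): the best individual survives unchanged.
    best_index = max(range(len(population)), key=lambda i: fitnesses[i])
    survivors = [population[best_index]]

    # Roulette wheel: each slot is filled with probability proportional
    # to fitness. If every fitness is zero, fall back to uniform sampling.
    remaining = len(population) - 1
    if sum(fitnesses) > 0:
        survivors.extend(rng.choices(population, weights=fitnesses, k=remaining))
    else:
        survivors.extend(rng.choices(population, k=remaining))
    return survivors
```

The population size is preserved, so with the settings used here each generation would again contain 100 individuals.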
Table 2
General settings used in the sequence induction (SI), the “V” function, and sunspots (SS) problems. The “*” indicates the explicit use of random constants.
| | SI* | SI | V* | V | SS* | SS |
|---|---|---|---|---|---|---|
| Number of runs | 100 | 100 | 100 | 100 | 100 | 100 |
| Number of generations | 100 | 100 | 5000 | 5000 | 5000 | 5000 |
| Population size | 100 | 100 | 100 | 100 | 100 | 100 |
| Number of fitness cases | 10 | 10 | 20 | 20 | 90 | 90 |
| Function set | + - * / | + - * / | + - * / L E K ~ S C | + - * / L E K ~ S C | 4 (+ - * /) | 4 (+ - * /) |
| Terminal set | a, ? | a | a, ? | a | a-j, ? | a-j |
| Random constants array length | 10 | -- | 10 | -- | 10 | -- |
| Random constants range | {0, 1, 2, 3} | -- | [-1, 1] | -- | [-1, 1] | -- |
| Head length | 6 | 6 | 6 | 6 | 8 | 8 |
| Number of genes | 7 | 7 | 5 | 5 | 3 | 3 |
| Linking function | + | + | + | + | + | + |
| Chromosome length | 140 | 91 | 100 | 65 | 78 | 51 |
| Mutation rate | 0.044 | 0.044 | 0.044 | 0.044 | 0.044 | 0.044 |
| One-point recombination rate | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| Two-point recombination rate | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| Gene recombination rate | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| IS transposition rate | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| IS elements length | 1,2,3 | 1,2,3 | 1,2,3 | 1,2,3 | 1,2,3 | 1,2,3 |
| RIS transposition rate | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| RIS elements length | 1,2,3 | 1,2,3 | 1,2,3 | 1,2,3 | 1,2,3 | 1,2,3 |
| Gene transposition rate | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| Random constants mutation rate | 0.01 | -- | 0.01 | -- | 0.01 | -- |
| Dc specific transposition rate | 0.1 | -- | 0.1 | -- | 0.1 | -- |
| Dc specific IS elements length | 1,2,3 | -- | 1,2,3 | -- | 1,2,3 | -- |
| Selection range | 20% | 20% | 100% | 100% | 1000% | 1000% |
| Precision | 0% | 0% | 0% | 0% | 0% | 0% |
| Average best-of-run fitness | 179.827 | 197.232 | 1914.8 | 1931.84 | 86215.27 | 89033.29 |
| Average best-of-run R-square | 0.977612 | 0.999345 | 0.957255 | 0.995340 | 0.713365 | 0.811863 |
| Success rate | 16% | 81% | -- | -- | -- | -- |
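The chromosome lengths in Table 2 can be checked from the head length and the number of genes. The sketch below assumes the usual gene layout of gene expression programming, in which the tail length is t = h(n − 1) + 1 for head length h and maximum function arity n (here n = 2), and further assumes that the Dc domain holding the random constants has the same length as the tail. Both assumptions reproduce all six tabulated values. The function name is illustrative.

```python
def chromosome_length(head, genes, max_arity=2, random_constants=False):
    """Length of a multigenic chromosome under the assumed gene layout."""
    tail = head * (max_arity - 1) + 1       # t = h(n - 1) + 1
    dc = tail if random_constants else 0    # Dc domain assumed as long as the tail
    return genes * (head + tail + dc)


# (head length, number of genes, uses random constants) for each experiment.
settings = {
    "SI*": (6, 7, True),
    "SI": (6, 7, False),
    "V*": (6, 5, True),
    "V": (6, 5, False),
    "SS*": (8, 3, True),
    "SS": (8, 3, False),
}

lengths = {
    name: chromosome_length(head, genes, random_constants=rc)
    for name, (head, genes, rc) in settings.items()
}
```

For example, with a head of 6 the tail has 7 positions, so a gene with a Dc domain has 6 + 7 + 7 = 20 positions, and 7 such genes give the 140 listed for SI*.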