Q&A from Personal Correspondence
These are answers to questions I've
received over the years from people interested in Gene Expression
Programming. It’s been a long journey and I’ve made quite a few friends
and acquaintances along the way. I’ve enjoyed every single email and I
did my best to be helpful and encouraging to all of you. I hope I have
succeeded. Thank you all for contacting me and also for your
encouragement! I believe these Q&As might be useful to others, hence my
decision to post them here. I hope you’ll be able to find something
interesting or new in them. As always, for privacy’s sake, I removed any
references that could identify the people involved.
I very much like the thought of designing a language
where every ordering of the most basic units is a valid program. This is
the spirit of Karva, right?
I think this is fundamental. Of course systems that produce invalid
programs also work, but they are very inefficient. As I said in my
Complex Systems paper, in nature there is no such thing as an
invalid protein as all full-fledged genotype/phenotype systems have
syntactic closure.
Still about syntactic closure. I think it is ok for
programs to not always work out. I have mixed feelings about the need
for head and tail sections for GEP. If there are too many operators,
say, then couldn't we just let the gene have a low fitness? (Not for GEP
as in your
paper, no, there I think it is indeed nice). Can you see what I mean
when talking about the difference between the structure of the language
leading to invalid evaluation and, say, a particular gene not having a
sufficient variable-to-operator ratio?
Of course you could remove the invalid programs from the population, but
I’m afraid they would be the vast majority and lots of resources would
be lost in that. I don’t think it is a good idea to have them around; if
they must exist, better exclude them so that they don’t spread in the
population. For example, whenever a program gives a calculation error
for a particular fitness case (say, a division by zero), I simply
exclude this program from the population (contrary to GP, I don’t use
protected divisions or protected square roots or logarithms... and keep
these problematic programs around in the population: they are unfit and,
as such, their fitness is zero, which gives them zero chance of leaving
offspring). My experience is that even carrying these programs around is
not beneficial in evolutionary terms; and carrying structurally
defective programs is a total waste of time and resources. So, if there
is a way of always producing syntactically correct programs, we must use
it.
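To make this concrete, here is a minimal sketch in C++ of such a fitness evaluation (the 1000-point absolute-error fitness measure and the representation of a program as a callable are just placeholder choices for illustration, not the details of any particular implementation): any calculation error on any fitness case immediately zeroes the fitness, so the program has no chance of leaving offspring.

#include <algorithm>
#include <cmath>
#include <functional>
#include <vector>

struct FitnessCase { double x, y, target; };

// Hypothetical fitness measure: 1000 minus the summed absolute error,
// floored at zero. Any non-finite output (division by zero, square
// root or logarithm of a negative number, ...) zeroes the fitness at
// once, excluding the program instead of protecting the operators.
double fitness(const std::function<double(double, double)>& program,
               const std::vector<FitnessCase>& cases) {
    double error = 0.0;
    for (const auto& c : cases) {
        double out = program(c.x, c.y);
        if (!std::isfinite(out)) return 0.0; // unviable: no offspring
        error += std::fabs(out - c.target);
    }
    return std::max(0.0, 1000.0 - error);
}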
I would like to know the differences between GP and GEP
if the mapping from genotypes (where a GEP operates) to phenotypes
(where a GP operates) is injective. It might seem that technically they
are the same thing by clear rules of transitivity.
It is true that in GEP one genotype corresponds to one phenotype and
vice versa. Maybe in mathematics this would be the end of the story, but
not in biology or in evolutionary computation. When the only thing you
have is a phenotype, there are lots of places on the search landscape
that are forever inaccessible. On the other hand, in genuine
genotype-phenotype systems the genotype allows the unconstrained
exploration of the solution space. What I’ve been saying about grafting
and pruning clearly illustrates this: It is
impossible to make an orange tree produce mangos only by grafting and
pruning. And grafting and pruning is the best a GP system can do.
Using terms like Gene Expression Programming or
Developmental Genetic Programming, I suppose the emphasis should always
be on the mapping. The mapping is all I have considered. This mapping is
analogous to transcription as we want to produce a structure (i.e. mRNA)
that describes a program given a genome. The mapping that codes a unit
of mRNA from parts of the genome is called the genetic code, if I am not
mistaken.
In biology one usually represents the information flow from DNA to
proteins by DNA -> RNA -> Protein. The first phase (DNA -> RNA) is what
is called transcription and requires not a genetic code but some simple
complementary rules between nucleotides. The genetic code is only used
to express an mRNA into a protein, that is, in the second phase (RNA ->
Protein). This second phase is called translation, and here we have
indeed the real mapping from the genotype (mRNA is nothing more than a
working copy of a gene) to the phenotype (protein).
This genetic code is one possible focus of gene
expression. Keller and Banzhaf talked about the importance of this
genetic code (and its relation to neutrality, if I recall).
When I read your work, I tried to see how your genetic code would
improve the interpretation of genomes in a way that was resilient
against the destructive powers of genetic operators.
In GEP the genetic code is much simpler than the one used in nature, and
it consists of a one-to-one relationship between the symbols of the
genome and the functions and variables they represent. So, in GEP, the
genetic code has nothing to do with the “resiliency against the
destructive powers of the genetic operators”. What is responsible for
that is the unique structure of GEP genes, that is, a head and tail
organization that not only guarantees syntactic closure but also
provides for non-coding regions that allow the accumulation of neutral
mutations. And unless you understand this, it would be very difficult to
understand the differences between GEP and Banzhaf’s Developmental GP or
Grammatical Evolution for example.
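To make the head/tail organization concrete, here is a minimal sketch (the function and terminal sets are arbitrary examples): for a head of length h and a function set with maximum arity nmax, the tail must have length t = h(nmax - 1) + 1, and then any random filling of the head (functions and terminals) and of the tail (terminals only) yields a structurally valid gene.

#include <cstdlib>
#include <iostream>
#include <string>

// Head symbols may be functions or terminals; tail symbols are
// terminals only. With t = h*(nmax - 1) + 1, every such gene, however
// it is filled, encodes a valid expression tree (syntactic closure).
std::string randomGene(int h, int nmax,
                       const std::string& functions,   // e.g. "+-*/"
                       const std::string& terminals) { // e.g. "abc"
    const int t = h * (nmax - 1) + 1; // tail length
    std::string all = functions + terminals, gene;
    for (int i = 0; i < h; ++i) gene += all[std::rand() % all.size()];
    for (int i = 0; i < t; ++i) gene += terminals[std::rand() % terminals.size()];
    return gene;
}

int main() {
    // h = 6 with binary functions gives a tail of 7 and 13 symbols in all.
    std::cout << randomGene(6, 2, "+-*/", "abc") << "\n";
}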
You seem to be saying that the random generation of the
initial population is not guaranteed to produce viable programs. But
your whole argument for the use of the GEP representation scheme is that
you always get viable programs – I don’t understand!
You are confounding ‘viable’ with ‘valid’ (a program is viable when its
fitness is greater than zero and unviable otherwise; and by definition
viability is the fitness divided by average fitness). In GEP, all the
programs, including the ones generated randomly for the initial
population, are structurally correct or valid. As for being viable or
not, that is a property of both the fitness function (a program might be
unviable because one is using a very harsh fitness measure) and of the
program’s mathematical output in the particular environment one is using.
For example, the evaluation of a perfectly valid program with a division
term, say x/y, might result in an unviable program if one of the fitness
cases has y = 0. And I say “might” because one could choose to keep these
programs around, although I never do in my GEP implementations.
Where did the name Karva come from? Does it have another
meaning?
It's short for my Mother’s name, Carvalho, which is also my first
surname (my entire family name is “de Carvalho Ferreira”). I also chose
to use “K” instead of “C” because I like to think of K-expressions as
Kappa-expressions.
I am trying to understand the difference between
K-expressions and Karva. Is Karva a notation or a language such as C++
or Java? If Karva is a language like C++ or Java, would I need a
compiler?
Karva is just a notation used to represent the linear genomes of
computer programs. And a K-expression is the coding part (or ORF, which
stands for Open Reading Frame) of a gene in these linear chromosomes.
Many GEP genetic operators appear to be what GP people
usually call "highly disruptive." I would say that those operators are
even more disruptive than the usual GP operators (subtree swapping or
so). So, this may be good for search space exploration, but not for
search space exploitation. Did you ever consider this in an analytical
manner, or note such effects? If GEP really evolves, where is the
"backstore" of the genes (as e.g. compared to building blocks in GA)?
It is true that most GEP operators (e.g., mutation, IS and RIS
transposition, inversion, one- and two-point recombination) are more
disruptive than the tree-level recombination of GP, for these operators
act at the chromosome level and their effects on the trees (fully
developed entities or phenotype) can lead to a wide spectrum of
modifications: from very profound changes to neutral changes. This
interplay between genotype and phenotype is what makes GEP much more
powerful than GP, because tree-level operators cannot drastically modify
an entity (every gardener knows that it is impossible to make an orange
tree produce bananas only by grafting and pruning). So, I don’t agree
with you when you say that ‘disruptive’ operators are not good for
search space exploitation. This idea has no support whatsoever in
evolution: mutation is considered by many the most important operator
and, furthermore, it is known that not even mutation is capable of
creating all the wonders of nature, and other, more radical, mechanisms
are essential (consider, for instance, the origin of eukaryotes by
symbiosis). The work I’ve been doing on GEP operators shows that
mutation (and I am talking here about point mutation at chromosome
level, not the tree-level mutation of GP) is by far the most important
operator, followed by RIS transposition (the most ‘disruptive’ GEP
operator as it changes the root itself), IS transposition, inversion,
two-point recombination, one-point recombination and finally gene
recombination, which is the most conservative operator. It is also worth
pointing out that GEP gene recombination is very close, in effect, to GP
tree-level recombination, as in this case a special kind of building
block (genes that encode a particular sub-expression tree) is
exchanged. Of course, in GEP, entire genes are not the only kind of
building block that gets exchanged, but at least they are very easily
traced.
Due to the alternative representation of the trees used
in GEP, it becomes easier to define new genetic operators. However, it
would be hard to define subtree swapping (the usual crossover of GP)
based on exactly this representation. The obvious approach is to merge
both approaches by changing the tree representation. At least, this may
reduce the disruptive effect of usual GEP operators in later phases of
the search.
What’s the use of implementing a GP-like crossover in GEP besides a huge
headache? It’s not as if the efficiency of the tree-level crossover of
GP is commendable in evolutionary terms. Notwithstanding, in GEP, when
you use gene recombination together with gene transposition as the only
source of genetic modification you have basically a GP subtree crossover
as, in this case, only subtrees of different sizes and shapes are
exchanged. And compared to mutation (GEP mutation), they are far less
efficient at driving evolution further.
If I understand correctly, GEP crossover is done at a
certain point in the linear representation of the individual regardless
of its corresponding position in its tree-version (indeed, it could even
fall outside) and basically exchanges the "bottom part" of the programs.
The children are, intuitively, very unlikely to inherit their parents’
characteristics. Again intuitively, wouldn’t that be counterproductive?
It is a good way of maintaining diversity and exploring new regions of
the genospace, but doesn't seem to be too good at advancing a particular
"branch" of exploration.
Evolutionarily speaking, this is not counterproductive at all: if the
genetic operators available to the system are unable to disrupt the
genomes thoroughly once in a while, then the system will adapt poorly.
Mathematically, the disruption of building blocks might seem
counterproductive but for evolutionary systems it is fundamental. The
best way to see this is through experimentation: If you still don’t have
a functional GEP implementation, you can in the meantime observe this by
using the free Demo of GeneXproTools,
selecting different genetic operators and/or different combinations and
rates of the basic GEP operators.
But returning to the wrongly assumed “disruptiveness” of GEP crossover.
First of all, there are three different types of GEP crossover
(one-point, two-point, and gene recombination) and they behave
differently. For one-point and two-point recombination, the disruptive
tendencies (splitting of building blocks) coexist side by side with
their more conservative tendencies (swapping of genes and entire ORFs),
making these operators quite well balanced. Gene recombination, though,
is much more conservative (and less disruptive, of course) as it
exchanges only entire genes, which, as you know, encode mathematically
concise building blocks (you can think of them as the equivalent of GP
subtrees and by using gene recombination together with gene
transposition you can easily simulate GP crossover in GEP).
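For illustration, here is a minimal sketch of one-point recombination under these assumptions (chromosomes as plain symbol strings of equal length, as in GEP): everything downstream of a randomly chosen point is exchanged, and since both parents obey the same head/tail layout, both offspring are automatically valid.

#include <cstdlib>
#include <string>
#include <utility>

// Minimal sketch of GEP one-point recombination: a crossover point is
// chosen anywhere in the chromosome and the parents exchange everything
// downstream of it. The swap may split an ORF (the disruptive tendency)
// or fall between genes and exchange them whole (the conservative one).
std::pair<std::string, std::string>
onePointRecombination(std::string p1, std::string p2) {
    std::size_t cut = 1 + std::rand() % (p1.size() - 1); // crossover point
    for (std::size_t i = cut; i < p1.size(); ++i)
        std::swap(p1[i], p2[i]);
    return {p1, p2};
}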
Your "fixed length" representation brings up an
interesting question, which tickles GP people for a long time: the
validity of the schemata theorem. Did you ever think about this? I have
a little bit the impression that your representation may make it easier
to identify schemata, and model the ways they are either disrupted or
preserved. But let me know if I am wrong!
I think you are right. I also think that the existence of building
blocks is more easily perceived in GEP. The possibility of dealing with
genes (building blocks) separately is a great advantage. I believe that
there are other smaller building blocks that circulate in the genetic
pool, but their identification is not so easily done. Perhaps it would
be easier to identify them using the linear representation, but there is
the problem of the phenotype. At least in biology, domains in proteins
(building blocks) are trickier to identify. Genes, of course, are
easier: they can be identified directly in the genome or as products
of expression. So, the theoreticians could start by studying the easily
manipulated GEP genes. In fact, in my
Complex Systems paper I demonstrate (although not formally)
the validity of the schemata theorem for GEP at least: The multigenic
systems are much more effective than unigenic ones. Furthermore, the
genetic operators of GEP are ideal to analyze the circulation of genes
and we can choose to have the genes disrupted or not as necessary.
How did you arrive at a 0.044 mutation rate? I think I
noticed you using that more than once not only in your
papers but also
in your
book.
In the early days of GEP implementation I experimented with different
mutation rates and analyzed their evolutionary dynamics. At the time I
used to use a mutation rate equivalent to two one-point mutations per
chromosome. But then lots of people started to say that I tinkered with
the parameters, as a mutation rate thus determined, although biologically
more meaningful, changes with chromosome length. So I realized that, for
the average chromosome length I was using at the time, a mutation rate
of 0.044 is around the value obtained for two one-point mutations per
chromosome, and I simply started to use it as the default. But I could
have very well chosen another value, say 0.05, 0.047 or 0.055. As anyone
can easily find out through experimentation
(try for example the free Demo of
GeneXproTools), small mutation rates keep your system evolving: if
you use a very small rate, you might need a little more time to get to a
good solution to your problem, but you'll get there nonetheless. But
more formally, the evolutionary dynamics of GEP populations are
documented in the last chapter of my
book (the first edition is freely available
online) and also in several of my papers,
especially in the
paper on Evolutionary Dynamics.
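If you want to reproduce the equivalence, it is just two mutations divided by the chromosome length; a minimal sketch (the lengths are arbitrary examples; note that 2/45 is approximately 0.044):

#include <initializer_list>
#include <iostream>

// Two one-point mutations per chromosome correspond to a per-symbol
// mutation rate of 2/L for a chromosome of length L; a length of
// about 45 symbols gives the familiar 0.044.
int main() {
    for (int L : {30, 45, 60, 100})
        std::cout << "L = " << L << "  equivalent rate = " << 2.0 / L << "\n";
}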
I have some questions about crossover in GEP. From what I
have seen, the "building blocks" theory has been very contested in the
GAs world, and though I have (partly) read Langdon & Poli's "Foundations of Genetic Programming", it seems that its applicability to GP is
not very well known and difficult to establish.
Concerning crossover in GEP, I discuss its
power and its role both in my
book (see especially the last chapter which is available
online for the first edition) and in quite a few papers, especially
the following:
I don’t know if you are familiar with them, but there
you will also find some thoughts on the role of crossover in evolution
and its implications for artificial evolutionary systems, including GP.
From a purely intuitive point of view, I would say that
one could equate building blocks to subtrees. At least, seeing a GP
individual as a program one could say that the subtrees represent
"subroutines", therefore by combining "good (or useful) subroutines" one
should get "good individuals".
This in principle is correct, and I believe that the exchange of
building blocks is essential for evolution, but one should never forget
that it is neither the only driving force of evolution nor the most
important one. Indeed, without mutation, evolutionary systems evolve
poorly for, with time, populations undergoing only crossover will lose
all genetic diversity.
I would like to know your opinion on how GEP compares to
GP. Though a GP-GEP comparison is not technically part of our thesis,
our usage of GEP must be discussed, and I am also quite interested in
the subject personally.
As evolutionary systems, GEP and GP could not be more different, so
their comparison is always complicated. Theoretically, however, the
differences between both systems can be clearly delineated. On the one
hand, GP is a simple replicator system without an autonomous genotype,
whereas GEP is a genotype/phenotype system with a perfect mapping. This
means that GEP and GP occupy different levels in the evolutionary ladder
and, because of this, there are quite a few things that are restricted
in GP, among them the set of genetic operators available and,
consequently, how well it adapts and evolves. As you know, GP relies
basically on a tree-specific crossover (and for GP it seems to be the
best operator, as Koza does not use mutation or permutation even once in
his first GP book) but this
does not mean that crossover (much less tree-specific) is the most
effective genetic operator for all evolutionary systems, both artificial
and natural.
When you apply the genetic operators, is their order
important?
The order in which you apply the genetic operators is not important in
the long run: each one dictates a certain path, of course, but any path
is as good as any other as long as you keep moving towards the top.
Which language may be best suited for Gene Expression
Programming?
I implemented Gene Expression Programming in C++, but any other language
can be used to implement it. My advice is to choose the language you are
most comfortable with.
Do you have any comments in using LISP to implement GEP?
Is there any preference of using one language over the others in
relation to GEP?
You can use any language to implement GEP. If you are most comfortable
with LISP, then you should use it. My own GEP implementations are in C++
but I also ported some of it to Visual Basic and Java just for fun.
By reading through your
papers and
book I also wondered why you decided to impose a fixed length on GEP
chromosomes. If I understand correctly one only has to make sure that
the tail grows together with the head to assure that syntactically
correct programs are produced. Allowing for chromosome growth would
allow GEP to adjust itself to the complexity of the problem.
It makes things much simpler. You could do what you are saying (allow a
gene to change size) or, considerably simpler, allow for a chromosome to
change size by allowing the number of genes to change. But it seems to
me that the gains (if any) of such a system would never compensate for
the added complexity. And there are other, much more interesting, ways
of making the system plastic. One is of course the very structure of GEP
genes that allows the encoding of expression trees of different sizes
and shapes, which, in my view, more than makes up for the fixed length
chromosomes. Another is the cellular and multicellular systems that
allow code reuse and automatic linking or, in other words, the evolution
of automatically defined functions.
You state the advantages of a genotype/phenotype mapping
without using a repair mechanism. GEP would not need such a mechanism.
You also refer to the natural system to bolster the claim of advantages.
However, it is well known that there are repair (edit) mechanisms in the
natural system (e.g. splicing), so it is simply not the case that the
path from DNA to protein contains no such mechanisms. Also, I wonder what the
advantage would be of not having repair, if only the result is
evaluated.
I cannot agree with you here. In nature, the phenotype (proteins) is
always valid and as such is never repaired. I mean, after translation
you always end up with an amino acid chain and this amino acid chain is
always a valid protein structure. Splicing is an altogether different
matter: it occurs at the RNA level and its function has nothing to do
with repair of invalid structures. Indeed if we were to clone a gene
with introns into a bacterium, these introns would not be removed and
the entire sequence of the gene would be translated into a protein.
Obviously this protein would be very different from the original, but
nonetheless it would be a valid protein structure. So, the advantages of
not having repair (and I mean repair of invalid structures, not the
equivalent of the biological editing) are that nothing is wasted and that
all the modifications made in the genome by the genetic operators always
produce something new and valid that can be put to good use.
To evaluate a gene or K-expression in GEP, we have to
build the tree of the expression. In my GEP implementation, I do this in
memory and it takes some time. In other approaches, e.g. linear tree
encoding, it is possible to evaluate an expression recursively without
actually building the tree. With GEP, I cannot figure out whether
there is an algorithm to evaluate an expression *without* building the
tree, just working on the string. It would be great if you could
show me the sketch (just the sketch, I am not asking for the code) of
the procedure you use to evaluate an individual.
Maybe there is a more elegant way of evaluating GEP strings using
recursion, but I actually build the tree to evaluate them. I give all
the details in my
Complex Systems paper and
book which are both available online. Nonetheless, I'll try and give you
here a brief sketch of how I do it:
For instance, the gene with a head length of 6:
+a/*-+aabacaa
gives the following tree:
        +
       / \
      /   \
     /     \
    a       /
           / \
          /   \
         /     \
        *       -
       / \     / \
      +   a   a   b
     / \
    a   c
I use two-dimensional arrays to build such trees:
+
a/
*-
+aab
ac
Then these trees (and a shadow tree where the actual evaluations are
kept) are used to make the evaluations. To evaluate the trees you start
from the bottom, beginning the search for the functions (from left to
right) on the line before last (line 3 in this case; the line with the
root is line 0). Then each time a function is found on line n, you
retrieve from line n+1 the corresponding number of arguments and do the
operation, and so forth until no more functions are found on the current
line. Then you do the same for the lines above, moving up towards the
root. Obviously you must keep track of the functions and terminals you
process on each line.
This is in my view very algorithmic and you should have no problems
implementing it.
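For illustration, here is a minimal, self-contained sketch of the procedure (with a toy function set of the four arithmetic operators and example variable bindings): the rows of the two-dimensional array are built first, and then the values computed for line n+1 are consumed, left to right, by the functions on line n, exactly as described above.

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Arity of a symbol: the four arithmetic functions take two
// arguments; everything else is a terminal (a variable).
int arity(char s) { return (s == '+' || s == '-' || s == '*' || s == '/') ? 2 : 0; }

double evalKExpression(const std::string& gene,
                       const std::map<char, double>& vars) {
    // 1. Lay the ORF out line by line (the rows of the 2D array):
    //    line 0 holds the root, and each line holds as many symbols
    //    as the total arity of the line above it. Symbols beyond the
    //    ORF (the noncoding region) are simply ignored.
    std::vector<std::string> lines;
    std::size_t pos = 0, need = 1;
    while (need > 0) {
        lines.push_back(gene.substr(pos, need));
        pos += need;
        std::size_t next = 0;
        for (char s : lines.back()) next += arity(s);
        need = next;
    }
    // 2. Evaluate from the bottom line up: the values computed for
    //    line n+1 are consumed, left to right, as the arguments of
    //    the functions found on line n.
    std::vector<double> below;
    for (std::size_t n = lines.size(); n-- > 0; ) {
        std::vector<double> current;
        std::size_t used = 0; // next unconsumed value on the line below
        for (char s : lines[n]) {
            if (arity(s) == 0) { current.push_back(vars.at(s)); continue; }
            double x = below.at(used++);
            double y = below.at(used++);
            switch (s) {
                case '+': current.push_back(x + y); break;
                case '-': current.push_back(x - y); break;
                case '*': current.push_back(x * y); break;
                case '/': current.push_back(x / y); break; // or flag as unviable
            }
        }
        below = current;
    }
    return below.at(0); // the value at the root
}

int main() {
    // The gene from the example above, with a = 2, b = 3 and c = 5:
    std::map<char, double> vars{{'a', 2}, {'b', 3}, {'c', 5}};
    std::cout << evalKExpression("+a/*-+aabacaa", vars) << "\n"; // prints -12
}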
I think your work is great and I’m fascinated by it, but
I couldn’t help but notice that there are quite a few people accusing
you of the sin of not citing work that anticipated yours. Most name
Conor Ryan as one such forerunner. Could you please inform me about the
citation accusation?
I don’t think these accusations are fair. I cited all the work that was
relevant for my work and that influenced my thinking. I also cited all
the sources that I read at the time, the ones that I thought might have
a connection to my work. In fact, I read most of these sources after
having created GEP just to be sure that nothing similar existed already.
Notwithstanding, I also cited these sources in my
paper even though they had not influenced my work in any way.
So, I did the best I could with the resources that I had at the time and
then I submitted my paper to scientific journals. And it was rejected
with reviews that still leave me gasping. And the most interesting thing
was that, although theoretically confidential, word got out and Koza was
straightaway warned and didn’t have any problems telling me (as editor
of my book at Kluwer) that someone told him that my work was not what I
was claiming it to be. So, fearing the worst, I didn’t try another
journal to publish my paper (that was already my second try) and instead
published it online and I even included the works that were suggested by
the reviewers that so unceremoniously rejected it and also took into
consideration most of their remarks to make an even better paper. And of
course, they didn’t like it one bit when I dropped this on them.
Fortunately, there are also good people in this field and not everything
and everyone is controlled by Koza, and immediately after publishing my
paper online I was invited by two scientific journals to publish my
paper with them. And at
Complex Systems, my
paper went through the normal reviewing process and with their input I
created an even better paper (both papers are available online if you
want to compare them) with yet more references also suggested by the
reviewers (there are 14 references in the first
online version and 18 in the
Complex Systems paper).
But anyway, when I announced my paper online (the first version) I was
immediately attacked by several people who said that it was a good
thing that it was rejected because of its many sins including the
cardinal sin of not citing such and such (Ryan among them). That was the
first time I’d heard of Ryan and his work, and since he also joined the
protesters (together with Mike O’Neill) with a rather tame bashing, I took
a look at his work and, to this day, I don’t see why anybody thinks that
I should have cited his work in my paper (the genotype/phenotype mapping
in Grammatical Evolution is not even perfect, for crying out loud!).
Even his later papers have nothing to do with GEP although he has tried
to make some kind of a posteriori connection. There are people and
papers I wish I had known about at the time I created GEP to make an
even better paper (Franz Oppacher and Tim Perkis among them) but Ryan is
not one of them.
This whole controversy around GEP and GE is really
regrettable because I think your research on GEP and the research on GE
are both extremely good. From a broad perspective the differences are
not large, in my opinion. In detail, the two systems differ greatly in
approach while sharing the principle of genotype to phenotype mapping, a
great advance for the EC field.
I’m afraid I cannot agree with you when you say that the differences
between GEP and GE are not large (even though you are using a broad
perspective). And this is so because the genotype-phenotype mapping in
GE is not perfect (as in Banzhaf’s DGP, by the way) and therefore not
all the programs created during the adaptation process are valid. Citing
from the
seminal paper on Grammatical Evolution:
    “It is possible for individuals to run out of genes, and in this
    case there are two alternatives. The first is to declare the
    individual invalid and punish them with a suitably harsh fitness
    value; the alternative is to wrap the individual, and reuse the
    genes.”
And systems like these, when compared to systems with perfect
genotype-phenotype mappings such as GEP, are a waste of time and
resources because they evolve poorly and have none of the plasticity a
genuine genotype-phenotype system has.
You're right! I underestimated the possibility under GE
that individuals would be invalid and the fix would be to wrap them. I
just looked again at their 1999 paper, "Under
the hood of grammatical evolution", and it is clear that invalid
individuals are numerous early on in a run and that a large fraction of
the population may sometimes be wrapped throughout a run. Against this,
GEP does indeed guarantee that every individual is valid (you don't
prove it in your early papers, but I believe it :) ). The validity
difference could be highly significant in terms of results and power.
You must know at this point. They don't seem to have results showing GE
is orders of magnitude better than GP but you say you do. To me,
wrapping looks like a fix that could undermine search (even though they
gamely cite some biological precedent).
It’s good to know that you are seeing more clearly the differences
between GEP and GE, for this is the essence of a truly functional
genotype-phenotype system. Indeed, all the GEP algorithms I implemented
(GEP, GEP with random numerical constants, ADFs, Evolvable Neural
Networks, Decision Tree Induction, Polynomial Induction, Parameter
Optimization, and so on) always produce valid programs. And for that,
you just have to follow the simple structural rules for evaluating the
length of the tail and the additional domains used when random numerical
constants are involved (this is, in my view, proof enough, but obviously
more mathematically inclined people might still need to prove this more
formally :) ).
I empathize with your skepticism about ever being
welcomed to the EC community, but I am absolutely sure GEP will be (1)
let in, (2) taken seriously, then ultimately (3) accepted with at least
the same degree of warmth that, say, Goldberg has for Koza's GP. The EC
field has seen several cold-shoulderings and yours is not by any means
the first and certainly not the most bitter, though you may today feel
that way. Meanwhile, it sounds like the acceptance of GEP via your
company's software is going great!
I really hope your predictions about GEP will come true sooner rather
than later although I don’t think I should feel reassured by what you
call “the same degree of warmth that, say, Goldberg has for Koza's GP”
:) And yes, my company’s software has been a lifesaver and a source of
great fulfillment and the right outlet to show exactly what can be
achieved with GEP. Everybody can download the free Demo of
GeneXproTools and play with GEP –
it’s great fun!
What kind of software packages are available based on
Gene Expression Programming (any demo version)?
Gepsoft GeneXproTools implements
several GEP algorithms and supports symbolic regression (function
finding), classification (logistic regression), logic synthesis, and
time series prediction. You can
download the Demo at
the Gepsoft website. The Demo comes with several sample runs for which
the software is totally functional, including seeing the code of the
models it designs in 16 different programming languages. It should give
you a feel for the performance and workings of GEP as you can change a
wide array of settings, including chromosome and gene structures,
linking functions, genetic operators, modification rates, fitness
function, parsimony pressure, function set, random numerical constants,
and so on.
I must say I'm very surprised by the fact that the
uni-ADF systems consistently outperform the multi ADF systems (in your
paper on Automatically Defined Functions). It seems
counterintuitive. I have a suspicion it might have to do with the
(simple) problem space used for the experiments – but obviously I have
nothing to back up this suspicion.
You are right in your suspicions. Indeed, the problem is so simple
(note, however, that it’s the same problem chosen by Koza to illustrate the
workings of his ADF system) that it can be solved using just one kind of
module, hence the advantage of a uni-ADF system to tackle this kind of
problem. But obviously more complex, real-world problems are hardly ever
solved using just one kind of module and multi-ADF systems are required
to solve them efficiently.
Have you yourself conducted any experiments to verify whether
multi-ADF systems would (or would not) outperform uni-ADF systems on more
'complex' problem spaces? Or would you be aware of any such experiments,
and could you point me towards these?
Yes, I solved different kinds of problems (odd-parity functions, breast
cancer, analog circuit design, irises, etc.) for the second edition of
my
book using multi-ADF systems, and they are quite promising,
especially for solving classification problems with more than two
classes in one go.
I really like how you encode ADFs in homeotic genes –
it’s really neat! However, I’m not sure how you apply the genetic
operators to homeotic genes.
Concerning the genetic operators on homeotic genes, it’s really simple.
One-point and two-point recombination are applied exactly as they are in
normal chromosomes, with the crossover points chosen anywhere in the
genome, as this does not interfere with syntax. As for the other operators
(mutation, IS and RIS transposition, gene recombination and gene
transposition), they are kept separate from their counterparts for
normal genes and are applied only to homeotic genes (notice that I chose
to control gene recombination in homeotic genes independently of gene
recombination in normal genes but obviously you could also choose to
have a general
gene recombination operator operating on the entire genome). Consider,
for instance, mutation. You have a mutation rate pm
that is applied only on normal genes and another mutation rate pmh
that is applied only on homeotic genes. I found this way of doing things
not only more efficient but also more practical as one can better
control and understand the effects of the different operators in
different regions of the genome.
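As an illustration, here is a minimal sketch of point mutation with the two independent rates (the flat genome layout and the symbol-drawing callback are hypothetical simplifications of a real implementation):

#include <cstdlib>
#include <functional>
#include <string>

// Minimal sketch: normal genes occupy positions [0, homeoticStart) of
// the chromosome and homeotic genes occupy [homeoticStart, size).
// Point mutation sweeps the whole chromosome but with two independent
// rates, pm for normal genes and pmh for homeotic genes. The caller
// supplies a callback that draws a legal symbol for a given position
// (head positions admit functions and terminals, tails terminals
// only, homeotic heads linking functions and gene addresses, etc.).
void pointMutation(std::string& chromosome, std::size_t homeoticStart,
                   double pm, double pmh,
                   const std::function<char(std::size_t)>& randomSymbolFor) {
    for (std::size_t i = 0; i < chromosome.size(); ++i) {
        double rate = (i < homeoticStart) ? pm : pmh;
        if (std::rand() / (RAND_MAX + 1.0) < rate)
            chromosome[i] = randomSymbolFor(i);
    }
}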
I have a question regarding your
paper on neural networks. I am aware of some work where many
authors have attempted to use GAs for optimizing neural network
architectures and parameters. Further, neural networks have shown
tremendous success in various engineering applications where the
functional relationship is not very well defined. In such situations,
with multiple input parameters, in what way will your present approach
of GEP-nets help in developing single-output neural networks? Could you
please let me know your views?
Concerning your question, I think the whole
paper is my answer to that question :). But making a long story
short, the GEP-nets algorithm allows not only the design of the neural
network architecture but also the fine-tuning of the weights and
thresholds. So this new algorithm proposes a different way of solving
problems using this new type of evolvable neural networks. That is, the
neural networks designed by the GEP-nets algorithm are by themselves
complete solutions to problems, not partial aspects of conventional
neural networks.
In these evolvable neural networks some weights or
thresholds would be modified when a mutation occurred. How does GEP
modify their values?
The GEP-nets algorithm allows the adaptation of both weights and
thresholds through the use of genetic operators. Most of them contribute
to their adaptation indirectly (for instance, point mutation,
transposition, recombination), but the one you mentioned – direct
mutation of weights and thresholds – can change the weights and
thresholds directly. I typically use a small rate of about 0.01 for this
operator as the work done by the other operators is more than sufficient
to fine-tune the weights and thresholds of the evolving neural networks.
So, to answer your question: for a population of 50
neural networks encoded in chromosomes with three genes, each gene with
a Dw domain and the respective array of, say, 10 weights, there are a
total of 50*3*10 = 1500 potential targets to change; a rate of 0.01 for
the “direct mutation of weights” operator (sorry about the lengthy name,
but I don’t want you to confound it with the mutation operator that
changes the elements in the chromosome) means that, in this case, a
total of 15 weights are randomly mutated per population per generation.
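And here is a minimal sketch of this operator (the weight interval [-2, 2] is only a placeholder for whatever range the random weights are drawn from):

#include <cstdlib>
#include <vector>

// Minimal sketch of the direct mutation of weights: every weight of
// every Dw array of every chromosome in the population is an
// independent mutation target. With the figures above (50 chromosomes
// x 3 genes x 10 weights = 1500 targets) and a rate of 0.01, about 15
// weights are replaced per generation.
void directWeightMutation(
        std::vector<std::vector<std::vector<double>>>& weights, // [chromosome][gene][slot]
        double rate, double lo = -2.0, double hi = 2.0) {
    for (auto& chromosome : weights)
        for (auto& gene : chromosome)
            for (double& w : gene)
                if (std::rand() / (RAND_MAX + 1.0) < rate)
                    w = lo + (hi - lo) * (std::rand() / (RAND_MAX + 1.0));
}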
I am wondering if your GEP system has implemented strong
typing or not, as in Strongly Typed GP. It looks like strong typing can
reduce the search space greatly.
If needed, strong typing can be easily implemented in GEP. However, most
of the time, if your encoding is good you won't need this. Consider, for
instance, the implementation of neural networks in GP as proposed by
John Koza in his first
GP book. With that kind of encoding you must obviously have
strong typing, with the rules for specifying syntax clearly defined. Or
you can get around this as David Montana suggests in his paper “Strongly
Typed Genetic Programming”. And now consider my
GEP Neural Networks, with their weight and threshold vectors
encoded in two different gene domains (Dw and Dt). So, as you can see,
there are other, more elegant and more functional, ways of dealing with
complex structures (not only neural networks but also decision trees and
polynomials for example) which do not require strong typing (see
chapters 7 and 9 of my
book to learn how to encode polynomials and decision trees with
numeric/mixed attributes in GEP).