Questions & Answers
from Personal Correspondence

These are answers to questions I've received over the years from people interested in Gene Expression Programming. It's been a long journey and I've made quite a few friends and acquaintances along the way. I've enjoyed every single email and I did my best to be helpful and encouraging to all of you. I hope I have succeeded. Thank you all for contacting me and for your encouragement! I believe these Q&As might be useful to others, hence my decision to post them here. I hope you'll be able to find something interesting or new in them. As always, for privacy's sake, I removed any references that could identify the people involved.


I very much like the thought of designing a language where every ordering of the most basic units is a valid program. This is the spirit of Karva, right?

I think this is fundamental. Of course, systems that produce invalid programs also work, but they are very inefficient. As I said in my Complex Systems paper, in nature there is no such thing as an invalid protein, because all full-fledged genotype/phenotype systems have syntactic closure.
 

Still about syntactic closure. I think it is OK for programs not to always work out. I have mixed feelings about the need for head and tail sections in GEP. If there are too many operators, say, then couldn't we just let the gene have a low fitness? (Not for GEP as described in your paper; there I think it is indeed nice.) Can you see what I mean about the difference between the structure of the language leading to invalid evaluation and, say, a particular gene not having a sufficient variable-to-operator ratio?

Of course you could remove the invalid programs from the population, but I'm afraid they would be the vast majority and lots of resources would be wasted on them. I don't think it is a good idea to have them around; if they must exist, better to exclude them so that they don't spread in the population. For example, whenever a program gives a calculation error for a particular fitness case (say, a division by zero), I simply exclude this program from the population (contrary to GP, I don't use protected divisions or protected square roots or logarithms to keep these problematic programs around in the population: they are unfit and, as such, their fitness is zero, which gives them zero chance of leaving offspring). My experience is that even carrying these programs around is not beneficial in evolutionary terms; and carrying structurally defective programs is a total waste of time and resources. So, if there is a way to always produce syntactically correct programs, we should use it.
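In sketch form, the idea is simply this (a minimal C++ illustration only, not code from my actual implementation; the program and reward callbacks stand for whatever evaluation and fitness measure one favours):

    #include <cmath>
    #include <functional>
    #include <utility>
    #include <vector>

    // Illustrative only: evaluate a candidate program on all fitness cases
    // and exclude it (zero fitness) if any case gives a calculation error.
    // 'program' evaluates one set of inputs to a raw double (no protected
    // operators); 'reward' is whatever fitness measure one favours.
    double evaluate_fitness(
        const std::function<double(const std::vector<double>&)>& program,
        const std::vector<std::pair<std::vector<double>, double>>& cases,
        const std::function<double(double, double)>& reward)
    {
        double fitness = 0.0;
        for (const auto& fc : cases) {
            double result = program(fc.first);   // raw division, raw sqrt, raw log...
            if (!std::isfinite(result))          // division by zero, log of a negative, ...
                return 0.0;                      // unviable: zero chance of leaving offspring
            fitness += reward(result, fc.second);
        }
        return fitness;
    }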
 

I would like to know the differences between GP and GEP if the mapping from genotypes (where GEP operates) to phenotypes (where GP operates) is injective. It might seem that, technically, they are the same thing by clear rules of transitivity.

It is true that in GEP one genotype corresponds to one phenotype and vice versa. Maybe in mathematics this would be the end of the story, but not in biology or in evolutionary computation. When the only thing you have is a phenotype, there are lots of places on the search landscape that are forever inaccessible. On the other hand, in genuine genotype-phenotype systems the genotype allows the unconstrained exploration of the solution space. What I've been saying about grafting and pruning clearly illustrates this: it is impossible to make an orange tree produce mangos only by grafting and pruning. And grafting and pruning is the best a GP system can do.
 

Using terms like Gene Expression Programming or Developmental Genetic Programming, I suppose the emphasis should always be on the mapping. The mapping is all I have considered. This mapping is analogous to transcription as we want to produce a structure (i.e. mRNA) that describes a program given a genome. The mapping that codes a unit of mRNA from parts of the genome is called the genetic code, if I am not mistaken.

In biology one usually represents the information flow from DNA to proteins by DNA -> RNA -> Protein. The first phase (DNA -> RNA) is what is called transcription and requires no genetic code, only simple complementarity rules between nucleotides. The genetic code is only used to express an mRNA into a protein, that is, in the second phase (RNA -> Protein). This second phase is called translation, and here we indeed have the real mapping from the genotype (mRNA is nothing more than a working copy of a gene) to the phenotype (protein).
 

This genetic code is one possible focus of gene expression. Keller and Banzhaf talked about the importance of this genetic code (and its relation to neutrality, if I recall correctly). When I read your work, I tried to see how your genetic code would improve the interpretation of genomes in a way that was resilient against the destructive powers of genetic operators.

In GEP the genetic code is much simpler than the one used in nature, and it consists of a one-to-one relationship between the symbols of the genome and the functions and variables they represent. So, in GEP, the genetic code has nothing to do with the “resiliency against the destructive powers of the genetic operators”. What is responsible for that is the unique structure of GEP genes, that is, a head and tail organization that not only guarantees syntactic closure but also provides for non-coding regions that allow the accumulation of neutral mutations. And unless you understand this, it would be very difficult to understand the differences between GEP and Banzhaf’s Developmental GP or Grammatical Evolution for example.
 

You seem to be saying that the random generation of the initial population is not guaranteed to produce viable programs. But your whole argument for the use of the GEP representation scheme is that you always get viable programs – I don’t understand!

You are confounding 'viable' with 'valid' (a program is viable when its fitness is greater than zero and unviable otherwise; and, by definition, viability is the fitness divided by the average fitness). In GEP, all the programs, including the ones generated randomly for the initial population, are structurally correct or valid. As for being viable or not, that is a property of both the fitness function (a program might be unviable because one is using a very harsh fitness measure) and its mathematical output in the particular environment one is using (for example, the evaluation of a perfectly valid program with a division term, say x/y, might result in an unviable program if one of the fitness cases has y = 0. And I say "might" because one could choose to keep these programs around, although I never do in my GEP implementations).
 

Where did the name Karva come from? Does it have another meaning?

It's short for my mother's name, Carvalho, which is also my first surname (my entire family name is "de Carvalho Ferreira"). I also chose to use "K" instead of "C" because I like to think of K-expressions as Kappa-expressions.
 

I am trying to understand the difference between K-expressions and Karva. Is Karva a notation or a language such as C++ or Java? If Karva is a language like C++ or Java, do I need a compiler?

Karva is just a notation used to represent the linear genomes of computer programs. And a K-expression is the coding part (or ORF, which stands for Open Reading Frame) of a gene in these linear chromosomes.
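To make this concrete, here is a minimal C++ sketch (illustrative only) of how the length of the ORF can be determined by scanning a Karva string from left to right:

    #include <string>

    // Illustrative sketch: compute the length of the K-expression (ORF) of
    // a Karva gene, assuming the only functions are the binary +, -, * and /
    // and everything else is a terminal.
    int orf_length(const std::string& gene) {
        int open = 1;                        // one slot open for the root
        int length = 0;
        while (open > 0) {
            char c = gene[length++];
            --open;                          // this symbol fills one open slot
            if (c == '+' || c == '-' || c == '*' || c == '/')
                open += 2;                   // a binary function opens two new slots
        }
        return length;                       // symbols beyond this point are non-coding
    }

For example, for the 13-symbol gene +a/*-+aabacaa (dissected in the tree-evaluation answer further down), orf_length returns 11: the last two symbols are non-coding.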
 

Many GEP genetic operators appear to be what GP people usually call "highly disruptive." I would say that those operators are even more disruptive than the usual GP operators (subtree swapping or so). So, this may be good for search space exploration, but not for search space exploitation. Did you ever consider this in an analytical manner, or note such effects? If GEP really evolves, where is the "backstore" of the genes (as e.g. compared to building blocks in GA)?

It is true that most of the GEP operators (e.g., mutation, IS and RIS transposition, inversion, one- and two-point recombination) are more disruptive than the tree-level recombination of GP, for these operators act at the chromosome level and their effects on the trees (the fully developed entities or phenotype) can span a wide spectrum of modifications: from very profound changes to neutral changes. This interplay between genotype and phenotype is what makes GEP much more powerful than GP, because tree-level operators cannot drastically modify an entity (every gardener knows that it is impossible to make an orange tree produce bananas only by grafting and pruning). So, I don't agree with you when you say that 'disruptive' operators are not good for search space exploitation. This idea has no support whatsoever in evolution: mutation is considered by many the most important operator and, furthermore, it is known that not even mutation is capable of creating all the wonders of nature; other, more radical, mechanisms are essential (consider for instance the origin of eukaryotes by symbiosis).

The work I've been doing on GEP operators shows that mutation (and I am talking here about point mutation at the chromosome level, not the tree-level mutation of GP) is by far the most important operator, followed by RIS transposition (the most 'disruptive' GEP operator, as it changes the root itself), IS transposition, inversion, two-point recombination, one-point recombination and, finally, gene recombination, which is the most conservative operator. It is also worth pointing out that GEP gene recombination is very close, in effect, to GP tree-level recombination, as in this case a special kind of building block (genes that encode a particular sub-expression tree) is exchanged. Of course, in GEP, entire genes are not the only kind of building block that is exchanged, but at least these are very easily traced.
 

Due to the alternative representation of the trees used in GEP, it becomes easier to define new genetic operators. However, it would be hard to define subtree swapping (the usual crossover of GP) based on exactly this representation. The obvious approach is to merge both approaches by changing the tree representation. At least, this may reduce the disruptive effect of the usual GEP operators in later phases of the search.

What's the use of implementing a GP-like crossover in GEP besides a huge headache? It's not as if the efficiency of the tree-level crossover of GP were commendable in evolutionary terms. Notwithstanding, in GEP, when you use gene recombination together with gene transposition as the only sources of genetic modification, you have basically a GP subtree crossover, as in this case only subtrees of different sizes and shapes are exchanged. And, compared to mutation (GEP mutation), they are far less efficient at driving evolution further.
 

If I understand correctly, GEP crossover is done at a certain point in the linear representation of the individual regardless of its corresponding position in its tree version (indeed, it could even fall outside it) and basically exchanges the "bottom part" of the programs. The children are, intuitively, very unlikely to inherit their parents' characteristics. Again intuitively, wouldn't that be counterproductive? It is a good way of maintaining diversity and exploring new regions of the genospace, but it doesn't seem to be too good at advancing a particular "branch" of exploration.

Evolutionarily speaking, this is not counterproductive at all: if the genetic operators available to the system are unable to disrupt the genomes thoroughly once in a while, then the system will adapt poorly. Mathematically, the disruption of building blocks might seem counterproductive, but for evolutionary systems it is fundamental. The best way to see this is through experimentation: if you still don't have a functional GEP implementation, you can in the meantime observe this with the free Demo of GeneXproTools by selecting different genetic operators and different combinations and rates of the basic GEP operators.

But returning to the wrongly assumed “disruptiveness” of GEP crossover. First of all, there are three different types of GEP crossover (one-point, two-point, and gene recombination) and they behave differently. For one-point and two-point recombination, the disruptive tendencies (splitting of building blocks) coexist side by side with their more conservative tendencies (swapping of genes and entire ORFs), making these operators quite well balanced. Gene recombination, though, is much more conservative (and less disruptive, of course) as it exchanges only entire genes, which, as you know, encode mathematically concise building blocks (you can think of them as the equivalent of GP subtrees and by using gene recombination together with gene transposition you can easily simulate GP crossover in GEP).
 

Your "fixed length" representation brings up an interesting question, which tickles GP people for a long time: the validity of the schemata theorem. Did you ever think about this? I have a little bit the impression that your representation may make it easier to identify schemata, and model the ways they are either disrupted or preserved. But let me know if I am wrong!

I think you are right. I also think that the existence of building blocks is more easily perceived in GEP. The possibility of dealing with genes (building blocks) separately is a great advantage. I believe that there are other, smaller building blocks circulating in the genetic pool, but their identification is not so easy. Perhaps it would be easier to identify them using the linear representation, but there is the problem of the phenotype. At least in biology, domains in proteins (building blocks) are trickier to identify, whereas genes are easy to identify, either directly in the genome or as products of expression. So, theoreticians could start by studying the easily manipulated GEP genes. In fact, in my Complex Systems paper I demonstrate (although not formally) the validity of the schemata theorem for GEP at least: multigenic systems are much more effective than unigenic ones. Furthermore, the genetic operators of GEP are ideal for analyzing the circulation of genes, and we can choose to have the genes disrupted or not as necessary.
 

How did you arrive at a mutation rate of 0.044? I think I noticed you using that value more than once, not only in your papers but also in your book.

In the early days of GEP implementation I experimented with different mutation rates and analyzed their evolutionary dynamics. At the time I used to use a mutation rate equivalent to two one-point mutations per chromosome. But then lots of people started to say that I was tinkering with the parameters, since a mutation rate thus determined, although biologically more meaningful, changes with chromosome length. So I realized that, for the average chromosome length I was using at the time, a rate of 0.044 is around the same value obtained for two one-point mutations per chromosome (the equivalence is just 2/L: for chromosomes of about 45 symbols, 2/45 ≈ 0.044), and I started to use it as the default. But I could have very well chosen another value, say 0.05, 0.047 or 0.055. As anyone can easily find out through experimentation (try for example the free Demo of GeneXproTools), small mutation rates keep your system evolving: if you use a very small rate, you might need a little more time to get to a good solution to your problem, but you'll get there nonetheless. More formally, the evolutionary dynamics of GEP populations are documented in the last chapter of my book (the first edition is freely available online) and also in several of my papers, especially the paper on Evolutionary Dynamics.
 

I have some questions about crossover in GEP. From what I have seen, the "building blocks" theory has been very contested in the GAs world, and though I have (partly) read Langdon & Poli's "Foundations of Genetic Programming", it seems that its applicability to GP is not very well known and difficult to establish.

Concerning crossover in GEP, I discuss its power and its role both in my book (see especially the last chapter, which is available online for the first edition) and in quite a few papers.

I don't know if you are familiar with them, but there you will also find some thoughts on the role of crossover in evolution and its implications for artificial evolutionary systems, including GP.
 

From a purely intuitive point of view, I would say that one could equate building blocks to subtrees. At least, seeing a GP individual as a program one could say that the subtrees represent "subroutines", therefore by combining "good (or useful) subroutines" one should get "good individuals".

This is, in principle, correct, and I believe that the exchange of building blocks is essential for evolution, but one should never forget that it is neither the only driving force of evolution nor the most important one. Indeed, without mutation, evolutionary systems evolve poorly for, with time, populations undergoing only crossover will lose all genetic diversity.
 

I would like to know your opinion on how GEP compares to GP. Though a GP-GEP comparison is not technically part of our thesis, our usage of GEP must be discussed, and I am also quite interested in the subject personally.

As evolutionary systems, GEP and GP could not be more different, so their comparison is always complicated. Theoretically, however, the differences between the two systems can be clearly delineated: GP is a simple replicator system without an autonomous genotype, whereas GEP is a genotype/phenotype system with a perfect mapping. This means that GEP and GP occupy different levels in the evolutionary ladder and, because of this, there are quite a few restrictions on GP, among them the set of genetic operators available and, consequently, how well it adapts and evolves. As you know, GP relies basically on a tree-specific crossover (and for GP it seems to be the best operator, as Koza does not use mutation or permutation even once in his first GP book), but this does not mean that crossover (much less tree-specific crossover) is the most effective genetic operator for all evolutionary systems, both artificial and natural.
 

When you apply the genetic operators, is their order important?

The order in which you apply the genetic operators is not important in the long run: each one dictates a certain path, of course, but any path is as good as any other as long as you keep moving towards the top.
 

Which language may be best suited for Gene Expression Programming?

I implemented Gene Expression Programming in C++, but any other language can be used to implement it. My advice is to choose the language you are most comfortable with.
 

Do you have any comments in using LISP to implement GEP? Is there any preference of using one language over the others in relation to GEP?

You can use any language to implement GEP. If you are most comfortable with LISP, then you should use it. My own GEP implementations are in C++, but I also ported some of them to Visual Basic and Java just for fun.
 

By reading through your papers and book I also wondered why you decided to impose a fixed length on GEP chromosomes. If I understand correctly, one only has to make sure that the tail grows together with the head to ensure that syntactically correct programs are produced. Allowing for chromosome growth would allow GEP to adjust itself to the complexity of the problem.

It makes things much simpler. You could do what you are saying (allow a gene to change size) or, considerably simpler, allow a chromosome to change size by allowing the number of genes to change. But it seems to me that the gains (if any) of such a system would never compensate for the added complexity. And there are other, much more interesting, ways of making the system plastic. One is of course the very structure of GEP genes, which allows the encoding of expression trees of different sizes and shapes and, in my view, more than makes up for the fixed-length chromosomes. Another is the cellular and multicellular systems that allow code reuse and automatic linking or, in other words, the evolution of automatically defined functions.
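To make the head/tail scheme concrete, here is a minimal C++ sketch (illustrative only, not my actual implementation) of how a random, always-valid gene can be created; the tail length follows the structural rule t = h(nmax - 1) + 1:

    #include <random>
    #include <string>

    // Illustrative sketch: create a random but always-valid GEP gene.
    // The head (length h) may contain functions and terminals; the tail,
    // with length t = h*(nmax - 1) + 1 (nmax = maximum function arity),
    // contains terminals only. This guarantees syntactic closure.
    std::string random_gene(int h,
                            const std::string& functions,   // e.g. "+-*/"
                            const std::string& terminals,   // e.g. "abc"
                            int nmax,                       // e.g. 2
                            std::mt19937& rng) {
        const std::string head_alphabet = functions + terminals;
        std::uniform_int_distribution<size_t> pick_head(0, head_alphabet.size() - 1);
        std::uniform_int_distribution<size_t> pick_tail(0, terminals.size() - 1);
        std::string gene;
        for (int i = 0; i < h; ++i)
            gene += head_alphabet[pick_head(rng)];          // head: functions or terminals
        const int t = h * (nmax - 1) + 1;
        for (int i = 0; i < t; ++i)
            gene += terminals[pick_tail(rng)];              // tail: terminals only
        return gene;
    }

With h = 6, functions "+-*/" and terminals "abc", this gives genes of length 13, such as the one dissected in the tree-evaluation answer below.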
 

You state the advantages of a genotype/phenotype mapping without using a repair mechanism; GEP would not need such a mechanism. You also refer to the natural system to bolster the claim of advantages. However, it is well known that there are repair (edit) mechanisms in the natural system (e.g. splicing), so it is simply not the case that the DNA-to-protein pathway contains no such mechanisms. Also, I wonder what the advantage would be of not having repair, if only the result is evaluated.

I cannot agree with you here. In nature, the phenotype (proteins) is always valid and as such is never repaired. I mean, after translation you always end up with an amino acid chain, and this amino acid chain is always a valid protein structure. Splicing is an altogether different matter: it occurs at the RNA level and its function has nothing to do with the repair of invalid structures. Indeed, if we were to clone a gene with introns into a bacterium, these introns would not be removed and the entire sequence of the gene would be translated into a protein. Obviously this protein would be very different from the original, but it would nonetheless be a valid protein structure. So, the advantage of not having repair (and I mean repair of invalid structures, not the equivalent of the biological editing) is that nothing is wasted and that all the modifications made in the genome by the genetic operators always produce something new and valid that can be put to good use.
 

To evaluate a gene or K-expression in GEP, we have to build the tree of the expression. In my GEP implementation, I do this in memory and it takes some time. In other approaches, e.g. linear tree encoding, it is possible to evaluate an expression recursively without actually building the tree. With GEP, I cannot figure out whether there is an algorithm to evaluate an expression *without* building the tree, just working on the string. It would be great if you could show me a sketch (just the sketch, I am not asking for the code) of the procedure you use to evaluate an individual.

Maybe there is a more elegant way of evaluating GEP strings using recursion, but I actually build the tree to evaluate them. I give all the details in my Complex Systems paper and in my book, which are both available online. Nonetheless, I'll try to give you a brief sketch of how I do it:

For instance, the gene with a head length of 6:

    +a/*-+aabacaa

gives the following tree:

       +
      / \
     /   \
    /     \
   a       /
          / \
         /   \
        /     \
       *       -
      / \     / \
     +   a   a   b
    / \
   a   c


I use two-dimensional arrays to build such trees:

    +
    a/
    *-
    +aab
    ac


Then these trees (and a shadow tree where the actual evaluations are kept) are used to make the evaluations. To evaluate the trees you start from the bottom, beginning the search for functions (from left to right) on the line before last (line 3 in this case; the line with the root is line 0). Then, each time a function is found on line n, you retrieve from line n+1 the corresponding number of arguments (their values are already in the shadow tree) and do the operation, and so forth until no more functions are found on the current line. Then you do the same for the lines above, until you reach the root. Obviously you must keep track of the functions and terminals you process on each line.

This is, in my view, very algorithmic and you should have no problem implementing it.
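For those who prefer code to prose, here is a minimal C++ sketch of the same level-by-level idea (binary functions only, terminal values supplied in a map); it is illustrative only, not my actual implementation:

    #include <map>
    #include <string>
    #include <vector>

    static bool is_function(char c) { return c=='+' || c=='-' || c=='*' || c=='/'; }
    static double apply(char f, double x, double y) {
        switch (f) {
            case '+': return x + y;
            case '-': return x - y;
            case '*': return x * y;
            default:  return x / y;   // '/': the caller must handle y == 0
        }
    }

    double evaluate(const std::string& gene, const std::map<char, double>& vars) {
        // 1. Split the ORF into levels: level 0 holds the root; each next
        //    level holds as many symbols as the arguments required above it.
        std::vector<std::string> levels;
        size_t pos = 0;
        int width = 1;
        while (width > 0) {
            levels.push_back(gene.substr(pos, width));
            pos += width;
            int next = 0;
            for (char c : levels.back())
                if (is_function(c)) next += 2;  // binary functions only
            width = next;                        // non-coding symbols are never reached
        }
        // 2. Evaluate from the bottom line up (the "shadow tree" of values).
        std::vector<double> below;               // values of the line just below
        for (int n = (int)levels.size() - 1; n >= 0; --n) {
            std::vector<double> current;
            size_t take = 0;                     // next unused value from 'below'
            for (char c : levels[n]) {
                if (is_function(c)) {
                    double x = below[take++];
                    double y = below[take++];
                    current.push_back(apply(c, x, y));
                } else {
                    current.push_back(vars.at(c));
                }
            }
            below = current;
        }
        return below[0];                         // the value at the root
    }

For the gene above, with, say, a = 2, b = 3 and c = 5, this returns 2 + ((2+5)*2)/(2-3) = -12.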
 

I think your work is great and I'm fascinated by it, but I couldn't help but notice that there are quite a few people accusing you of the sin of not citing work that anticipated yours. Most name Conor Ryan as one such forerunner. Could you please inform me about the citation accusation?

I don't think these accusations are fair. I cited all the work that was relevant to my work and that influenced my thinking. I also cited all the sources that I read at the time, the ones that I thought might have a connection to my work. In fact, I read most of these sources after having created GEP, just to be sure that nothing similar already existed. Notwithstanding, I also cited these sources in my paper even though they had not influenced my work in any way.

So, I did the best I could with the resources that I had at the time and then I submitted my paper to scientific journals. And it was rejected, with reviews that still leave me gasping. And the most interesting thing was that, although the process was theoretically confidential, word got out and Koza was straightaway warned and had no problem telling me (as editor of my book at Kluwer) that someone had told him that my work was not what I was claiming it to be. So, fearing the worst, I didn't try another journal (that was already my second try) and instead published the paper online, where I even included the works suggested by the reviewers who so unceremoniously rejected it and also took into consideration most of their remarks to make an even better paper. And of course, they didn't like it at all when I dropped this on them. Fortunately, there are also good people in this field and not everything and everyone is controlled by Koza, and immediately after publishing my paper online I was invited by two scientific journals to publish it with them. And at Complex Systems, my paper went through the normal reviewing process and with their input I created an even better paper (both papers are available online if you want to compare them) with yet more references, also suggested by the reviewers (there are 14 references in the first online version and 18 in the Complex Systems paper).

But anyway, when I announced my paper online (the first version), I was immediately attacked by several people who said that it was a good thing that it had been rejected because of its many sins, including the cardinal sin of not citing such and such (Ryan among them). That was the first time I'd heard of Ryan and his work, and since he also joined the protesters (together with Michael O'Neill) with a rather tame bashing, I took a look at his work and, to this day, I don't see why anybody thinks that I should have cited his work in my paper (the genotype/phenotype mapping in Grammatical Evolution is not even perfect, for crying out loud!). Even his later papers have nothing to do with GEP, although he has tried to make some kind of a posteriori connection. There are people and papers I wish I had known about at the time I created GEP to make an even better paper (Franz Oppacher and Tim Perkis among them), but Ryan is not one of them.
 

This whole controversy around GEP and GE is really regrettable because I think your research on GEP and the research on GE are both extremely good. From a broad perspective the differences are not large, in my opinion. In detail, the two systems differ greatly in approach while sharing the principle of genotype to phenotype mapping, a great advance for the EC field.

I'm afraid I cannot agree with you when you say that the differences between GEP and GE are not large (even from a broad perspective). And this is so because the genotype-phenotype mapping in GE is not perfect (just as in Banzhaf's DGP, by the way) and therefore not all the programs created during the adaptation process are valid. Citing from the seminal paper on Grammatical Evolution:

  “It is possible for individuals to run out of genes, and in this case there are two alternatives. The first is to declare the individual invalid and punish them with a suitably harsh fitness value; the alternative is to wrap the individual, and reuse the genes.”  

And systems like these, when compared to systems with perfect genotype-phenotype mappings such as GEP, are a waste of time and resources because they evolve poorly and have none of the plasticity of a genuine genotype-phenotype system.
 

You're right! I underestimated the possibility under GE that individuals would be invalid and the fix would be to wrap them. I just looked again at their 1999 paper, "Under the hood of grammatical evolution", and it is clear that invalid individuals are numerous early on in a run and that a large fraction of the population may sometimes be wrapped throughout a run. Against this, GEP does indeed guarantee that every individual is valid (you don't prove it in your early papers, but I believe it :) ). The validity difference could be highly significant in terms of results and power. You must know at this point. They don't seem to have results showing GE is orders of magnitude better than GP but you say you do. To me, wrapping looks like a fix that could undermine search (even though they gamely cite some biological precedent).

It's good to know that you are seeing more clearly the differences between GEP and GE, for this is the essence of a truly functional genotype-phenotype system. Indeed, all the GEP algorithms I implemented (GEP, GEP with random numerical constants, ADFs, Evolvable Neural Networks, Decision Tree Induction, Polynomial Induction, Parameter Optimization, and so on) always produce valid programs. And for that, you just have to follow the simple structural rules for evaluating the length of the tail (t = h(nmax - 1) + 1, where h is the head length and nmax the maximum number of arguments in the function set) and for the additional domains used when random numerical constants are involved (this is, in my view, proof enough, but obviously more mathematically inclined people might still need to prove this more formally :) ).
 

I empathize with your skepticism about ever being welcomed to the EC community, but I am absolutely sure GEP will be (1) let in, (2) taken seriously, then ultimately (3) accepted with at least the same degree of warmth that, say, Goldberg has for Koza's GP. The EC field has seen several cold-shoulderings and yours is not by any means the first and certainly not the most bitter, though you may today feel that way. Meanwhile, it sounds like the acceptance of GEP via your company's software is going great!

I really hope your predictions about GEP will come true sooner rather than later although I don’t think I should feel reassured by what you call “the same degree of warmth that, say, Goldberg has for Koza's GP” :) And yes, my company’s software has been a lifesaver and a source of great fulfillment and the right outlet to show exactly what can be achieved with GEP. Everybody can download the free Demo of GeneXproTools and play with GEP – it’s great fun!
 

What kind of software packages are available based on Gene Expression Programming (any demo version)?

Gepsoft GeneXproTools implements several GEP algorithms and supports symbolic regression (function finding), classification (logistic regression), logic synthesis, and time series prediction. You can download the Demo at the Gepsoft website. The Demo comes with several sample runs for which the software is totally functional, including seeing the code of the models it designs in 16 different programming languages. It should give you a feel for the performance and workings of GEP as you can change a wide array of settings, including chromosome and gene structures, linking functions, genetic operators, modification rates, fitness function, parsimony pressure, function set, random numerical constants, and so on.
 

I must say I'm very surprised by the fact that the uni-ADF systems consistently outperform the multi-ADF systems (in your paper on Automatically Defined Functions). It seems counterintuitive. I have a suspicion it might have to do with the (simple) problem space used for the experiments – but I obviously have nothing to back up this suspicion.

You are right in your suspicions. Indeed, the problem is so simple (note, however, that it's the same problem chosen by Koza to illustrate the workings of his ADF system) that it can be solved using just one kind of module, hence the advantage of a uni-ADF system for tackling this kind of problem. But obviously more complex, real-world problems are hardly ever solved using just one kind of module, and multi-ADF systems are required to solve them efficiently.
 

Have you yourself conducted any experiments to verify whether multi-ADF systems would (or would not) outperform uni-ADF systems on more 'complex' problem spaces? Or are you aware of any such experiments and could point me towards them?

Yes, I solved different kinds of problems (odd-parity functions, breast cancer, analog circuit design, irises, etc.) for the second edition of my book using multi-ADF systems, and the results are quite promising, especially for solving classification problems with more than two classes in one go.
 

I really like how you encode ADFs in homeotic genes – it's really neat! However, I'm not sure how you apply the genetic operators to homeotic genes.

Concerning the genetic operators on homeotic genes, it's really simple. One-point and two-point recombination are applied exactly as they are in normal chromosomes, with the crossover points chosen throughout the genome, as this does not interfere with syntax. As for the other operators (mutation, IS and RIS transposition, gene recombination and gene transposition), they are kept separate from their counterparts for normal genes and are applied only to homeotic genes (notice that I chose to control gene recombination in homeotic genes independently of gene recombination in normal genes, but obviously you could also choose to have a general gene recombination operator operating on the entire genome). Consider, for instance, mutation: you have a mutation rate pm that is applied only to normal genes and another mutation rate pmh that is applied only to homeotic genes. I found this way of doing things not only more efficient but also more practical, as one can better control and understand the effects of the different operators in different regions of the genome.
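In sketch form, the separate rates might look like this (illustrative only; the position-aware replacement rules – head vs. tail symbols and, for homeotic genes, ADF references – are abstracted into the two callbacks):

    #include <functional>
    #include <random>
    #include <string>
    #include <vector>

    // Illustrative sketch: point mutation applied with separate rates to
    // normal genes (pm) and homeotic genes (pmh). The callbacks replace the
    // symbol at the given position according to its domain's rules.
    void mutate_chromosome(std::vector<std::string>& normal_genes,
                           std::vector<std::string>& homeotic_genes,
                           double pm, double pmh,
                           const std::function<void(std::string&, size_t)>& mutate_normal,
                           const std::function<void(std::string&, size_t)>& mutate_homeotic,
                           std::mt19937& rng) {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        for (std::string& gene : normal_genes)          // normal genes: rate pm
            for (size_t i = 0; i < gene.size(); ++i)
                if (coin(rng) < pm) mutate_normal(gene, i);
        for (std::string& gene : homeotic_genes)        // homeotic genes: rate pmh
            for (size_t i = 0; i < gene.size(); ++i)
                if (coin(rng) < pmh) mutate_homeotic(gene, i);
    }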
 

I have a question regarding your paper on neural networks. I am aware of some work where authors have attempted to use GAs for optimizing neural network architectures and parameters. Further, neural networks have shown tremendous success in various engineering applications where the functional relationship is not very well defined. In such situations, with multiple input parameters, in what way will your GEP-nets approach help in developing single-output neural networks? Could you please let me know your views?

Concerning your question, I think the whole paper is my answer to it :). But making a long story short, the GEP-nets algorithm allows not only the design of the neural network architecture but also the fine-tuning of the weights and thresholds. So this new algorithm proposes a different way of solving problems using this new type of evolvable neural network. That is, the neural networks designed by the GEP-nets algorithm are by themselves complete solutions to problems, not partial aspects of conventional neural networks.
 

In these evolvable neural networks some weights or thresholds would be modified when a mutation occurred. How does GEP modify their values?

The GEP-nets algorithm allows the adaptation of both weights and thresholds through the use of genetic operators. Most of them contribute to their adaptation indirectly (for instance, point mutation, transposition, recombination), but the one you mentioned – direct mutation of weights and thresholds – can change the weights and thresholds directly. I typically use a small rate of about 0.01 for this operator, as the work done by the other operators is more than sufficient to fine-tune the weights and thresholds of the evolving neural networks. So, answering your question: for a population of 50 neural networks encoded in chromosomes with three genes, each gene with a Dw domain and a respective array of, say, 10 weights, there are a total of 50*3*10 = 1500 potential targets to change; a rate of 0.01 for the "direct mutation of weights" operator (sorry about the lengthy name, but I don't want you to confound it with the mutation operator that changes the elements in the chromosome) means that, in this case, about 15 weights are randomly mutated per population per generation.
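A minimal sketch of this operator for one individual (illustrative only, and assuming the mutated weight is simply replaced by a fresh random value drawn from the weights interval):

    #include <random>
    #include <vector>

    // Illustrative sketch: the "direct mutation of weights" operator.
    // Each weight in each gene's weight array is replaced, with a small
    // probability, by a new random value from the weights interval.
    void mutate_weights_directly(std::vector<std::vector<double>>& weights, // [gene][slot]
                                 double rate,                               // e.g. 0.01
                                 double lo, double hi,                      // weights interval
                                 std::mt19937& rng) {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        std::uniform_real_distribution<double> fresh(lo, hi);
        for (auto& gene_weights : weights)
            for (double& w : gene_weights)
                if (coin(rng) < rate)
                    w = fresh(rng);   // expected hits per generation: rate * total weights
    }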
 

I am wondering if your GEP system implements strong typing, as in Strongly Typed GP. It looks like strong typing can reduce the search space greatly?

If needed, strong typing can easily be implemented in GEP. However, most of the time, if your encoding is good, you won't need it. Consider, for instance, the implementation of neural networks in GP as proposed by John Koza in his first GP book. With that kind of encoding you must obviously have strong typing, with the rules for specifying syntax clearly defined. Or you can get around this as David Montana suggests in his paper "Strongly Typed Genetic Programming". And now consider my GEP neural networks, with their weights and thresholds vectors encoded in two different gene domains (Dw and Dt). So, as you can see, there are other, more elegant and more functional ways of dealing with complex structures (not only neural networks but also decision trees and polynomials, for example) which do not require strong typing (see chapters 7 and 9 of my book to learn how to encode polynomials and decision trees with numeric/mixed attributes in GEP).
 

***


Last update: 23/July/2013
 
© Candida Ferreira
All rights reserved.