## Genetic algorithms – alternative to building block hypothesis

Genetic algorithms and especially their subset, Genetic programming were always fascinating me. My interest was fueled by on and off work with Global optimization, and because GA just plain cool. One of the most interesting thing about GA is that they work quite good on some “practical” problems, while there is no comprehensive theoretical explanation why they should work so well (Of cause they are not always so useful. There was a work on generating feature descriptors with GA, and results were less then impressive).

Historically, first and most well known explanation for GA efficiency was the building block hypothesis. Building block hypothesis is very intuitive. It say that there are exist “building blocks” – small parts of genome with high fitness. GA work is randomly searching for those building blocks and combining them afterward, until global optimum is found. Searching is mostly done with mutation, and combining found building block with crossover (analog of exchange of genetic material in real biological reproduction).

However building blocks have a big problem, and that problem is crossover operator. If building block hypothesis is true, GA work better if integrity of building blocks preserved as much as possible. That is there should be only few “cut and splice” points in the sequence. But practically GA with “uniform” crossover – massive uniform mixing of two genomes, work better then GA with few crossover points.

Recently a new theory of GA efficiency appears, that try to deal with uniform crossover problem – Generative fixation” hypothesis. The idea behind “generative fixation” is that GA works in continuous manner, fixing stable groups of genes with high fitness and continuing search on the rest of genome, reducing search space step by step. From optimization point of view GA in that case works in manner similar to Conjugate gradient method, reducing (or trying) dimensionality of search space in each step. Now about “uniform crossover” – why it works better: subspace, to which search space reduced, should be stable (in stability theory sense). Small permutations wouldn’t case solution to diverge. With uniform crossover of two close solutions resulting solution still will be nearby attractive subspace. The positive effect of uniform crossover is that it randomize solution, but without exiting already found subspace. That randomization clearing out useless “stuck” genes (also called “hitchhikers”), and help to escape local minima.

Interesting question is, what if subspace is not “fixed bits” and even not linear – that is if it’s a manifold. In that case (if hypothesis true) found genes will not be “fixed”, but will “drift” in systematic manners, according to projection of manifold on the semi-fixed bits.

Now to efficiency GA for “practical” task. If the “generative fixation” theory is correct, “practical” task, for which GA work well could be the problems for which dimensionality reduction is natural, for example if solution belong to low-dimensional attractive manifold. (addenum 7/11)That mean GA shouldn’t work well for problem which allow only combinatorial search. Form this follow that if GA work for compressed sensing problem it should comply with Donoho-Tanner Phase Transition diagram.

Overall I like this new hypothesis, because it bring GA back to family of mathematically natural optimization algorithms. That doesn’t mean the hypothesis is true of cause. Hope there will be some interest, more work, testing and analysis. What is clear that is current building block hypothesis is not unquestionable.

PS 7/11:

Simple googling produced paper by Beyer An Alternative Explanation for the Manner in which Genetic Algorithms Operate with quite similar explanation how uniform crossover works.