Mirror Image

Mostly AR and Stuff

Genetic algorithms – an alternative to the building block hypothesis

Genetic algorithms, and especially their subset, genetic programming, have always fascinated me. My interest was fueled by on-and-off work on global optimization, and because GAs are just plain cool. One of the most interesting things about GAs is that they work quite well on some “practical” problems, while there is no comprehensive theoretical explanation of why they should work so well. (Of course they are not always so useful. There was some work on generating feature descriptors with GAs, and the results were less than impressive.)
Historically, the first and most well-known explanation of GA efficiency was the building block hypothesis. The building block hypothesis is very intuitive. It says that there exist “building blocks” – small parts of the genome with high fitness. A GA works by randomly searching for those building blocks and combining them afterward, until the global optimum is found. The searching is mostly done with mutation, and the found building blocks are combined with crossover (the analog of the exchange of genetic material in real biological reproduction).
However, the building block hypothesis has a big problem, and that problem is the crossover operator. If the building block hypothesis is true, a GA should work better if the integrity of the building blocks is preserved as much as possible; that is, there should be only a few “cut and splice” points in the sequence. But in practice a GA with “uniform” crossover – massive uniform mixing of two genomes – works better than a GA with few crossover points.
Recently a new theory of GA efficiency has appeared that tries to deal with the uniform crossover problem – the “generative fixation” hypothesis. The idea behind “generative fixation” is that a GA works in a continuous manner, fixing stable groups of genes with high fitness and continuing the search on the rest of the genome, reducing the search space step by step. From an optimization point of view, the GA in that case works in a manner similar to the conjugate gradient method, reducing (or trying to reduce) the dimensionality of the search space at each step. Now, about why uniform crossover works better: the subspace to which the search space is reduced should be stable (in the stability theory sense), so small perturbations wouldn’t cause the solution to diverge. With uniform crossover of two close solutions, the resulting solution will still be near the attractive subspace. The positive effect of uniform crossover is that it randomizes the solution, but without exiting the already-found subspace. That randomization clears out useless “stuck” genes (also called “hitchhikers”) and helps to escape local minima.
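To make the two operators concrete, here is a toy sketch (my illustration, not from any of the works mentioned) of a plain generational GA on a bit-counting OneMax problem, with one-point and uniform crossover side by side. All parameter values are arbitrary choices, and OneMax is separable, so this shows the mechanics of the operators rather than a test of the hypothesis:

```python
import random

GENOME, POP, GENS = 64, 100, 200

def fitness(g):
    return sum(g)                          # OneMax: count the 1-bits

def one_point(a, b):
    cut = random.randrange(1, len(a))      # one cut: long blocks survive
    return a[:cut] + b[cut:]

def uniform(a, b):
    return [random.choice(p) for p in zip(a, b)]  # massive per-gene mixing

def run_ga(crossover):
    pop = [[random.randint(0, 1) for _ in range(GENOME)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:POP // 2]           # truncation selection
        children = []
        while len(children) < POP:
            child = crossover(random.choice(parents), random.choice(parents))
            if random.random() < 0.05:     # occasional point mutation
                i = random.randrange(GENOME)
                child[i] = 1 - child[i]
            children.append(child)
        pop = children
    return max(map(fitness, pop))

random.seed(1)
print("one-point crossover:", run_ga(one_point))
print("uniform crossover:  ", run_ga(uniform))
```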
An interesting question is: what if the subspace is not “fixed bits” and not even linear – that is, what if it’s a manifold? In that case (if the hypothesis is true) the found genes will not be “fixed”, but will “drift” in a systematic manner, according to the projection of the manifold onto the semi-fixed bits.
Now to the efficiency of GAs on “practical” tasks. If the “generative fixation” theory is correct, the “practical” tasks for which GAs work well could be the problems for which dimensionality reduction is natural, for example if the solution belongs to a low-dimensional attractive manifold. (Addendum 7/11) That means GAs shouldn’t work well for problems which allow only combinatorial search. From this it follows that if a GA works for a compressed sensing problem, it should comply with the Donoho–Tanner phase transition diagram.
Overall I like this new hypothesis, because it brings GAs back into the family of mathematically natural optimization algorithms. That doesn’t mean the hypothesis is true, of course. I hope there will be some interest, more work, testing and analysis. What is clear is that the current building block hypothesis is not unquestionable.

PS 7/11:
Simple googling produced a paper by Beyer, An Alternative Explanation for the Manner in which Genetic Algorithms Operate, with quite a similar explanation of how uniform crossover works.

6, November, 2010 | sci | 1 Comment

Recursive Interferometry – Phase Congruency?

Thanks to Igor Carron I’ve watched a great video lecture by Stephane Mallat, High dimensional classification by recursive interferometry. Actually I watched it twice, and I think I understand most of it now))). It was not about compressed sensing, and not even much about manifold learning. It was mostly about a new application of wavelets: how to use wavelets to produce low-dimensional data (an image descriptor, if we are talking about computer vision) from high-dimensional data (that is, an image). The idea is to go beyond linear representations and use a nonlinear operation – the absolute value of the wavelet coefficients. The absolute value – the square root of the squared wavelet coefficients – carries information about frequency differences: its inverse Fourier transform has new harmonics, the differences of the frequencies of the original function. That is interference of the harmonics of the original image. Now, that reminded me of something. Yep – phase congruency (pdf). Phase congruency also uses the absolute value of wavelets (windowed Fourier). It seems to me this has a perfect explanation. An interference pattern is defined by how in-phase both waves are. So it’s like phase congruency taken at each point. The phase congruency edge detector is in fact finding the maximum of a somehow normalized interference pattern. In that sense Mallat’s method of producing invariants from high-dimensional data is analogous to producing a sketch from a photo.
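Here is a minimal numeric sketch of that nonlinearity (my own illustration with a hand-rolled Morlet-like wavelet in NumPy – all names and parameters are my assumptions, not Mallat’s actual construction): take the modulus of complex wavelet coefficients, which throws away phase and creates difference frequencies, then average over time to get a small descriptor that barely changes under shifts.

```python
import numpy as np

def morlet(n, scale):
    """Complex Morlet-like wavelet of length n at the given scale."""
    t = np.arange(-n // 2, n // 2) / scale
    return np.exp(1j * 5.0 * t) * np.exp(-t * t / 2.0) / np.sqrt(scale)

def modulus_descriptor(signal, scales=(2, 4, 8, 16)):
    """|signal * psi_scale| averaged over time: one number per scale."""
    out = []
    for s in scales:
        psi = morlet(len(signal), s)
        coeff = np.convolve(signal, psi, mode="same")  # wavelet coefficients
        out.append(np.abs(coeff).mean())               # modulus + averaging
    return np.array(out)

x = np.cos(2 * np.pi * 0.05 * np.arange(512))
print(modulus_descriptor(x))               # descriptor of the original signal
print(modulus_descriptor(np.roll(x, 37)))  # nearly identical: shift-invariant
```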
Ok, enough rambling for now.

30, September, 2010 | computer vision | 1 Comment

Is Ouidoo a rebranded Zii Egg?

There is some buzz on the net about a new smartphone specifically designed for augmented reality – Ouidoo. Some of the specs of the device sound outlandish – 26 CPUs, 8 GFLOPS, and a new OS.
However, one commenter on cn.engadget.com pointed out that this device could actually be real – its specs resemble those of the new Creative Zii Egg OEM platform, which seems to be available to developers now. The Zii Egg features a new OS – “Plasma OS” – but also supports Android. So what’s the deal with 26 CPUs?
Those are not general-purpose CPUs, of course, but the Zii Egg has some kind of vector processor (like in the Cell processor or modern GPUs; some high-end smartphones also have one) with 24 floating point units. Add to this the CPU itself – an ARM Cortex-A8 – and possibly a dedicated GPU, and you have 24 + 1 + 1 = 26. Also, the specs claimed by Creative for the Zii Egg floating point vector processor – 8 GFLOPS – are exactly those claimed by QderoPateo for the Ouidoo.

25, April, 2010 | Uncategorized | 1 Comment

SMMT is now open sourced

SMMT is now open sourced under a BSD-like license. The Sourceforge page is here.

21, February, 2010 | Coding AR | 4 Comments

Visualizing Bundle Adjustment

One of the problems with bundle adjustment is multiple local minima. If the initial approximation is not good enough, the solution could converge to a wrong minimum. If this problem arises, global optimization should be used. There are several branch-and-bound bundle adjustment methods for this, which use fractional programming.
Though it’s usually possible to choose the correct minimum with some geometric consistency check or additional information, I’m trying to understand this situation better. I’ve tried to visualize the reprojection error distribution for two-frame bundle adjustment.
These pictures represent the dependence of the reprojection error on the position and rotation of the second camera relative to the first camera.
First, it’s interesting to see how the minimal error depends on the translational parameters, with fixed rotational parameters. Here the 3D structure is factored out and the translation is parametrized by the position of the first-frame epipole, so we see the reprojection error depending on the epipole of the first frame only.

This picture is easy to understand. The “turbulent” area in the center is the situation where the epipole is close to the projections of the points. The black tails are the areas where the solutions – the epipoles which minimize the reprojection error – are situated. It can be seen that the epipole has a “preferred” direction – the direction from the coordinate center to the epipole is more important than the distance from the center to the epipole. There are two “tails” because the epipole passes through infinity. This picture also shows that coordinate descent could be effective for factoring out the translation, descending first by direction and second by distance. This wouldn’t work if the epipole is near the center, but that situation corresponds to translation of the second camera mostly along the Z-axis, and in that case bundle adjustment is not robust anyway.
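For reference, here is a rough sketch of how such a picture can be produced. This is not the post’s actual code: I use the Sampson error as a cheap first-order stand-in for the reprojection error with structure factored out, fix the rotation to identity, and scan a grid of first-frame epipole positions; the synthetic scene and all parameters are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (50, 3)) + [0, 0, 4]   # synthetic 3-D points
R = np.eye(3)                                 # rotation fixed to identity
t = np.array([0.3, 0.1, 0.5])                 # true translation, epipole (0.6, 0.2)

def project(P):                               # pinhole camera, unit focal length
    return P[:, :2] / P[:, 2:3]

x1, x2 = project(X), project((X - t) @ R.T)   # projections in both frames
h1 = np.hstack([x1, np.ones((len(x1), 1))])   # homogeneous image points
h2 = np.hstack([x2, np.ones((len(x2), 1))])

def sampson_error(e):
    """Mean Sampson error for the essential matrix built from epipole e."""
    tt = np.array([e[0], e[1], 1.0])          # candidate translation direction
    E = np.cross(np.eye(3), tt) @ R           # E = [t]_x R (cross-product rows)
    Fx, Ftx = h1 @ E.T, h2 @ E
    num = np.sum(h2 * Fx, axis=1) ** 2        # (x2^T E x1)^2
    den = Fx[:, 0]**2 + Fx[:, 1]**2 + Ftx[:, 0]**2 + Ftx[:, 1]**2
    return np.mean(num / den)

grid = np.linspace(-3, 3, 200)
err = np.array([[sampson_error((ex, ey)) for ex in grid] for ey in grid])
plt.imshow(np.log(err + 1e-12), extent=[-3, 3, -3, 3], origin="lower")
plt.title("log Sampson error vs. first-frame epipole position")
plt.show()
```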

Now to the more complicated problem – visualizing the reprojection error parametrized by the rotational parameters. The 3D structure is factored out as in the first example, and, using the insight from the first example, the translation parameters are approximately factored out with coordinate descent. The two rotational parameters correspond to the projection of the normal of the second camera’s image plane onto the first camera’s image plane. The third parameter (rotation around that normal) is fixed by the initial approximation.

Here we see some edge-like artifacts caused by the imperfect factoring out of the translation.
It really should be smoother, something like this – in this picture the epipole distance from the center is fixed at infinity. That makes the picture smoother, but less correct.

Returning to the first rotational-parameter example – green pixels mark local minima of the approximation, and the blue and light blue circles mark the two minima to which bundle adjustment actually converges. They don’t fit the approximation exactly, due to the error of factoring out the epipoles.
What can we see in that picture? The local minima are situated inside connected areas which generally can be represented as ellipses only poorly (one area is more like a spiral). That explains why quadratic methods (Gauss–Newton, Levenberg–Marquardt) don’t always work efficiently for bundle adjustment.
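To see that initialization dependence concretely, here is a minimal Levenberg–Marquardt sketch on per-point Sampson residuals (reusing h1, h2 and R from the previous snippet, with a numeric Jacobian for brevity – again my illustration, not the post’s code). Started inside different basins, it may settle at different epipoles:

```python
def residuals(e):
    """Signed per-point Sampson residuals for an epipole guess e."""
    tt = np.array([e[0], e[1], 1.0])
    E = np.cross(np.eye(3), tt) @ R
    Fx, Ftx = h1 @ E.T, h2 @ E
    num = np.sum(h2 * Fx, axis=1)
    den = np.sqrt(Fx[:, 0]**2 + Fx[:, 1]**2 + Ftx[:, 0]**2 + Ftx[:, 1]**2)
    return num / den

def levenberg_marquardt(x0, iters=30, lam=1e-3, h=1e-6):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residuals(x)
        J = np.column_stack([(residuals(x + h * d) - r) / h
                             for d in np.eye(len(x))])   # numeric Jacobian
        step = np.linalg.solve(J.T @ J + lam * np.eye(len(x)), -J.T @ r)
        if np.sum(residuals(x + step)**2) < np.sum(r**2):
            x, lam = x + step, lam * 0.5   # accept: trust the quadratic model
        else:
            lam *= 10.0                    # reject: take more cautious steps
    return x

print(levenberg_marquardt([0.5, 0.3]))     # starts near the true epipole
print(levenberg_marquardt([-2.5, -2.0]))   # a distant start may end elsewhere
```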
The interesting thing is that all the areas approximately connect in some kind of X-shaped center, where the reprojection error is locally maximal. I have seen this behavior in other examples too. Right now I don’t completely understand why it happens or what the nature of this “center” is. If this is a universal property and the “center” can be located efficiently, that effect could be useful.
With multiple-frame bundle adjustment the situation could be different, but it’s a lot more difficult both to visualize and to calculate.
Here are the original camera frames on which the bundle adjustment was executed.

20, February, 2010 | Coding AR | 2 Comments

TI demoed tablet with stereocamera

In relation to this post, TI demoed an OMAP3 tablet with a dual camera capable of recording 3D images.
TI promises the dual-core OMAP4 will be even better at this.

18, February, 2010 | Augmented Reality, Mobile | Comments Off

Augmented reality: from Tangible Space to Intelligent Space

There is such a thing as Milgram’s Reality–Virtuality Continuum:
Milgram’s continuum
Milgram’s continuum shows the progress of the interface from the raw environment to a completely synthetic environment.
It looks like it’s possible to add another dimension to this picture. There exists a concept of “Tangible Space” in AR. “Tangible Space” basically means that the user can interact with real-world objects, and those actions affect the virtual environment – for example, an AR game which uses real-world objects as part of the gameplay, tracking the positions and any changes of state of those objects. Essentially, “Tangible Space” is a virtual wrapping around real-world interaction.
However, that line of thought could be stretched beyond augmented reality. In the “Tangible Space”, real-world interaction affects the virtual environment. What if virtual interaction affected the real-world environment? In that case we would have “Intelligent Space”, or iSpace.
DIND
iSpace is based on DINDs – Distributed Intelligent Networked Devices. It is an augmented (or virtual) reality environment “augmented” with mobile robots and/or other actuators. The intelligent network now not only tracks the physical environment, but also actively interacts with it using physical agents. If Augmented Reality is an extension of the eye, Intelligent Space is an extension of both the eye and the hands. Not only is the real environment a part of the interface now (as in “Tangible Space”), it now actively helps the human to perform some task, and also should guess how to do it. Human and robots are now an integrated system, something like a distributed exoskeleton.
Now we have a new dimension for Milgram’s Continuum:
Passive View of Real Environment->Augmented Reality->Tangible Space->Intelligent Space
If you remember Vernor Vinge’s “Rainbows End”, the environment in it is not just an Augmented Reality – it’s an Intelligent Space.

8, February, 2010 | Augmented Reality | Comments Off

Symbian Multimarker Tracking Library updated

The Symbian Multimarker Tracking Library has been updated to v0.5. Some bugs are fixed, and markers can now be moved at run-time. The download is here.

24, January, 2010 | Coding AR | Comments Off

Dual boot Android on N900?

via Engadget


Well, that’s spectacular, to say the least. I’m wondering if the camera/OpenGL drivers would work.

24, January, 2010 | Mobile | Comments Off

Technology behind project Natal

The latest info is here.

23, January, 2010 | Uncategorized | 9 Comments