One of the problem with bundle adjustment is multiple local minimums. If initial approximation is not good enough solution could converge to wrong minima. If this problem arise global optimization should be used. There are several branch and bound bundle adjustment methods for it, which use fractional programming.
Though it’s usually possible to choose correct minima with some geometric consistency check or additional information, I’m trying to understand this situation better. I’ve tried to visualize reprojection error distribution for 2-frame bundle adjustment.
Those pictures represent dependence of reprojection error on the values of second camera position and rotation relatively to first camera.
First, it’s interesting to see how minimal error depend on translational parameter, with fixed rotational parameters. Here 3d structure factored out and translation parametrized with first frame epipole position. Here we see reprojection error depending on the epipole of the first frame only.
This picture easy to understand. “Turbulent” area in the center – the situation where epipole is close to projections of the points. Black tails – area where solution – epipole which minimize reprojection error are situated. It could be seen that epipole have “preferred” direction – the direction from the coordinate center to epipole is more important than distance from center to epipole. There are to “tails” because epipole pass through infinity. This picture also shows that coordinate descent could be effective for factoring out translation, with descent first by direction and second by distance. This wouldn’t work if epipole is near center. The situation where epipole is near center correspond translation of the second camera by mostly Z-axis , but in that case bundle adjustment is not robust anyway.
Now to more complicated problem – visualizing reprojection error parametrized with rotational parametres. 3D structure is factored out as in the first example , and using insight we got in the first example translation parameter approximately factored out with coordinate descent. Two rotational parameters are correspond to projection of the normal to the camera plane of the second camera on the first camera plane. Third parameter (rotation around that normal) is fixed with initial approximation.
Here we see some edgelike artifacts caused by imperfect factoring out translation.
It really should be more smooth, something like that – in this picture epipole distance from the center fixed in infinity. That make picture more smooth but less correct.
Returning to the first rotational parameter example – green pixels mark local minima of the approximation, blue and light blue circles mark two minima to which bundle adjustment actually converge. They dont’ fit approximation exactly, due to error of factoring out epiploes.
What can we see that picture? The local minimums are situated inside connected areas, which generally can be represented as ellipses only poorly (one area is more like spiral). That explain why quadratic methods(Gauss–Newton, Levenberg–Marquardt) are not always work efficiently for bundle adjustment.
The interesting thing is that all area approximately connect in some kind of X-shaped center, where reprojection error locally maximal. I have seen this behavior on other examples too. Right now I don’t understand completely why it happens and what is the nature of this “center”. If this is universal property and this “center” can be efficiently located that effect could be useful.
With multiple frame bundle adjustment situation could be different, but it’s a lot more difficult both to visualize and calculate.
Here are original camera frames, on which bundle adjustment is executed.
Threr is such thing as Milgram’s Reality-Virtuality Continuum
Milgram’s continuum shows progress of interface from raw environment to completely synthetic environment.
It looks like it’s possible to add another dimension to this picture. There exists a concept of “Tangible Space” in AR. “Tangible Space” basically mean that user can interact with real-world objects and those actions affect virtual environment. For example AR game which use real world objects as part of gameplay, track positions and any changes of state of those objects. Essentially “Tangible Space” is virtual wrapping around real-world interaction.
However that line of thought could be stretched beyond augmented reality. In the “Tangible Space” real-world interaction affect virtual environment. What if virtual interaction affect real-world environment? In that case we would have “Intelligent Space”, or iSpace.
Based on DIND – Distributed Intelligent Networked Device. iSpace is an augmented(or virtual) reality environment “augmented” with mobile robots and/or other actuators. Intelligent network now not only track physical environment, but also actively interact with it using physical agents. If Augmented Reality is an extension of eye, Intelligent Space is an extension of both eye and hands. Not only real environment is a part of interface now(as in “Tangible Space”) , it now actively help human to perform some task, and also should guess how to do it. Human and robots are now integrated system, something like distributed exoskeleton.
Now we have a new dimension for Milgram’s Continuum:
Passive View of Real Environment->Augmented Reality->Tangible Space->Intelligent Space
If you remember Vernor Vinge’s “Rainbow Ends”, the enviroment in it not just an Augmented Reality – it’s an Intelligent Space