## Visualizing Bundle Adjustment

One of the problem with bundle adjustment is multiple local minimums. If initial approximation is not good enough solution could converge to wrong minima. If this problem arise global optimization should be used. There are several branch and bound bundle adjustment methods for it, which use fractional programming.

Though it’s usually possible to choose correct minima with some geometric consistency check or additional information, I’m trying to understand this situation better. I’ve tried to visualize reprojection error distribution for 2-frame bundle adjustment.

Those pictures represent dependence of reprojection error on the values of second camera position and rotation relatively to first camera.

First, it’s interesting to see how minimal error depend on translational parameter, with fixed rotational parameters. Here 3d structure factored out and translation parametrized with first frame epipole position. Here we see reprojection error depending on the epipole of the first frame only.

This picture easy to understand. “Turbulent” area in the center – the situation where epipole is close to projections of the points. Black tails – area where solution – epipole which minimize reprojection error are situated. It could be seen that epipole have “preferred” direction – the direction from the coordinate center to epipole is more important than distance from center to epipole. There are to “tails” because epipole pass through infinity. This picture also shows that coordinate descent could be effective for factoring out translation, with descent first by direction and second by distance. This wouldn’t work if epipole is near center. The situation where epipole is near center correspond translation of the second camera by mostly Z-axis , but in that case bundle adjustment is not robust anyway.

Now to more complicated problem – visualizing reprojection error parametrized with rotational parametres. 3D structure is factored out as in the first example , and using insight we got in the first example translation parameter approximately factored out with coordinate descent. Two rotational parameters are correspond to projection of the normal to the camera plane of the second camera on the first camera plane. Third parameter (rotation around that normal) is fixed with initial approximation.

Here we see some edgelike artifacts caused by imperfect factoring out translation.

It really should be more smooth, something like that – in this picture epipole distance from the center fixed in infinity. That make picture more smooth but less correct.

Returning to the first rotational parameter example – green pixels mark local minima of the approximation, blue and light blue circles mark two minima to which bundle adjustment actually converge. They dont’ fit approximation exactly, due to error of factoring out epiploes.

What can we see that picture? The local minimums are situated inside connected areas, which generally can be represented as ellipses only poorly (one area is more like spiral). That explain why quadratic methods(Gauss–Newton, Levenberg–Marquardt) are not always work efficiently for bundle adjustment.

The interesting thing is that all area approximately connect in some kind of X-shaped center, where reprojection error locally maximal. I have seen this behavior on other examples too. Right now I don’t understand completely why it happens and what is the nature of this “center”. If this is universal property and this “center” can be efficiently located that effect could be useful.

With multiple frame bundle adjustment situation could be different, but it’s a lot more difficult both to visualize and calculate.

Here are original camera frames, on which bundle adjustment is executed.

## Features matching and geometric consistency.

Here I want to talk about matching in image registration. We are doing registration in 3D or 2D, and using feature points for that. Next stage after extraction of feature points from the image is finding corresponding points in two(or more) images. Usually it’s done with descriptors, like SIFT, SURF, DAISY etc. Sometimes randomized trees are used for it. Whatever methods is used it usually has around .5% of false positives. False positives create outliers in registration algorithm. That is not a big problem in planar trackers or model/marker trackers. It could be a problem for Structure From Motion though. If CPU power is not limited the problem is not very serious. Heavy-duty algorithms like full-sequence bundle adjustment and RANSAC cope with outliers pretty well. However even for high-end mobile phones such algorithms are problematic. Some tricks can help – Georg Klein put full-sequence bundle adjustment into separate thread on PTAM tracker to run asynchronously, but I’m trying to do local, 2-4 frames bundle adjustment here. The problem of false positives is especially difficult for images of patterned environment, where some image parts are similar or repeated.

Here mismatched correspondence marked with blue line (points 15-28).

As you can see it’s not easy for any descriptor to tell the difference between points 13(correct) and 15(wrong) on the left image – their neighborhood is practically the same:

Such situations could easily happen not only indoor, but also in cityscape, industrial, and others regular environments.

One solution for such cases is to increase descriptor radius, to process a bigger patch around the point, but that would create problems of its own, for example too much false negatives.

Other approach is to use geometric consistency of the image points positions.

There are at least two ways to do it.

One is to consider displacements of corresponding points between frames. Here is example from paper by Kanazawa et al “Robast Image Matching Preserving Global Geometric Consistency”

This method first gathering local displacement statistic around each points, filter out outliers and and apply smoothing filter. Here are original matches, matches after applying consistency check and matches after applying smoothing filter.

However this method works best for dense, regular sets of feature points. For small, sparse set of points it does not improving situation much.

Here is a second approach. Build graph out of feature points for each frame.

Local topological structure of the two graphs is different because of false positives. It’s easy to find graph vertices/edges which cause inconsistency – edges marked blue.They can be found for example by signs of crossproducts between edges. After offending vertices found they are removed:

There are different ways to build graph out of feature points. Simplest is nearest neighbors, but may be Delaney triangulation or DSP can do better.

## Compressive Sensing and Computer Vision

Thanks to Igor Carron for pointing out this video lecture

Compressive Sensing for Computer Vision: Hype vs Hope

It start with comprehensible explanation of what compressive sensing is about (BTW wiki article on compressive sensing is wholly inadequate).

Basically it’s about imagining the lower-dimensional signal(image) as projection by rectangular matrix from mostly zero high-dimensional vector. It happens that this sparse high-dimensional vector can be restored if the matrix is almoste orthonormal (Restricted Isometry Property). Discrete Fourier Transform and random matrices have that property.

This sparse vector could be considered as classification space for original signal. So application of Compressive Sensing to Computer Vision is mostly about classification or recognition. As methods used by CS are convex and linear programming those are not run-time methods, and would not help much in real-time tracking. There is CS-inspired advise at the end of the lecture, about trying to replace norm optimization with norm. That could be actually helpful in some cases. If approximated as iteratively reweighted it’s essentially the same as robustification of least square method.

## Testing a new descriptor.

Trying a new descriptor, inspired by SURF and SIFT. Want to use gradient instead of Haar transforms of intensity, but with less dimensionality than SURF. Also don’t need rotation/scale invariance, because using incremental tracking.

## Switching to OpenCV 2.0 with VS2005

I’m using OpenCV for some tests, and for some reasons (freelance gigs and Symbian SDK) using MS Visual Studio. As new and shiny OpenCV 2.0 is out I decided to switch to it. As it happen, one absolutely have to read buried in the download section readme, before doing anything.

The thing is, OpenCV 2.0 doesn’t include lib files for VS. They have to be built by user.

So here is step by step retelling of readme:

1. Rename your old OpenCV installation to save it, just in case

2. Download and install OpenCV 2.0a

3. Download and install CMake

4. Reboot (or CMake wouldn’t work)

5. Go to C:\Program Files\CMake 2.6\bin and run cmake-gui.exe

6. In the “Where is the source code” field choose your new OpenCV directory (C:\OpenCV)

In “Where to build the binaries” choose directory for VS compiled OpneCV (C:\OpenCV\VS2005)

7. press Configure button and choose VS2005 (or whatever) as building enviroment

8. Press Generate and VS project will be generated in the C:\OpenCV\VS2005

9. Launch solution and build it. For debug build some projects require debug python libraries. As riseriyo pointed in comments if you have Python installed other than 2.6 that can cause problem.

10. Copy *.lib from C:\OpenCV\vs2005\lib\release (or debug) to C:\OpenCV\lib

Copy *.dll from C:\OpenCV\vs2005\bin\release to C:\OpenCV\bin

11. Now, reconfigure your application project. Include directories now “C:\OpenCV\include\opencv” instead of “C:\OpenCV\include

12. All libraries renamed from *.lib to *200.lib (cv.lib to cv200.lib) or *200d.lib for debug. Rename them, or change project settings.

PS if you need Python and still have a problem with cvpy:

From readme:

Known issues:

=============

1. Python 2.6 bindings for OpenCV are included within the package,

but not installed.

You can copy the subdirectory opencv/Python2.6/Lib/site-packages into

the respective directory of the Python installation.

Here is riseriyo explanation how he deal with python problem

PPS Comment by rise about vs2008 issue:

dll and manifest file version conflict in msvc 2008. the only way i was able to fix this was to completely uninstall msvc 2008 and then do a clean install w/o updating it with the sp1 packages.

see his blog and how he was troubleshooting (for days) the issue

PPPS As Niklas pointed out if you have omp.h not found error, that mean you forgot to turn off OpenMP in CMake.

That’s it. Project should compile now. If not you still have your old OpenCV installation

## Still checking Gauss-Newton

Though Levenberg-Marquardt works I’m still trying to save Gauss-Newton, especially as I’ve read paper saying that Gauss-Newton with dogleg trust-region works well for bundle adjustment. I’ll probably try direct substitution with Cholesky rank-1 update and constrained optimization.

## Solution – free gauge

Looks like the problem was not the large Gauss-Newton residue. The problem was gauge fixing.

Most of bundle adjustment algorithms are not gauge invariant inherently (for details check Triggs “Bundle adjustment – a modern synthesis”, chapter 9 “Gauge Freedom”). Practically that means that method have one or more free parameters which could be chosen arbitrary (for example scale), but which influence solution in non-invariant way (or don’t influence solution if algorithm is gauge invariant). Gauge fixing is the choice of the values for that free parameters. There exist at least one gauge invariant bundle adjustment method (generalization of Levenberg-Marquardt with complete matrix correction instead of diagonal only correction) , but it is order of magnitude more computational expensive.

I’ve used fixing coordinate of one of the 3d points for gauge fixing. Because method is not gauge invariant solution depend on the choice of that fixed point. The problem occurs when the chosen point is “bad” – error in feature point detector for this point is so big that it contradict to the rest of the picture. Mismatching in the point correspondence can cause the same problem.

In my case, fixing coordinate of chosen point caused “accumulation” of residual error in that point. This is easy to explain – other points can decrease reprojection error both by moving/rotating camera and by shifting their coordinates, but fixed point can do it only by moving/rotating camera. It looks like if the point was “bad” from the start it can become even worse next iteration as the error accumulate – positive feedback look causing method become unstable. That’s of cause only my observations, I didn’t do any formal analysis.

The obvious solution is to redistribute residual error among all the points – that mean drop gauge fixing and use free gauge. Free gauge is causing arbitrary scaling of the result, but the result can be rescaled later. However there is the cost. Free gauge means matrix is singular – not invertible and Gauss-Newton method can not work. So I have to switch to less efficient and more computationally expensive Levenberg-Marquardt. For now it seems working.

PS Free gauge matrix is not singular, just not well-defined and has degenerate minimum. So constrained optimization still may works.

PPS Gauge Invariance is also important concept in physics and geometry.

PPPS While messing with Quasi-Newton – it seems there is an error in chapter 10.2 of “Numerical Optimization” by Nocedal&Wright. In the secant equation instead of should be

## Bundle Adjustemnt on the Mars with Rover

Just found out – Mars Rovers used bundle adjustment for its localization and rocks modeling:

“Purpose of algorithm:

To perform autonomous long-range rover localization based on bundle adjustment (BA) technology.

Processing steps of the algorithm include interest point extraction and matching, intra- and inter- stereo tie point selection, automatic cross-site tie point selection by rock extraction, modeling and matching, and bundle adjustment”

## Marker vs markerless (bundle adjustment)

#augmentedreality

Here is a sample of image registration with fiduciary marker (actually the marker I used in my games) vs registration with bundle adjustment. Blue lines are points heights (relatively to marker plane) calculated using marker registration and triangulation. White lines are the same using bundle adjustment(modified). Points extracted with multiscale FAST and fitted with log-polar Fourier descriptors for correspondence (actually SURF descriptor produce the same correspondence).

As you can see markerless is in no way worse then markers, at least on this example ))).

## Video Surveillance is Useless

Found this interesting slide presentation form Peter Kovesi, inventor of phase congruency edge detector. It basically saying, that on current tech level video surveillance is useless for face identification. What follow is that it’s actually harmful, due to wrong impression of it’s reliability.

Also on his page – some fun animation or How to Animate Impossible Objects

PS Fourier phase approach to feature detection looks really promising, especially if to find some low computation cost modification.