Mirror Image

Mostly AR and Stuff

Still checking Gauss-Newton

Though Levenberg-Marquardt works, I’m still trying to save Gauss-Newton, especially since I’ve read a paper saying that Gauss-Newton with a dogleg trust region works well for bundle adjustment. I’ll probably try direct substitution with a Cholesky rank-1 update and constrained optimization.

13, October, 2009 Posted by mirror2image | Coding AR, computer vision | No Comments Yet

Solution – free gauge

It looks like the problem was not the large Gauss-Newton residue. The problem was gauge fixing.
Most bundle adjustment algorithms are not inherently gauge invariant (for details see Triggs, “Bundle adjustment – a modern synthesis”, chapter 9, “Gauge Freedom”). In practice this means the method has one or more free parameters that can be chosen arbitrarily (scale, for example) but that influence the solution in a non-invariant way (or don’t influence the solution at all if the algorithm is gauge invariant). Gauge fixing is the choice of values for those free parameters. There exists at least one gauge-invariant bundle adjustment method (a generalization of Levenberg-Marquardt with a complete matrix correction instead of a diagonal-only correction), but it is an order of magnitude more computationally expensive.
I fixed the coordinates of one of the 3D points for gauge fixing. Because the method is not gauge invariant, the solution depends on the choice of that fixed point. The problem occurs when the chosen point is “bad”: the feature detector error for this point is so big that it contradicts the rest of the picture. A mismatch in the point correspondences can cause the same problem.
In my case, fixing the coordinates of the chosen point caused residual error to “accumulate” in that point. This is easy to explain: other points can decrease reprojection error both by moving/rotating the camera and by shifting their coordinates, but the fixed point can do it only by moving/rotating the camera. It looks like if the point was “bad” from the start, it can become even worse on the next iteration as the error accumulates, a positive feedback loop that makes the method unstable. These are of course only my observations; I didn’t do any formal analysis.
The obvious solution is to redistribute the residual error among all the points, which means dropping gauge fixing and using a free gauge. A free gauge causes arbitrary scaling of the result, but the result can be rescaled later. However, there is a cost. Free gauge means the matrix is singular (not invertible), so the Gauss-Newton method cannot work. So I have to switch to the less efficient and more computationally expensive Levenberg-Marquardt. For now it seems to be working.
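To illustrate the singular-matrix point, here is a toy problem (not the bundle adjustment code from this post): fit y = a*b*t, where only the product a*b is observable, so rescaling (a, b) to (s*a, b/s) is a gauge freedom. In this sketch J^{T}J is exactly rank deficient, so the plain Gauss-Newton normal equations cannot be solved, while the Levenberg-Marquardt damping term makes them solvable. All function names and the damping schedule are assumptions chosen for illustration.

```python
def normal_equations(a, b, ts, ys):
    """Return J^T J (as a11, a12, a22) and J^T r for residuals r_i = a*b*t_i - y_i."""
    a11 = a12 = a22 = g1 = g2 = 0.0
    for t, y in zip(ts, ys):
        r = a * b * t - y
        j1, j2 = b * t, a * t  # dr/da, dr/db
        a11 += j1 * j1
        a12 += j1 * j2
        a22 += j2 * j2
        g1 += j1 * r
        g2 += j2 * r
    return (a11, a12, a22), (g1, g2)


def cost(a, b, ts, ys):
    return 0.5 * sum((a * b * t - y) ** 2 for t, y in zip(ts, ys))


def gauss_newton_det(a, b, ts, ys):
    """Determinant of J^T J; it vanishes here because of the scale gauge freedom."""
    (a11, a12, a22), _ = normal_equations(a, b, ts, ys)
    return a11 * a22 - a12 * a12


def levenberg_marquardt(a, b, ts, ys, lam=1e-3, iters=50):
    """Solve (J^T J + lam*I) delta = -J^T r repeatedly, adapting lam (toy schedule)."""
    for _ in range(iters):
        (a11, a12, a22), (g1, g2) = normal_equations(a, b, ts, ys)
        d11, d22 = a11 + lam, a22 + lam      # damping makes the 2x2 system invertible
        det = d11 * d22 - a12 * a12
        da = (-g1 * d22 + g2 * a12) / det
        db = (-g2 * d11 + g1 * a12) / det
        if cost(a + da, b + db, ts, ys) < cost(a, b, ts, ys):
            a, b = a + da, b + db            # accept the step, trust the model more
            lam = max(lam * 0.1, 1e-6)
        else:
            lam *= 10.0                      # reject the step, damp harder
    return a, b


ts = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated with a*b = 2
print("det(J^T J) at start:", gauss_newton_det(1.0, 1.0, ts, ys))
a, b = levenberg_marquardt(1.0, 1.0, ts, ys)
print("a*b after LM:", a * b)
```

The product a*b converges, but the individual values of a and b depend on the starting point, which is the arbitrary scaling that has to be fixed afterwards.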
PS The free gauge matrix is not singular, just not well-defined, and has a degenerate minimum. So constrained optimization may still work.
PPS Gauge invariance is also an important concept in physics and geometry.
PPPS While messing with quasi-Newton, it seems I found an error in chapter 10.2 of “Numerical Optimization” by Nocedal & Wright. In the secant equation, instead of S_{k+1}(x_{k+1} - x_{k}) = J^{T}_{k+1}r_{k+1} - J^{T}_{k}r_{k} it should be S_{k+1}(x_{k+1} - x_{k}) = J^{T}_{k+1}r_{k+1} - J^{T}_{k}r_{k+1}
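A short sketch of why r_{k+1} should appear in both terms, assuming S_{k+1} is meant to approximate the second-order part of the least-squares Hessian and writing s_k = x_{k+1} - x_k (a symbol introduced here for brevity):

```latex
\[
\begin{aligned}
\nabla^2 f(x) &= J(x)^{T} J(x) + \sum_{j} r_j(x)\,\nabla^2 r_j(x), \\
S_{k+1} s_k &\approx \sum_{j} r_j(x_{k+1})\,\nabla^2 r_j(x_{k+1})\, s_k \\
            &\approx \sum_{j} r_j(x_{k+1})\,\bigl[\nabla r_j(x_{k+1}) - \nabla r_j(x_k)\bigr] \\
            &= J_{k+1}^{T} r_{k+1} - J_{k}^{T} r_{k+1}.
\end{aligned}
\]
```

The residuals are evaluated at x_{k+1} throughout; only the gradients of the residuals are differenced between the two iterates.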

11, October, 2009 Posted by mirror2image | Coding AR, computer vision | No Comments Yet

Problems

During the tests I found out that bundle adjustment fails on some “bad frames”. There are two ways to deal with it: reject the bad frames, or try to understand what is happening (who set up us a bomb? :-) ). Any problem is also an opportunity to understand the subject better. For now I suspect Gauss-Newton is failing because the residue is too big. Just adding the Hessian to J^{T}J does not help: I’m getting a negative eigenvalue. So now I’m trying quasi-Newton from the excellent book by Nocedal & Wright. If that does not help, I’ll try the hybrid Fletcher method.

PS It looks like the problem was not the large residue.

6, October, 2009 Posted by mirror2image | Coding AR, Uncategorized | No Comments Yet

What’s going on

The code of the markerless tracker is finished for the emulator. It’s in a minimal configuration for now, without some optimizations and bells and whistles like combined point-edge pose estimation. Now it’s bug squashing and testing with different video feeds for some time. The modified bundle adjustment is the nicest part; it seems pretty stable and robust.

15, September, 2009 Posted by mirror2image | Coding AR | 2 Comments

Augmented reality, enforced locality, geometric hashing

I had a discussion with Lester Madden in the LinkedIn MAR group. What we discussed was the concept of locality in AR. That is, each AR object should be attached to a specific location and accessible only from that location.
I’ll try to explain it in more depth here.
Augmented graffiti, augmented reality mail/drop boxes and billboards, user-built reality overlays: all of those should be attached to a specific location. This locality could be enforced, so that only local data would be available (filtered in) at a specific location. Such locality of data prevents the user from sinking in augmented noise generated all over the world, and reduces the possibility of spam.
For example, you can have a neighborhood billboard, leave a note for friends in the park, and so on. All that AR object data could be accessed only locally for both read and write: to read a billboard or to post a message on it, you would have to go to it.
The user should get the data/content only if he is physically present at the specific location. In the same way, the poster/producer of the data or AR object should physically visit each location where it is placed.
If locality is enforced, to place a note for your friend in the park you have to visit the park, and there is no way around it.
Locality could be enforced with location-based encryption. I think this encryption could be built using geometric hashing. The user scans the environment and does 3D registration with his mobile or wearable device. The encryption key is generated by the mobile device from the scanned 3D model of the environment.
If the user wants to get data attached to the location, he accesses the server, retrieves the local data and decrypts it with that key.
In the opposite direction, if the user wants to attach some object or data to a location, the mobile device encrypts the data with one part of the hash key and sends the other part of the key to the server. Before storing the data, the server does a uniqueness check. Nearby data already stored on the server is checked, and the new data is allowed in only if the new key is some distance away from the keys of all the other stored data. After that, the new data is encrypted by the server with the second part of the key and stored.
Each object is encrypted with two keys, one of which is server-side. The server has no access to the content of the data, but has access to part of the location hash key. That way no two objects or pieces of data are attached to exactly the same location. Cluttering of AR objects could be reduced. More importantly, if the poster has to physically visit the location where he wants to place an AR object, he should have at least some relation to that location, and he is not some spammer from the other end of the world.
If a spammer forges a location key without actually visiting the place, it will most probably correspond to a non-existent location, and no one will be hit by his data.
All of that is of course a rough outline of how enforced locality could work. Building a robust algorithm for extracting the geometric hash could be non-trivial.
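A very rough sketch of the key-splitting and uniqueness-check part of this outline. Everything here is an assumption made for illustration: the grid quantization stands in for a real geometric hash, the exact-match check stands in for a real distance between keys, the names `location_key` and `LocationServer` are hypothetical, and the encryption steps themselves are left out.

```python
import hashlib
import math


def location_key(points, cell=0.5):
    """Derive a two-part key from scanned 3D points (toy stand-in for a geometric hash).

    Points are snapped to a grid of size `cell` so that small scan noise maps to
    the same key. Points close to a cell boundary would break this toy scheme.
    """
    cells = sorted({tuple(math.floor(c / cell) for c in p) for p in points})
    digest = hashlib.sha256(repr(cells).encode("utf-8")).digest()
    client_part, server_part = digest[:16], digest[16:]
    return client_part, server_part


class LocationServer:
    """Stores one opaque blob per server-side key part."""

    def __init__(self):
        self._store = {}

    def try_store(self, server_part, encrypted_blob):
        # Uniqueness check: refuse a second object at the same location key.
        if server_part in self._store:
            return False
        self._store[server_part] = encrypted_blob
        return True

    def fetch(self, server_part):
        return self._store.get(server_part)


scan = [(1.02, 2.01, 0.30), (4.40, 0.10, 1.20)]
client_part, server_part = location_key(scan)
server = LocationServer()
print(server.try_store(server_part, b"blob already encrypted on the device"))
```

The server only ever sees `server_part` and an opaque blob, which matches the idea that it has part of the location key but no access to the content.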

1, May, 2009 Posted by mirror2image | Augmented Reality | 8 Comments

Why 3d markerless tracking is difficult for mobile augmented reality

I often hear from users that they don’t like markers, and they wonder why there are relatively few markerless AR applications around. First I want to say that there is no excuse for using markers in a static scene with an immobile camera, or if a desktop computer is used. Brute-force tracking methods like bundle adjustment and fundamental matrix estimation are well developed and have been used for years in computer vision and photogrammetry. However, those methods in their original form can hardly produce an acceptable frame rate on mobile devices. On the other hand, marker trackers on mobile devices can be made fast, stable and robust.
So why are markers easy and markerless tracking not?
The problem is the structure, or “shape”, of the point cloud generated by the feature detector of the markerless tracker. The problem with structure is that the depth coordinate of the points is not easily calculated. It is even more difficult because camera frames taken from a mobile device have a narrow baseline: frames are taken from positions close to one another, so “stereo” depth perception is quite rough. This is called the structure from motion problem.
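To put rough numbers on the narrow-baseline remark, here is a small sketch that assumes the simplest rectified two-view pinhole model, where depth is Z = f*B/d (focal length f in pixels, baseline B, disparity d in pixels). The model and the numbers are illustrative assumptions, not measurements from a real tracker.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


focal_px = 500.0        # assumed focal length in pixels
depth_m = 2.0           # a point two meters away
matching_error = 0.5    # assumed half-pixel feature matching error

for baseline_m in (0.02, 0.5):  # hand jitter vs. a deliberate sideways step
    err = depth_error(focal_px, baseline_m, depth_m, matching_error)
    print(f"baseline {baseline_m} m -> depth uncertainty about {err:.3f} m")
```

Under these assumptions the depth uncertainty grows in inverse proportion to the baseline, so frames taken a couple of centimeters apart give a much rougher depth estimate than frames taken half a meter apart.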
In the case of a marker tracker, all feature points of the markers are on the same plane, and that allows calculating the position of the camera (up to a constant scale factor) from a single frame. Essentially, if all the points produced by the detector are on the same plane, as with pictures lying on a table, the structure from motion problem goes away. A planar cloud of points is essentially the same as a set of markers: for example, any four points could be considered a marker and the same algorithm could apply. The structure from motion problem is why there is no easy step from a “planar only” tracker to a real 3D markerless tracker.
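The “any four points could be a marker” remark can be sketched as estimating the plane-to-image homography from four point correspondences. This is a minimal direct linear solve with the last homography entry fixed to 1, written for illustration; it is not the tracker’s code, the function names are made up here, and recovering the actual camera pose would additionally need the camera intrinsics and a decomposition step that is not shown.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(src, dst):
    """Homography H (3x3, H[2][2] = 1) mapping four plane points to image points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Corners of a unit square on the plane and where they were seen in the image.
plane = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image = [(10.0, 20.0), (110.0, 30.0), (100.0, 140.0), (5.0, 120.0)]
H = homography_from_4_points(plane, image)
print(apply_homography(H, (0.5, 0.5)))  # image position of the square's center
```

With more than four coplanar points the same equations become an overdetermined least-squares problem, which is what a planar tracker would normally solve.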
However, not everything is so bad for a mobile markerless tracker. If the tracking environment is indoors or a cityscape, there are a lot of rectangles, parallel lines and other planar structures around. Those could be used as an initial approximation for one of the structure from motion algorithms, and/or as substitutes for markers.
Another approach, of course, is to find some variation of the structure from motion method that is fast and works on mobile. Some variation of the bundle adjustment algorithm looks most promising to me.
PS The PTAM tracker, which has been ported to the iPhone, uses yet another approach: instead of running bundle adjustment for each frame, bundle adjustment runs asynchronously in a separate thread, and a simpler method is used for frame-to-frame tracking.

30, March, 2009 Posted by mirror2image | Coding AR | 4 Comments

Tracking planes in the city

In relation to tracking a cityscape, I did a planar segmentation test. I segmented FAST-generated corners with a simple 5-point projective invariant.
In some cases the 5-point invariant gives a rough approximation:
[Image: planar segments]
In some cases the outliers are quite bad: some points have very close projective invariants but still lie in different planes.
[Image: bad segment]
So the simple method doesn’t quite work…
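For reference, a sketch of one standard pair of projective invariants for five coplanar points, built from ratios of 3x3 determinants of homogeneous point coordinates. The post does not say exactly which invariant or threshold was used, so this form is an assumption; it requires that no three of the points involved in a determinant are collinear.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def five_point_invariants(points):
    """Two projective invariants of five coplanar points given as (x, y)."""
    p1, p2, p3, p4, p5 = [(x, y, 1.0) for x, y in points]
    m431 = det3(p4, p3, p1)
    m521 = det3(p5, p2, p1)
    m421 = det3(p4, p2, p1)
    m531 = det3(p5, p3, p1)
    m532 = det3(p5, p3, p2)
    m432 = det3(p4, p3, p2)
    # Each point index appears equally often in numerator and denominator,
    # so per-point scale factors and det(H) cancel under a homography.
    i1 = (m431 * m521) / (m421 * m531)
    i2 = (m421 * m532) / (m432 * m521)
    return i1, i2


def project(h, point):
    """Apply a 3x3 homography to a 2D point."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


pts = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 4.0), (2.0, 1.0)]
H = [[1.2, 0.1, 3.0], [-0.2, 0.9, 1.0], [0.001, 0.002, 1.0]]  # arbitrary example view change
print(five_point_invariants(pts))
print(five_point_invariants([project(H, p) for p in pts]))  # same values up to rounding
```

Five points on one plane keep the same invariant values in every view, but agreement of the values does not prove coplanarity, which is consistent with the outliers described above.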

19, March, 2009 Posted by mirror2image | Coding AR, computer vision | 4 Comments

Tracking cityscape

One of the big problems in image registration/structure from motion/3D tracking is using the global information of the image. Feature/blob extractors like SIFT, SURF or FAST use only local information around the point. Region detectors like MSER use area information, but MSER is not good at tracking textures and not quite stable in complex scenes. Edge detection provides some non-local information, but requires processing the edges. That could be computationally heavy, but looks promising anyway. There are a lot of methods that use global information (all kinds of texture segmentation, epitomes, snakes/appearance models), but those are computationally heavy and not suitable for mobiles. The question is how to incorporate global information from the image into the tracker with a minimal number of operations. One way is to optimise the tracker for a specific environment, for example by using the properties of a cityscape: a lot of planar structures and straight lines. Such a multiplanar tracker wouldn’t work in a forest or park, but could be a working compromise.

12, March, 2009 Posted by mirror2image | Coding AR | No Comments Yet

Region Tracking

Experimenting with MSER region tracking.
The problem is that the regions seem not stable enough and way too big.
MSER of downsampled images, original images, MSER of mean shift filtered images, MSER of smoothed images:
[Image: mser]
Mean shift filtering seems to capture more features, but it’s too computationally expensive.

3, March, 2009 Posted by mirror2image | Coding AR | No Comments Yet

Markerless tracking with FAST

Testing outdoor markerless tracking with the FAST/SURF feature detector.
The plane of the camera is not parallel to the ground, which makes it difficult to estimate precision by eye.
[Image: registration]

29, January, 2009 Posted by mirror2image | Uncategorized | No Comments Yet