## From financial crisis to image processing: Ignore Topology At Your Own Risk

Very interesting article in Wired: Recipe for Disaster: The Formula That Killed Wall Street. I’m not a statistician, but I’ll try to explain it. The gist of the article is that at the heart of the current financial crisis is David X. Li’s formula, which uses a “Gaussian copula function” for risk estimation. The idea of the formula is that the joint probability of two random events can be estimated with a simple expression that uses only the probability distribution of each event as if they were independent, plus a single parameter: the statistical correlation. So instead of looking into the relationships and connections between events, bankers just calculated one statistical parameter and used it for risk estimation. Even more: they applied the same formula to the results of those relatively simple calculations and built pyramids of estimations, each step applying the same simple formula to the results of the previous one. As a result, extremely complex behavior was reduced to a simple linear model that had little in common with reality.

And now, an illustration from Wikipedia of what exactly this single parameter, correlation, is:

Here are several two-variable distributions and their correlation coefficients. For linear relationships (middle row), correlation captures the dependence of the variables perfectly. For the upper row, normal distributions, it captures the essence of the dependency: if we know one variable and the correlation, we can say something about the other. But for the complex shapes in the lower row, correlation is zero in every case. Described by correlation alone, each of the lower shapes becomes the upper central shape (a fuzzy ball). Correlation captures nil information about how one variable depends on another for the lower shapes; it allows any shape to be represented only as a fuzzy ellipse. Li’s formula reduces dimensionality. The thing is, dimensionality is a topological property, and you don’t mess with topological properties easily. Imagine bankers using a fuzzy ball instead of a ring for risk estimation…
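The lower-row point is easy to verify numerically. In this sketch, points on a unit ring have an exact deterministic relationship (x² + y² = 1), yet their correlation coefficient comes out essentially zero, while a genuinely linear relationship gives correlation 1:

```python
import numpy as np

# Points evenly spaced on a unit circle: x and y are perfectly dependent
# (x**2 + y**2 == 1), but the *linear* part of that dependence is zero.
t = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
x, y = np.cos(t), np.sin(t)
corr = np.corrcoef(x, y)[0, 1]        # essentially 0

# For comparison: a linear relationship, where correlation captures everything.
corr_linear = np.corrcoef(x, 2.0 * x + 1.0)[0, 1]  # essentially 1
```

This is exactly the failure mode described above: a single number that reports “no relationship” for a shape as rigidly structured as a ring.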

Now to image processing. Most feature detection in image processing is done on grayscale images. The original image is usually RGB, but before feature extraction it is converted to grayscale.

But if the original image is colored, why not use the colors for feature detection? For example, detect features in each color channel separately?

The thing is, the pictures in the individual color channels are very similar.

Extracting blobs in each channel will in most cases triple the work without gaining significant new information: all the channels will give roughly the same blobs.

Nevertheless, it’s obvious that there is some nontrivial information about the image encoded in its colors.

Why doesn’t blob detection on each color channel give access to it?

The reason is the same as for the current financial crisis: dimensionality. Treating each color channel separately, we replace the five-dimensional RGB+coordinates space with three three-dimensional color+coordinates spaces. The relationships between the color channels are lost. The topology of the color structure is lost.

To actually use color information, the statistical relationships between the colors of the image should be explored: something like a three-dimensional color-bin histogram, essentially converting the image from RGB to indexed color.
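A minimal sketch of such a three-dimensional color-bin histogram (the function name and bin count are my own choices, not from the post): each RGB channel is quantized into `bins` levels, so every pixel maps to one joint (r, g, b) bin, preserving the relationships between channels instead of histogramming each channel separately.

```python
import numpy as np

def color_histogram_3d(image, bins=8):
    """image: H x W x 3 uint8 array; returns a (bins, bins, bins) count array."""
    # Map 0..255 intensities to 0..bins-1 bin indices, per channel.
    quantized = (image.astype(np.int64) * bins) // 256
    # Combine the three per-channel indices into a single joint bin index.
    flat = (quantized[..., 0] * bins + quantized[..., 1]) * bins + quantized[..., 2]
    counts = np.bincount(flat.ravel(), minlength=bins ** 3)
    return counts.reshape(bins, bins, bins)

# Example: a tiny synthetic "image", half red and half blue.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:2] = [255, 0, 0]   # red half
img[2:] = [0, 0, 255]   # blue half
hist = color_histogram_3d(img, bins=2)
```

The joint bin index is in effect the palette index of an indexed-color image, which is the conversion the paragraph above describes.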

## Markerless tracking with FAST

Testing outdoor markerless tracking with FAST/SURF feature detector.

The plane of the camera is not parallel to the ground, which makes it difficult for the eye to estimate precision.

## FAST with SURF descriptor

Features detected with multistage FAST and fitted with SURF descriptors.

A less strict threshold gives a lot more correspondences, but also some false positives.

## Multiscale FAST detector

Experimenting with a multiscale FAST detector on images from a cell phone camera.

so far so good…

## Testing FAST feature detector

Testing the FAST feature detector on Mikolajczyk’s dataset. Here scale space seems actually useful. On the “brick wall” dataset, repeatability goes from .3 to .7 with scale going from 0 to 2^3 and the threshold/barrier lowered from 40 to 20.
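For reference, this is roughly what the repeatability score measures. The sketch below is simplified (my own reconstruction, not the post's code): the fraction of keypoints in image A that land within `eps` pixels of some keypoint in image B after mapping through the dataset's ground-truth homography. The full Mikolajczyk protocol compares region overlap rather than just center distance.

```python
import numpy as np

def repeatability(pts_a, pts_b, H, eps=1.5):
    """Fraction of A's keypoints with a B keypoint within eps px after warping by H."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    # Project A's keypoints into B's frame using homogeneous coordinates.
    ones = np.ones((len(pts_a), 1))
    proj = np.hstack([pts_a, ones]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]
    # Distance from each projected point to its nearest keypoint in B.
    d = np.linalg.norm(proj[:, None, :] - pts_b[None, :, :], axis=2)
    return np.mean(d.min(axis=1) <= eps)

# Sanity checks with an identity homography.
pts = [[10.0, 10.0], [50.0, 20.0], [30.0, 40.0]]
r = repeatability(pts, pts, np.eye(3))                            # every point matches itself
r_shift = repeatability(pts, [[p[0] + 10, p[1]] for p in pts], np.eye(3))  # all misses
```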

## From SURF to FAST

I did some research on SURF optimization. While it is still possible to make it significantly faster with lazy evaluation, the problem of scale remains. Fine-scale features are not detectable at the coarser scales, so it doesn’t look like there is an easy way to reduce the search area using only the upper scales. If scale space doesn’t help to reduce the search area, it becomes a liability for mobile tracking: the range can’t change too fast for a mobile phone, so the scale of a feature will stay about the same between frames.

Will try plain, non-scale-space corner detectors now, starting with FAST.
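The core of FAST is the segment test, sketched below in plain numpy (a simplified reconstruction: the published detector uses a machine-learned decision tree to reach the same answer much faster, and adds non-maximum suppression). A pixel is a corner if at least `n` contiguous pixels on the 16-pixel Bresenham circle of radius 3 around it are all brighter than it by more than `t`, or all darker:

```python
import numpy as np

# The 16 offsets of the radius-3 Bresenham circle, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, t=20, n=9):
    """Segment test: n contiguous circle pixels all brighter or all darker by > t."""
    p = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    for sign in (1, -1):          # +1: brighter-than run, -1: darker-than run
        run = 0
        for v in ring + ring:     # ring doubled so runs can wrap around
            if sign * (v - p) > t:
                run += 1
                if run >= n:
                    return True
            else:
                run = 0
    return False

# A bright square corner on a dark background.
img = np.zeros((16, 16), dtype=np.uint8)
img[8:, 8:] = 255
```

At the square's corner pixel (8, 8) the test fires; in a flat region it does not.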

## SURF scale space

I continue to test SURF with respect to scale space. Scale space is essentially a pyramid of progressively more blurred or lower-resolution images. The idea of scale-invariant feature detection is that a “real” feature should be present at several scales, that is, clearly detectable at several resolution/blur levels. The interesting thing I see is that for SURF, at least on the test images from Mikolajczyk’s dataset, scale space doesn’t seem to affect the detection rate under viewpoint change. I mean that it makes no difference whether a feature is distinct at several scales or only at one. That’s actually reasonable: scale space obviously benefits detection in blurred or noisy images, and repeatability/correspondence in scaled images, while the “viewpoint” images from Mikolajczyk’s dataset are clear, high-resolution, and at about the same scale. Nevertheless, there is some possibility for optimization here.
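A rough sketch of the pyramid structure described above, with Gaussian blur approximated by repeated 3x3 box filtering (SURF itself approximates Gaussians with box filters over an integral image; this toy version just illustrates the octave/blur-level layout):

```python
import numpy as np

def box_blur(img):
    """3x3 box filter with edge padding, a crude stand-in for Gaussian blur."""
    pad = np.pad(img, 1, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += pad[1 + dy: 1 + dy + img.shape[0], 1 + dx: 1 + dx + img.shape[1]]
    return out / 9.0

def pyramid(img, octaves=3, blurs=2):
    """Scale space: each octave halves resolution; within an octave, blur increases."""
    img = img.astype(float)
    levels = []
    for _ in range(octaves):
        level = [img]
        for _ in range(blurs):
            level.append(box_blur(level[-1]))
        levels.append(level)
        img = img[::2, ::2]  # downsample for the next octave
    return levels

img = np.random.rand(64, 64)
pyr = pyramid(img)  # 3 octaves of 64x64, 32x32, 16x16, each with 3 blur levels
```

A scale-invariant detector would then look for features that stand out at several of these levels at once.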

## Testing SURF-based detectors

I have tested several modifications of SURF: the original SURF Hessian, extrema of a SURF-based Laplacian, Hessian-Laplace (extrema of both the Hessian and the Laplacian), and the minimal eigenvalue of the Hessian. They all give about the same detection rate, but the original SURF Hessian gives better results. The minimal eigenvalue of the Hessian seems to scale better with the threshold value: the absolute value of the original Hessian measure can be very low, but the eigenvalues are not. So this approach may have some advantage where there are potential precision-loss problems, for example in fixed-point calculations. A lot of high-end mobile phones still ship without hardware floating point, so it could still be useful in AR or computer vision applications.
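The two measures being compared can be written out explicitly. For the symmetric 2x2 Hessian [[Dxx, Dxy], [Dxy, Dyy]], the determinant is Dxx·Dyy − Dxy², and the minimal eigenvalue is (Dxx+Dyy)/2 − sqrt(((Dxx−Dyy)/2)² + Dxy²). Below is a sketch using plain finite differences (SURF itself uses box-filter approximations of these second derivatives, so the numbers differ, but the comparison of the two measures is the same):

```python
import numpy as np

def hessian_measures(img):
    """Return (determinant, minimal eigenvalue) of the 2x2 Hessian per interior pixel."""
    img = img.astype(float)
    # Second derivatives by central finite differences on the interior.
    Dxx = img[1:-1, :-2] - 2 * img[1:-1, 1:-1] + img[1:-1, 2:]
    Dyy = img[:-2, 1:-1] - 2 * img[1:-1, 1:-1] + img[2:, 1:-1]
    Dxy = (img[2:, 2:] - img[2:, :-2] - img[:-2, 2:] + img[:-2, :-2]) / 4.0
    det = Dxx * Dyy - Dxy ** 2
    trace_half = (Dxx + Dyy) / 2.0
    min_eig = trace_half - np.sqrt(((Dxx - Dyy) / 2.0) ** 2 + Dxy ** 2)
    return det, min_eig

# Example: a single bright pixel; at its center Dxx = Dyy = -2, Dxy = 0,
# so det = 4 and the minimal eigenvalue is -2.
img = np.zeros((5, 5))
img[2, 2] = 1.0
det, min_eig = hessian_measures(img)
```

The fixed-point argument follows from the formulas: the determinant multiplies two small second derivatives (so it can underflow a fixed-point range), while each eigenvalue stays on the same order as the derivatives themselves.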

## Markerless tracking

I’ve started experimenting with markerless tracking. I’ve captured several cityscape image sequences and processed them with the SURF detector, using Nokia N95 viewfinder frames. Here the descriptors were oriented:

There are some corresponding features detected in both images, but their descriptors don’t match.

Interestingly, upright (non-oriented) descriptors give a slightly different picture: some new correspondences are found, some are lost.