I seldom post in the this blog now, mostly because I’m positing on twitter and G+ a lot lately. I still haven’t figured out which post should go where – blog, G+ or twitter, so it’s kind of chaotic for now.
What of interest is going on: There are two paper on the CVPR11 which claim that compressed sensing(sparse recovery) is not applicable to some of the most important computer vision tasks:
Is face recognition really a Compressive Sensing problem?
Are Sparse Representations Really Relevant for Image Classification?
Both paper claim that space of the natural images(or their important subsets) are not really sparse.
Those claims however dont’t square with claim of high effectiveness of compact signature of Random Ferns.
Could both of those be true? In my opinion – yes. Difference of two approaches is that first two paper assumed explicit sparsity – that is they enforced sparsity on the feature vector. Compressed signature approach used implicit sparsity – feature vector underling the signature is assumed sparse but is not explicitly reconstructed. Why compressed signature is working while explicit approach didn’t? That could be the case if image space is sparse in the different coordinate system – that is here one is dealing with the union of subspaces. Assumption not of the simple sparsity, but of the union of subspaces is called blind compressed sensing.
Now if we look at the space of the natural images it’s easy to see why it is low dimensional. Natural image is the image of some scene, an that scene has limited number of moving object. So dimension of space images of the scene is approximately the sum of degree of freedom of the objects(and camera) of the scene, plus effects of occlusions, illumination and noise. Now if the add strong enough random error to the scene, the image is stop to be the natural image(that is image of any scene). That mean manifold of the images of the scene is isolated – there is no natural images in it’s neighborhood. That hint that up to some error the space of the natural images is at least is the union of isolated low-dimensional manifolds. The union of mainfolds is obviously is more complex structure than the union of subspace, but methods of blind compressed sensing could be applicable to it too. Of cause to think about union of manifolds could be necessary only if the space of images is not union of subspace, which is obviously preferable case