Mirror Image

Mostly AR and Stuff

Minimum sum of distance vs L1 and geometric median

All this post is just a more detailed explanation of the end of the previous post.
Assume we want to estimating a state x \in R^n from m \gg n noisy linear measurements y \in R^m, y = Ax + z, z – noise with outliers, like in the paper by Sharon, Wright and Ma Minimum Sum of Distances Estimator: Robustness and Stability
Sharon at al show that minimum L_1 norm estimator, that is
arg \min_{x}  \sum_{i=1}^{m} \| a_i^T x - y_i \|_{1}
is a robust estimator with stable breakdown point, not depending on the noise level. What Sharon did was to use as cost function the sum of absolute values of all components of errors vector. However there are exists another approach.
In one-dimensional case minimum L_1 norm is a median.But there exist generalization of median to R^ngeometric median. In our case it will be
arg \min_{x}  \sum_{i=1}^{m} \| A x - y_i \|_{2}
That is not a least squares – minimized the sum of L_2 norm, not the sum of squares of L_2 norm.
Now why is this a stable and robust estimator? If we look at the Jacobian
\sum_{i=1}^{m} A^T \frac{A x - y_i}{ \| A x - y_i \|_{2}}
we see it’s asymptotically constant, and it’s norm doesn’t depend on the norm of the outliers. While it’s not a formal proof it’s quite intuitive, and can probably be formalized along the lines of Sharon paper.
While first approach with L_1 norm can be solved with linear programming, for example simplex method and interior point method, the second approach with L_2 norm can be solved with second order cone programming and …surprise, interior point method again.
For interior point method, in both cases original cost function is replaced with
\sum f
And the value of f is defined by constraints. For L_1
f_{i} \ge a_ix-y_i, f_{i} \ge -a_ix+y_i
Sometimes it’s formulated by splitting absolute value is into the sum of positive and negative parts
f_{+_{i}} \ge a_ix-y_i, f_{-_{i}} \ge -a_ix+y_i, f_{+_{i}} \ge 0, f_{-_{i}} \ge 0
And for L_2 it’s a simple
f_i \ge \| A x - y_i \|_{2}
Formulations are very similar, and stability/performance are similar too (there was a paper about it, just had to dig it out)

10, April, 2011 Posted by | sci | , , , , , | 2 Comments