Split-Bregman for total variation: parameter choice
Split Bregamn method is one of the most used method for solving convex optimization problem with non-smooth regularization. Mostly it used for regularization, and here I want to talk about TV-L1 and similar regularizers ( like
)
For simplest possible one-dimentional denoising case
(1)
Split Bregman iteration looks like
(2)
(3)
(4)
There is a lot of literature about choosing parameter from (1) for denosing. Methods include Morozov discrepancy principle, L-curve and more – that’s just parameter of Tikhonov regularization.
There is a lot less metods for choosing for Split Bregman.
Theoretically shouldn’t be very important – if Split Bregman converge solution is not depending on
. In practice of cause convergence speed and stability depend on
a lot. Here I’ll point out some simple consideration which may help to choose
Assume for simplicity that u is twice differentiable almoste everythere, the (1) become
(5)
We know if Split Bbregman converge
that mean , here exteranl derivative is a weak derivative.
For the solution b is therefore locally constant, and we even know what constant it is.
Recalling
For we have
and accordingly
For converged solution
everythere where is not zero, with possible exeption of measure zero set.
Now returning to choice of for Split Bregman iterations.
From (4) we see that for small values b , b grow with step until it reach it’s maximum absolute value
Also if the b is “wrong”, if we want for it to switch value in one iteration should be more than
That give us lower boundary for :
For most of x
What happens if on some interval b reach for two consequtive iterations?
Then on the next iteration form (5) and (3) on that interval
or
See taht with big enuogh sign of
stabilizing with high probaility and we could be close to true solution. The “could” in the last sentence is there because we are inside the interval, and if the values of u on the ends of interval are wrong b “flipping” can propagate inside our interval.
It’s more difficult to estimate upper boundary for .
Obviously for more easy solution of (2) or (5) shouldn’t be too big, so that eigenvalues of operator
woudn’t be too close to 1. Because solutions of (2) are inexact in Split Breagman we obviously want
having bigger eigenvalues, so that single (or small number of) iteration could suppress error good enough for (2) subproblem.
So in conclusion (and my limited experience) if you can estimate for solution
could be a good value to start testing.
Interesting. Since split Bregman is equivalent to ADMM, have you looked at Boyd’s ADMM paper? He mentions that in the 90′s some people proposed to choose lambda dynamically in such a way that the primal and dual residual converge to zero at roughly the same rate. Although it isn’t a panacea, I had great experience with this rule on denoising and even some tough deblurring problems. Is your method equivalent to this, and if not, which do you think is better?
By the way, this sort of thing is exactly why I think every paper should mention that split Bregman is ADMM…
Primal and dual residual convergence similarity is as I undersyand similar (or is) Morozov discrepancy principle. That could be applyed to mu, but I’m not sure if it would be good for lambda. Fixing of lambda is one of the main advantages of Split Bregman. Another advantage of SB is “error forgetting” – one step residual zeroing for piecewise linear solution and changing lambda wouldn’t accelerate that (there is and interesting paper “error forgetting in Bregman iterations”). However lambda could be changed spatially as lambda(x), that could be helpful if it’spossible to estimate du/dx locally.
Which method is better probably depend on the specific of the problem. Upper bound for lambda is clearly depend on the operator L (5), and it’s usually solved inexactly, with big residue, so it’s depend not only on L but on the method of solving it too. I don’t see any easy rule of choice here, only experimenting…