## December finds in #arxiv

Repost from my Google+ stream.

**Computer Vision**

*Non-Local means is a local image denoising algorithm*

The paper shows that non-local means weights do not identify matching patches globally across the image, but are susceptible to the aperture problem:

http://en.wikipedia.org/wiki/Optical_flow#Estimation That’s why short-radius NLM can be better than large-radius NLM. The small-radius cutoff plays the role of a regularizer, similar to the smoothness regularizer in Horn-Schunck optical flow.

http://en.wikipedia.org/wiki/Horn%E2%80%93Schunck_method (my comment – TV-L1 regularization is generally better than the quadratic (L2) smoothness term of the original Horn-Schunck)

http://arxiv.org/abs/1311.3768
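
To make the role of the search radius concrete, here is a minimal and deliberately naive NLM sketch. It is not the paper's code: the function name `nlm_denoise` and the parameters `patch_radius`, `search_radius` and `h` are my own choices for illustration, and the patch distance is a plain mean squared difference.

```python
import numpy as np

def nlm_denoise(img, patch_radius=1, search_radius=3, h=0.1):
    """Naive non-local means for a 2-D float image (slow, for illustration)."""
    pad = patch_radius + search_radius
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=float)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            ci, cj = i + pad, j + pad  # pixel position in padded coordinates
            ref = padded[ci - patch_radius:ci + patch_radius + 1,
                         cj - patch_radius:cj + patch_radius + 1]
            weight_sum = 0.0
            value_sum = 0.0
            # Only patches within search_radius of the pixel are compared;
            # this cutoff is the "regularizer" discussed above.
            for di in range(-search_radius, search_radius + 1):
                for dj in range(-search_radius, search_radius + 1):
                    ni, nj = ci + di, cj + dj
                    cand = padded[ni - patch_radius:ni + patch_radius + 1,
                                  nj - patch_radius:nj + patch_radius + 1]
                    dist2 = np.mean((ref - cand) ** 2)
                    w = np.exp(-dist2 / (h * h))
                    weight_sum += w
                    value_sum += w * padded[ni, nj]
            out[i, j] = value_sum / weight_sum
    return out
```

Running it with a small and a large `search_radius` on the same noisy image is a quick way to try the short-radius vs large-radius comparison for yourself.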

**Deep Learning**

*Do Deep Nets Really Need to be Deep?*

The authors state that shallow neural nets can in fact achieve performance similar to deep convolutional nets. The problem, though, is that they have to be initialized or preconditioned – they cannot be trained directly using existing algorithms.

And that initialization requires deep nets. The authors hypothesize that there should be algorithms that can train such shallow nets to the same performance as deep nets.

http://arxiv.org/abs/1312.6184

*Intriguing properties of neural networks*

Linear combinations of deep-level nodes produce the same results as the original nodes. That suggests that the space itself, rather than its representation in individual nodes, carries the information at deep levels.

The input-output mapping is also discontinuous – small perturbations cause misclassification. Those perturbations do not depend on the training, only on the input being classified. (My comment – sparse coding is generally not smooth in its input, another argument that sparse coding is part of the internal mechanics of deep learning)

http://arxiv.org/abs/1312.6199

*From Maxout to Channel-Out: Encoding Information on Sparse Pathways*

This paper starts with the observation that maxout is a form of sparse coding: only one of the input pathways is chosen for further processing. From this the authors develop the principle further:

Remove the “middle” layer that “chooses” the maximum input, and pass the maximal input directly to the next level, making the choice function index-aware. Some choice functions other than max are considered, but max still seems the best.

The piecewise-constant choice function makes an interesting connection to the previous paper (discontinuity of the input-output mapping).

http://arxiv.org/abs/1312.1909
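
A small sketch of the difference, as I understand it from the description above. The function names `maxout` and `channel_out` and the flat 1-D grouping are my own simplifications, not the paper's code: maxout collapses each group to its maximum, while the index-aware version keeps the winner in its own slot and zeroes the rest, so the next layer knows which pathway fired.

```python
import numpy as np

def maxout(z, group_size):
    """Collapse each group of units to its maximum (the index is lost)."""
    groups = z.reshape(-1, group_size)
    return groups.max(axis=1)

def channel_out(z, group_size):
    """Keep the winning unit of each group in place and zero the others."""
    groups = z.reshape(-1, group_size)
    winners = groups.argmax(axis=1)  # ties go to the first unit
    mask = np.zeros_like(groups)
    mask[np.arange(groups.shape[0]), winners] = 1
    return (groups * mask).reshape(z.shape)

z = np.array([1.0, 3.0, -2.0, -1.0])
print(maxout(z, 2))       # one value per group
print(channel_out(z, 2))  # same length as z, one nonzero per group
```

Both functions are piecewise linear in `z`, but the selection itself (which index wins) is piecewise constant.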

*Unsupervised Feature Learning by Deep Sparse Coding*

This one, for a change, is not about convolutional networks.

Instead, SIFT (or similar) descriptors are used to produce a bag-of-words, sparse coding is used with max-out, and manifold learning is applied to the result. (http://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction)

http://arxiv.org/abs/1312.5783

*Generative NeuroEvolution for Deep Learning*

I’m generally wary of evolutionary methods, but this looks kind of interesting – it’s based on a *compositional pattern producing network* (CPPN), which encodes a geometric pattern as a composition of simple functions.

This CPPN is used to encode the connectivity pattern of an ANN (convolutional networks are used most). Thus the complete process is a combination of ANN training and evolutionary CPPN training.

http://arxiv.org/abs/1312.5355

*Some Improvements on Deep Convolutional Neural Network Based Image Classification*, *Unsupervised feature learning by augmenting single images*

Both papers seem to be about the same subject – squeezing more out of labeled images by applying a lot of transformations to them. (Some of those transformations are implemented in cuda-convnet, BTW.)

http://arxiv.org/abs/1312.5402, http://arxiv.org/abs/1312.5242

*Exact solutions to the nonlinear dynamics of learning in deep linear neural networks*

Analytical exploration of a toy 3-layer model *without* actual non-linear neurons. The model is completely linear in the input (polynomial in the weights). Nevertheless it shows some interesting properties, like a step in the learning curve.

http://arxiv.org/abs/1312.6120
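
The step in the learning curve is easy to reproduce numerically. The sketch below is my own toy setup, not the paper's derivation: a 3-layer linear net (two weight matrices) fitted by plain gradient descent to a fixed linear map, assuming whitened inputs so that the loss reduces to `0.5 * ||target - w2 @ w1||^2`. The names `train_deep_linear`, `hidden`, `lr` and `init_scale` are illustrative.

```python
import numpy as np

def train_deep_linear(target, hidden=4, lr=0.05, steps=500,
                      init_scale=1e-3, seed=0):
    """Gradient descent on 0.5 * ||target - w2 @ w1||^2, small random init."""
    rng = np.random.default_rng(seed)
    n_out, n_in = target.shape
    w1 = init_scale * rng.standard_normal((hidden, n_in))
    w2 = init_scale * rng.standard_normal((n_out, hidden))
    losses = []
    for _ in range(steps):
        err = target - w2 @ w1
        losses.append(0.5 * np.sum(err ** 2))
        # Negative gradients of the loss with respect to w1 and w2.
        step_w1 = w2.T @ err
        step_w2 = err @ w1.T
        w1 += lr * step_w1
        w2 += lr * step_w2
    return np.array(losses)

# Two modes of different strength: each one is learned in its own step.
losses = train_deep_linear(np.diag([3.0, 1.0]))
```

Plotting `losses` should show a long plateau followed by fairly sharp drops, with the stronger mode being picked up first, even though there is no non-linearity anywhere in the model.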

**Optimization**

*Distributed Interior-point Method for Loosely Coupled Problems*

Mixes together all my favorite methods – interior point, Newton, ADMM (Split Bregman) – into one algorithm and makes a distributed implementation of it.

Mixing Newton with ADMM, and ADMM with interior point, looks risky to me, though with a lot of subiterations it may work (that’s why it’s distributed – it requires a lot of computing power).

Also I’m not sure about the convergence of the combined algorithm – each step’s convergence is proven, but I’m not sure the same applies to the combination.

Newton and ADMM have somewhat contradictory optimal conditions – smoothness vs. piecewise linearity. I would like to see more research on this…

http://arxiv.org/abs/1312.5440

*Total variation regularization for manifold-valued data*

Proximal mapping and soft thresholding for manifolds – analog of ADMM for manifolds.

http://arxiv.org/abs/1312.7710
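
For reference, here is what these two building blocks look like in the flat (Euclidean, scalar) case. This is only my sketch of the standard operators, not the paper's manifold version; there, as I understand it, the straight-line shifts are replaced by shifts along geodesics. The names `soft_threshold` and `prox_tv_pair` are my own.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal mapping of lam * |x|: shrink towards zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_tv_pair(x, y, lam):
    """Proximal mapping of lam * |x - y| with respect to the pair (x, y).

    Both points move towards each other by min(lam, |x - y| / 2).
    """
    diff = y - x
    step = np.minimum(lam, np.abs(diff) / 2.0)
    direction = np.sign(diff)
    return x + step * direction, y - step * direction

print(soft_threshold(np.array([3.0, -0.5, -2.0]), 1.0))
print(prox_tv_pair(0.0, 10.0, 1.0))
```

The pairwise version is just soft thresholding applied to the difference `y - x`, split evenly between the two points.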

**Just interesting stuff**

*Coping with Physical Attacks on Random Network Structures*

Includes finding vulnerable spots and the results of random attacks.

(My comment – shouldn’t it be connected to percolation theory?)

http://arxiv.org/abs/1312.6189

## Finds in arxiv, October

This duplicates my ongoing G+ series of posts on arXiv papers I find interesting. Older posts are not here but in my G+ thread.

Finds in #arxiv:

*Optimization, numerical & convex, ML*

The Linearized Bregman Method via Split Feasibility Problems: Analysis and Generalizations

Reformulation of Split Bregman/ADMM as a split feasibility problem, with an algorithm and convergence analysis for generalized split feasibility via Bregman projections. This general formulation includes both Split Bregman and Kaczmarz (my comment – it seems randomized Kaczmarz could fit here too).

http://arxiv.org/abs/1309.2094

Stochastic gradient descent and the randomized Kaczmarz algorithm

Hybrid of randomized Kaczmarz and stochastic gradient descent – into my “to read” pile

http://arxiv.org/abs/1310.5715
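
As a reminder of the baseline, below is a minimal sketch of plain randomized Kaczmarz for a consistent system `a @ x = b`, with rows sampled proportionally to their squared norm. This is the standard textbook variant, not the hybrid from the paper, and the function name `randomized_kaczmarz` and its parameters are my own.

```python
import numpy as np

def randomized_kaczmarz(a, b, iters=2000, seed=0):
    """Solve a consistent system a @ x = b by random row projections."""
    rng = np.random.default_rng(seed)
    row_norms2 = np.sum(a ** 2, axis=1)
    probs = row_norms2 / row_norms2.sum()
    x = np.zeros(a.shape[1])
    for _ in range(iters):
        i = rng.choice(a.shape[0], p=probs)
        # Project x onto the hyperplane a[i] @ x = b[i].
        x += (b[i] - a[i] @ x) / row_norms2[i] * a[i]
    return x

rng = np.random.default_rng(1)
a = rng.standard_normal((20, 5))
x_true = rng.standard_normal(5)
x_hat = randomized_kaczmarz(a, a @ x_true)
```

Each update looks like a stochastic gradient step on the least-squares term of a single row, `0.5 * (a[i] @ x - b[i])**2`, with step size `1 / ||a[i]||^2`, which is the link to SGD.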

Trust–Region Problems with Linear Inequality Constraints: Exact SDP Relaxation, Global Optimality and Robust Optimization

“Extended” trust region for linear inequality constraints

http://arxiv.org/abs/1309.3000

Conic Geometric Programming

Unifying framework for conic and geometric programming

http://arxiv.org/abs/1310.0899

http://en.wikipedia.org/wiki/Geometric_programming

http://en.wikipedia.org/wiki/Conic_programming

Gauge optimization, duality, and applications

Another big paper about a different, non-Lagrangian duality, introduced by Freund (1987)

http://arxiv.org/abs/1310.2639

Color Bregman TV

The mu parameters in Split Bregman are made adaptive, to exploit the coherence of edges in different color channels

http://arxiv.org/abs/1310.3146

Iteration Complexity Analysis of Block Coordinate Descent Methods

Some convergence analysis for BCD and projected gradient BCD

http://arxiv.org/abs/1310.6957

Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation

Nonnegative matrix factorization

http://arxiv.org/abs/1310.7529

Scaling SVM and Least Absolute Deviations via Exact Data Reduction

SVM for large-scale problems

http://arxiv.org/abs/1310.7048

Image Restoration using Total Variation with Overlapping Group Sparsity

While the title is promising, I have doubts about this paper. The method the authors suggest is equivalent to adding an averaging filter to TV-L1 under the L1 norm. There is no comparison to simply alternating TV-L1 with a smoothing filter. The suggested method is very costly, and using a median filter instead of averaging would cost the same while being obviously more robust.

http://arxiv.org/abs/1310.3447

*Deep learning*

Deep and Wide Multiscale Recursive Networks for Robust Image Labeling

_Open source_ MATLAB/C package coming soon (not yet available)

http://arxiv.org/abs/1310.0354

Improvements to deep convolutional neural networks for LVCSR

Convolutional networks and dropout for speech recognition

http://arxiv.org/abs/1309.1501v1

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

Already discussed on G+ – an open source framework in the “learn once, use everywhere” style. Learning is done offline on a GPU using ConvNet, and recognition is online in pure Python.

http://arxiv.org/abs/1310.1531

Statistical mechanics of complex neural systems and high dimensional data

Big textbook-like overview paper on statistical mechanics of learning. I’ve put it in my “to read” pile.

http://arxiv.org/abs/1301.7115

Randomized co-training: from cortical neurons to machine learning and back again

“Selectron” instead of perceptron – neurons are “specializing” with weights.

http://arxiv.org/abs/1310.6536

Provable Bounds for Learning Some Deep Representations

http://arxiv.org/abs/1310.6343

Citation: “The current paper presents both an interesting family of denoising autoencoders as

*Computer vision*

Online Unsupervised Feature Learning for Visual Tracking

Sparse representation, overcomplete dictionary

http://arxiv.org/abs/1310.1690

From Shading to Local Shape

Shape restoration from local shading – could be very useful in low-feature environments.

http://arxiv.org/abs/1310.2916

Fast 3D Salient Region Detection in Medical Images using GPUs

Finding interest points in 3D images

http://arxiv.org/abs/1310.6736

Object Recognition System Design in Computer Vision: a Universal Approach

Grid-based universal framework for object detection/classification

http://arxiv.org/abs/1310.7170

*Gaming :)*

Lyapunov-based Low-thrust Optimal Orbit Transfer: An approach in Cartesian coordinates

For space sim enthusiasts :)

http://arxiv.org/abs/1310.4201