joint work with Ying Cui, Ziyu He
We present in this paper a novel deterministic algorithmic framework that enables the computation of a directional stationary solution of the empirical deep neural network training problem, formulated as a multi-composite optimization problem with coupled nonconvexity and nondifferentiability. To our knowledge, this is the first time that such a sharp kind of stationary solution has been shown to be provably computable for a nonsmooth deep neural network. Numerical results from a MATLAB implementation demonstrate the effectiveness of the framework on reasonably sized networks.
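For background only, the stationarity concept referred to here is the standard one based on directional derivatives: for a problem \(\min_{x \in X} f(x)\) with \(f\) directionally differentiable and \(X\) closed and convex, a point \(\bar{x} \in X\) is a directional (d-)stationary point if
\[
  f'(\bar{x};\, x - \bar{x}) \;=\; \lim_{\tau \downarrow 0} \frac{f\bigl(\bar{x} + \tau(x - \bar{x})\bigr) - f(\bar{x})}{\tau} \;\ge\; 0
  \qquad \text{for all } x \in X.
\]
This is the generic definition given for context; the precise multi-composite training formulation used in the talk is not reproduced here.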
We consider high-dimensional nonconvex statistical optimization problems, such as the regularized square-root Lasso problem, in which the loss terms in the objective are possibly nonsmooth or non-Lipschitzian and the regularization terms are differences of convex functions. We introduce a majorized proximal point dual Newton algorithm (mPPDNA) for solving these large-scale problems. We construct proper majorized proximal terms that are conducive to the design of an efficient dual-based semismooth Newton method of low computational complexity. Extensive numerical results are provided.
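As one representative and deliberately simplified instance of this problem class (not the general model of the talk), the regularized square-root Lasso with a difference-of-convex penalty can be written as
\[
  \min_{x \in \mathbb{R}^p} \;\; \frac{1}{\sqrt{n}}\,\|Ax - b\|_2 \;+\; \lambda\,\bigl(g_1(x) - g_2(x)\bigr),
\]
where \(A \in \mathbb{R}^{n \times p}\), \(b \in \mathbb{R}^n\), \(\lambda > 0\), and \(g_1, g_2\) are convex; nonconvex penalties such as SCAD and MCP admit such a difference-of-convex decomposition. The square-root loss is nonsmooth wherever the residual \(Ax - b\) vanishes.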
joint work with Ying Sun, Ye Tian, Guang Cheng
We study distributed high-dimensional M-estimation problems over networks modeled as arbitrary graphs (with no centralized node). We design the first class of distributed algorithms with convergence guarantees for such Empirical Risk Minimization (ERM) problems. We prove that, under mild technical assumptions, the proposed algorithm converges at a linear rate to a solution of the ERM problem that is statistically optimal. Furthermore, for a fixed network topology, this linear rate is invariant as the problem dimension scales properly with the sample size.
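As a purely illustrative sketch, the following Python code implements a generic decentralized gradient-tracking iteration for smooth ERM over a graph; it conveys the flavor of consensus-based distributed optimization but is not claimed to be the algorithm analyzed in this work. The ring graph, mixing matrix, step size, and quadratic local losses are all illustrative assumptions.

# Generic decentralized gradient tracking over a graph (illustrative only;
# NOT the algorithm proposed in the talk). Assumptions: m agents on a ring
# graph with a doubly stochastic mixing matrix W and smooth quadratic losses.
import numpy as np

def ring_mixing_matrix(m):
    """Doubly stochastic mixing matrix for a ring graph."""
    W = np.zeros((m, m))
    for i in range(m):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % m] = 1.0 / 3.0
        W[i, (i + 1) % m] = 1.0 / 3.0
    return W

def gradient_tracking(grads, x0, W, step=0.05, iters=500):
    """
    grads: list of m local gradient oracles, grads[i](x) -> d-dimensional array
    x0:    (m, d) array of initial local iterates
    Each agent averages with its neighbors and tracks the network-average gradient.
    """
    m, d = x0.shape
    x = x0.copy()
    y = np.stack([grads[i](x[i]) for i in range(m)])  # gradient-tracking variable
    g_old = y.copy()
    for _ in range(iters):
        x = W @ x - step * y                       # consensus step + descent
        g_new = np.stack([grads[i](x[i]) for i in range(m)])
        y = W @ y + g_new - g_old                  # update the tracked average gradient
        g_old = g_new
    return x.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, d, n = 10, 5, 20
    A = [rng.standard_normal((n, d)) for _ in range(m)]
    b = [A[i] @ np.ones(d) + 0.1 * rng.standard_normal(n) for i in range(m)]
    grads = [lambda x, A=A[i], b=b[i]: A.T @ (A @ x - b) / len(b) for i in range(m)]
    x_hat = gradient_tracking(grads, np.zeros((m, d)), ring_mixing_matrix(m))
    print(x_hat)  # should be close to the all-ones ground-truth vector

In the smooth strongly convex setting, tracking the network-average gradient is the standard mechanism behind linear convergence rates of decentralized methods; whether and how this relates to the algorithm of the talk is not specified by the abstract.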