joint work with Daoli Zhu
In this paper, we develop a new variational approach on level sets aimed at the linear convergence of a variable Bregman stochastic coordinate descent (VBSCD) method for a broad class of nonsmooth and nonconvex optimization problems. Along the way, we not only derive the convergence of VBSCD (every accumulation point of the sequence it generates is a critical point) but also establish the $Q$-linear convergence rate of $\{\mathbb{E}_{\xi_{k-1}}F(x^k)\}$ and the almost sure $R$-linear convergence rate of $\{x^k\}$.
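The abstract does not spell out the variable Bregman machinery, but the coordinate-descent backbone can be sketched. The following Python snippet is a generic randomized coordinate descent on an illustrative least-squares problem; the function names, the quadratic objective, and the fixed step size are assumptions for illustration, and the variable Bregman proximal term of VBSCD is omitted.

```python
import numpy as np

def coordinate_descent(grad_i, x0, step, n_iters=2000, seed=0):
    """Randomized coordinate descent: each iteration samples one coordinate
    uniformly at random and takes a gradient step along it. This is only the
    coordinate-descent backbone, not the VBSCD method, which additionally
    uses a variable Bregman proximal term."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(n_iters):
        i = rng.integers(x.size)          # pick a random coordinate
        x[i] -= step * grad_i(x, i)       # partial-gradient step
    return x

# Illustrative problem: minimize f(x) = 0.5 * ||A @ x - b||^2.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 1.0])
grad_i = lambda x, i: A[:, i] @ (A @ x - b)   # i-th partial derivative
x = coordinate_descent(grad_i, np.zeros(2), step=0.2)
# x approaches the least-squares solution [1.0, 1.0]
```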
joint work with Sindri Magnússon, Mikael Johansson
The sheer scale of modern data necessitates information compression in distributed optimization for machine learning. Gradient compression schemes using memory-based error compensation have displayed superior performance in practice. We show how these approaches can be improved by using Hessian information in the compression. We derive explicit expressions for the accuracy gains on strongly convex and nonconvex optimization problems, and illustrate the superiority of our approach in terms of convergence speed and solution accuracy in numerical examples with 1-bit compression.
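As a rough sketch of the baseline the talk improves on, here is memory-based error compensation with 1-bit (scaled-sign) compression in Python. The quadratic objective, learning rate, and compressor details are illustrative assumptions, and the Hessian-aided correction of the talk is not shown.

```python
import numpy as np

def sign_compress(v):
    """1-bit (scaled-sign) compression: transmit only the signs of v plus
    a single scale, here the mean absolute value of its entries."""
    return np.sign(v) * np.mean(np.abs(v))

def ec_step(x, grad, memory, lr):
    """One step of error-compensated compressed gradient descent: compress
    the gradient plus the carried-over error, apply the compressed update,
    and store the new residual in memory. (The talk's Hessian-aided variant
    would additionally weight the memory term with curvature information.)"""
    corrected = grad + memory
    compressed = sign_compress(corrected)
    return x - lr * compressed, corrected - compressed

# Illustrative problem: minimize f(x) = 0.5 * ||x - target||^2.
target = np.array([1.0, -2.0, 0.5])
x, memory = np.zeros(3), np.zeros(3)
for _ in range(2000):
    x, memory = ec_step(x, x - target, memory, lr=0.1)
# error compensation drives x to target despite 1-bit messages
```

Without the memory term, the same scaled-sign updates would stall in a neighborhood of the solution; feeding the compression residual back into the next step is what recovers convergence.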
joint work with Chenxin Ma, Naga Venkata C. Gudapati, Majid Jahani, Martin Takáč
This talk discusses several first-order methods for minimizing strongly convex functions in both the smooth and composite cases. The algorithms, based on efficiently updating lower bounds on the objective function, have natural stopping conditions that provide the user with a certificate of optimality. Convergence of all algorithms is guaranteed via a new Underestimate Sequence framework: the algorithms converge linearly, with the accelerated variants converging at the optimal linear rate.
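The certificate idea can be made concrete: for a $\mu$-strongly convex $f$, every gradient evaluation yields the global lower bound $f^* \ge f(x) - \|\nabla f(x)\|^2/(2\mu)$, so the gap between $f(x)$ and the best lower bound seen so far certifies suboptimality. The Python sketch below pairs plain gradient descent with this certificate on an assumed quadratic example; it is not the Underestimate Sequence framework itself, which builds tighter, accumulated lower bounds and accelerated variants.

```python
import numpy as np

def gd_with_certificate(f, grad, x0, mu, L, tol=1e-8, max_iters=10000):
    """Gradient descent on a mu-strongly convex, L-smooth f that maintains a
    running lower bound on the optimal value f*: strong convexity gives
    f* >= f(x) - ||grad f(x)||^2 / (2 * mu) at every iterate, so the gap
    f(x) - lower_bound is a certificate of (sub)optimality."""
    x = x0.astype(float).copy()
    lower_bound = -np.inf
    for _ in range(max_iters):
        g = grad(x)
        lower_bound = max(lower_bound, f(x) - g @ g / (2 * mu))
        if f(x) - lower_bound <= tol:   # certified optimality gap
            break
        x -= g / L                      # standard 1/L gradient step
    return x, f(x) - lower_bound

# Illustrative problem: f(x) = 0.5 * x' Q x with mu = 1, L = 10.
Q = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x
x, gap = gd_with_certificate(f, grad, np.array([1.0, 1.0]), mu=1.0, L=10.0)
# gap <= tol certifies that f(x) is within tol of the optimal value
```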