Tue.2 13:15–14:30 | H 2033 | STO

Large-Scale Stochastic First-Order Optimization (1/2)

Chair: Samuel Horváth | Organizers: Samuel Horváth, Filip Hanzely
13:15

Zheng Qu

joint work with Fei Li

Adaptive primal-dual coordinate descent methods for nonsmooth composite minimization with linear operator

We propose and analyse an inexact augmented Lagrangian (I-AL) algorithm for solving large-scale composite, nonsmooth and constrained convex optimization problems. Each subproblem is solved inexactly with self-adaptive stopping criteria, without requiring the target accuracy to be fixed a priori, as in many existing variants of I-AL methods. In addition, each inner problem is solved by an accelerated coordinate descent method, making the algorithm more scalable when the problem dimension is high.
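
As a rough illustration of the I-AL template (a minimal sketch, not the authors' algorithm), the Python fragment below runs an augmented Lagrangian loop for min f(x) subject to Ax = b, solving each inner problem only up to a self-adaptive tolerance. The helper f_grad, the plain gradient inner loop (standing in for the accelerated coordinate descent solver), and the specific tolerance schedule are all assumptions of this sketch.

    import numpy as np

    def inexact_al(f_grad, A, b, x0, rho=1.0, outer_iters=50):
        """Sketch of an inexact augmented Lagrangian (I-AL) loop for
        min f(x) s.t. Ax = b. The inner problem
            min_x f(x) + <y, Ax - b> + (rho/2)||Ax - b||^2
        is solved only approximately; the gradient loop and the
        tolerance schedule below are illustrative placeholders, not
        the authors' exact inner solver or stopping criteria."""
        x, y = x0.copy(), np.zeros(A.shape[0])
        for k in range(outer_iters):
            tol_k = 1.0 / (k + 1) ** 2           # self-adaptive inner tolerance
            for _ in range(1000):                # inexact inner solve
                g = f_grad(x) + A.T @ (y + rho * (A @ x - b))
                if np.linalg.norm(g) <= tol_k:   # stop once inner accuracy is met
                    break
                x -= 0.01 * g                    # illustrative fixed stepsize
            y += rho * (A @ x - b)               # multiplier (dual) update
        return x, y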

13:40

Pavel Dvurechensky

joint work with Alexander Gasnikov, Alexander Tiurin

A Unifying Framework for Accelerated Randomized Optimization Methods

We consider smooth convex optimization problems with simple constraints and inexactness in the oracle information, such as the value, partial or directional derivatives of the objective function. We introduce a unifying framework which allows us to construct different types of accelerated randomized methods for such problems and to prove convergence rate theorems for them. We focus on accelerated random block-coordinate descent, accelerated random directional search, and an accelerated random derivative-free method.
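
As one concrete instance of the accelerated randomized methods such a framework covers, here is a Python sketch of accelerated random coordinate descent in the APPROX style of Fercoq and Richtárik (uniform sampling of one coordinate per step). The helper grad_i and the coordinate-wise Lipschitz constants L are assumptions of this sketch, not part of the authors' framework.

    import numpy as np

    def accel_rand_cd(grad_i, L, x0, iters=1000, rng=None):
        """Sketch of accelerated random coordinate descent (APPROX
        style). grad_i(y, i) returns the i-th partial derivative of
        the smooth objective; L[i] is the i-th coordinate-wise
        Lipschitz constant. Illustrative only."""
        rng = rng or np.random.default_rng(0)
        n = x0.size
        x, z = x0.copy(), x0.copy()
        theta = 1.0 / n
        for _ in range(iters):
            y = (1 - theta) * x + theta * z
            i = rng.integers(n)                      # uniform coordinate sampling
            z_new = z.copy()
            z_new[i] -= grad_i(y, i) / (n * theta * L[i])  # coordinate step on z
            x = y + n * theta * (z_new - z)          # momentum-style combination
            z = z_new
            theta = (np.sqrt(theta**4 + 4 * theta**2) - theta**2) / 2
        return x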

14:05

Xun Qian

joint work with Peter Richtárik, Robert Gower, Alibek Sailanbayev, Nicolas Loizou, Egor Shulgin

SGD: General Analysis and Improved Rates

We propose a general yet simple theorem describing the convergence of SGD under the arbitrary sampling paradigm. Our analysis builds on the recently introduced notion of expected smoothness and does not rely on a uniform bound on the variance of the stochastic gradients. By specializing our theorem to different mini-batching strategies, such as sampling with replacement and independent sampling, we derive exact expressions for the stepsize as a function of the mini-batch size.
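
The Python sketch below illustrates how an expected-smoothness constant can drive the stepsize of mini-batch SGD as a function of the batch size tau. The specific formula used here is the tau-nice (without-replacement) expression from this line of work, up to constant factors, and the helper grad_i is an assumption of this illustration.

    import numpy as np

    def sgd_minibatch(grad_i, n, L, Lmax, x0, tau=8, iters=1000, rng=None):
        """Sketch of mini-batch SGD with an expected-smoothness-based
        stepsize. For tau-nice sampling (batches of size tau drawn
        without replacement), one expression for the expected
        smoothness constant is
            L(tau) = n(tau-1)/(tau(n-1)) * L + (n-tau)/(tau(n-1)) * Lmax,
        and we set the stepsize to 1/(2 L(tau)); constants are
        illustrative, not the paper's exact ones. grad_i(x, i) is the
        gradient of the i-th component function."""
        rng = rng or np.random.default_rng(0)
        L_tau = (n * (tau - 1) / (tau * (n - 1))) * L \
              + ((n - tau) / (tau * (n - 1))) * Lmax
        gamma = 1.0 / (2.0 * L_tau)              # stepsize as a function of tau
        x = x0.copy()
        for _ in range(iters):
            batch = rng.choice(n, size=tau, replace=False)  # tau-nice sampling
            g = sum(grad_i(x, i) for i in batch) / tau      # mini-batch gradient
            x = x - gamma * g
        return x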