joint work with Ioannis Paschalidis, Artin Spiridonoff
We consider stochastic gradient descent in a network of n nodes: every node knows a strongly convex function, and the goal is to compute the minimizer of the sum of these n functions. At each time step, every node can broadcast a message to its neighbors in a (directed) network and query its own function for a noisy gradient. We describe a method with a property we call asymptotic network independence: in the limit, the distributed system converges to the global minimizer as fast as a centralized method that can obtain all n noisy gradients at every time step.
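To illustrate the setting (not the paper's specific method), here is a minimal sketch of consensus-based distributed SGD: each node holds a strongly convex quadratic, averages its iterate with its neighbors through a doubly stochastic mixing matrix, and then takes a noisy local gradient step. The ring network, noise level, and step-size schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                  # number of nodes
b = rng.normal(size=n)                 # node i holds f_i(x) = 0.5*(x - b_i)^2; global optimum is b.mean()

# Ring network: each node averages with its two neighbors (doubly stochastic W).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

x = np.zeros(n)                        # one scalar iterate per node
for t in range(1, 20001):
    x = W @ x                                          # consensus: mix with neighbors
    noisy_grad = (x - b) + 0.1 * rng.normal(size=n)    # noisy local gradients
    x -= (1.0 / t) * noisy_grad                        # diminishing step size

print("node iterates:    ", x)
print("global minimizer: ", b.mean())   # iterates should cluster near this value
```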
joint work with Alexey Kroshnin, Darina Dvinskikh, Nazarii Tupitsa, Cesar A Uribe, Pavel Dvurechensky
We study two distributed algorithms for approximating the Wasserstein barycenter of a set of m discrete measures, each supported on n points. The first approach is based on the Iterative Bregman Projections (IBP) algorithm, for which our novel analysis gives a complexity bound proportional to $mn^2/\varepsilon^2$ for approximating the original non-regularized barycenter to accuracy $\varepsilon$. Using an alternative accelerated-gradient-descent-based approach, we obtain a complexity proportional to $mn^{2.5}/\varepsilon$.
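For concreteness, below is a minimal (non-distributed) sketch of IBP for the entropy-regularized barycenter of m discrete measures on a common n-point support, in the spirit of Benamou et al.; the cost matrix, regularization parameter gamma, and uniform weights are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 50
P = rng.random((m, n))
P /= P.sum(axis=1, keepdims=True)               # m input measures on n points
support = np.linspace(0.0, 1.0, n)
C = (support[:, None] - support[None, :]) ** 2  # squared-distance cost matrix
gamma = 0.01                                    # entropic regularization
K = np.exp(-C / gamma)                          # Gibbs kernel
w = np.full(m, 1.0 / m)                         # uniform barycenter weights

u = np.ones((m, n))
v = np.ones((m, n))
for _ in range(500):
    u = P / (v @ K.T)                # project onto the input-marginal constraints
    a = u @ K                        # kernel part of each barycenter-side marginal
    q = np.exp(w @ np.log(v * a))    # barycenter = weighted geometric mean of marginals
    v = q / a                        # project onto the common-marginal constraint

print("approximate barycenter (first entries):", q[:5])
```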
joint work with Aryan Mokhtari, Cesar A Uribe, Suvrit Sra, Ali Jadbabaie
We study gradient-based optimization methods obtained by directly discretizing a second-order heavy-ball ordinary differential equation (ODE). When the function is sufficiently smooth and convex, we show that acceleration can be achieved by a stable discretization using standard Runge-Kutta integrators. Furthermore, we introduce a new local flatness condition under which rates even faster than Nesterov’s method can be achieved with low-order integrators. Similar techniques can be applied to strongly convex first-order optimization problems and to distributed optimization problems.
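As a rough illustration of the direct-discretization idea (not the paper's exact scheme), the sketch below integrates the accelerated ODE of Su, Boyd, and Candes, $\ddot{x} + (3/t)\dot{x} + \nabla f(x) = 0$, with a classical fourth-order Runge-Kutta step; the objective f, step size h, and starting time are assumptions.

```python
import numpy as np

def grad_f(x):
    return 2.0 * x            # gradient of f(x) = x^2 (smooth and convex)

def ode(t, z):
    # First-order form: z = (x, v), with x' = v and v' = -(3/t) v - grad f(x).
    x, v = z
    return np.array([v, -(3.0 / t) * v - grad_f(x)])

def rk4_step(t, z, h):
    # Classical fourth-order Runge-Kutta update.
    k1 = ode(t, z)
    k2 = ode(t + h / 2, z + (h / 2) * k1)
    k3 = ode(t + h / 2, z + (h / 2) * k2)
    k4 = ode(t + h, z + h * k3)
    return z + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

t, h = 1.0, 0.1
z = np.array([5.0, 0.0])      # start at x = 5 with zero velocity
for _ in range(200):
    z = rk4_step(t, z, h)
    t += h
print("x after integration:", z[0])   # should approach the minimizer x* = 0
```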