joint work with Robert Gower and Joseph Salmon
Recently, it has been shown that the step sizes of a family of stochastic variance-reduced gradient methods, called the JacSketch methods, depend on the expected smoothness constant. We provide closed-form expressions for this constant, which lead to larger step sizes and thus faster convergence, together with numerical experiments verifying these bounds. We also suggest a new practice for setting the mini-batch and step sizes for SAGA, a member of the JacSketch family. Furthermore, we can now show that the total complexity of the SAGA algorithm decreases linearly in the mini-batch size up to an optimal value.
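To make the setting concrete, here is a minimal mini-batch SAGA sketch in Python for a least-squares finite sum. The quadratic objective, the b-nice sampling loop, and the fact that the step size gamma is simply a user input are illustrative assumptions for this sketch, not the step-size prescription derived in the talk.

```python
import numpy as np

def minibatch_saga(A, y, gamma, b, n_epochs=50, seed=0):
    """Minimal mini-batch SAGA sketch for f(w) = 1/(2n) ||Aw - y||^2.

    gamma : step size (a user input here; in the talk's setting it would be
            chosen from the expected smoothness constant).
    b     : mini-batch size.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    J = np.zeros((n, d))          # stored per-sample gradients (SAGA memory)
    grad_avg = J.mean(axis=0)     # running average of the stored gradients

    for _ in range(n_epochs * (n // b)):
        S = rng.choice(n, size=b, replace=False)        # b-nice sampling
        # Fresh per-sample gradients on the mini-batch: a_i (a_i^T w - y_i).
        g_new = A[S] * (A[S] @ w - y[S])[:, None]       # shape (b, d)
        # SAGA direction: unbiased estimate of the full gradient.
        v = g_new.mean(axis=0) - J[S].mean(axis=0) + grad_avg
        w -= gamma * v
        # Refresh the stored gradients and their running average.
        grad_avg += (g_new - J[S]).sum(axis=0) / n
        J[S] = g_new
    return w
```

The sketch only fixes the algorithmic skeleton; the point of the talk is how gamma and b should be set (via the closed-form expected smoothness expressions) inside exactly this kind of loop.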
joint work with Pontus Giselsson
We propose a convergence acceleration scheme for optimization problems in stochastic settings. We use Anderson acceleration in combination with a variance-reduced stochastic gradient method and show that the iterates are stochastically quasi-Fejér monotone, which allows us to prove almost sure convergence of the method. We apply the acceleration technique to the variance-reduced stochastic gradient algorithms SVRG and SAGA to demonstrate its performance gains.
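As a reference point, the sketch below shows plain (deterministic) Anderson acceleration with memory m applied to a generic fixed-point map g, for example a gradient-descent step w -> w - eta * grad f(w). Treating g as deterministic and omitting damping and regularization are simplifying assumptions, so this is not the stochastic, variance-reduced scheme analysed in the talk.

```python
import numpy as np

def anderson_accelerate(g, x0, m=5, n_iter=100):
    """Minimal (type-II, undamped) Anderson acceleration sketch for a map g.

    g : callable fixed-point map x -> g(x); in the talk's setting this would
        involve a variance-reduced stochastic gradient step.
    m : memory, i.e. number of past residuals used in the mixing step.
    """
    x = np.asarray(x0, dtype=float)
    X_hist, F_hist = [], []                      # mapped iterates and residuals
    for _ in range(n_iter):
        gx = g(x)
        f = gx - x                               # fixed-point residual
        X_hist.append(gx)
        F_hist.append(f)
        if len(F_hist) > m + 1:                  # keep at most m+1 past pairs
            X_hist.pop(0)
            F_hist.pop(0)
        k = len(F_hist)
        if k == 1:
            x = gx                               # plain fixed-point step
            continue
        # Solve min_alpha ||sum_j alpha_j f_j||  s.t.  sum_j alpha_j = 1,
        # via the unconstrained residual-differences formulation.
        F = np.stack(F_hist, axis=1)             # d x k
        dF = F[:, 1:] - F[:, :-1]                # d x (k-1)
        gamma, *_ = np.linalg.lstsq(dF, F[:, -1], rcond=None)
        alpha = np.zeros(k)
        alpha[0] = gamma[0]
        alpha[1:-1] = gamma[1:] - gamma[:-1]
        alpha[-1] = 1.0 - gamma[-1]
        # Accelerated iterate: mix the mapped history with weights alpha.
        x = np.stack(X_hist, axis=1) @ alpha
    return x
```

In the stochastic setting of the talk, g would be replaced by a variance-reduced gradient update (SVRG- or SAGA-style), which is what makes the quasi-Fejér monotonicity argument necessary for the convergence proof.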
We consider distributed optimization where the objective function is spread among different devices, each sending incremental model updates to a central server. To alleviate the communication bottleneck, recent work has proposed various schemes to compress (e.g. quantize or sparsify) the gradients, thereby introducing additional variance that may slow down convergence. We provide a unified analysis of these approaches in strongly convex, weakly convex, and non-convex settings, and give the first methods that achieve linear convergence for arbitrary quantized updates.
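For illustration, the toy simulation below runs distributed gradient descent in which each worker compresses its local gradient with a top-k sparsifier before the server averages the messages. The compressor choice, the single-process simulation, and the absence of any variance-correction mechanism are assumptions made here for brevity; this naive scheme only exhibits the compression error the abstract refers to, not the linearly convergent methods it announces.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_distributed_gd(grads, w0, gamma, k, n_rounds=200):
    """Toy single-process simulation of distributed GD with compressed updates.

    grads : list of callables; grads[i](w) is worker i's local gradient.
    k     : sparsity level of the (illustrative) top-k compressor.
    """
    w = np.asarray(w0, dtype=float)
    n_workers = len(grads)
    for _ in range(n_rounds):
        # Each worker compresses its local gradient before "sending" it.
        msgs = [top_k(g(w), k) for g in grads]
        # The server averages the compressed messages and updates the model.
        w -= gamma * sum(msgs) / n_workers
    return w
```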