ICCOPT 2019

13:45

Yin Zhang

Data clustering: a subspace-matching point of view and K-scalable algorithms

Data clustering is a fundamental unsupervised learning strategy. The most popular clustering method is arguably the K-means algorithm (K being the number of clusters), albeit it is usually applied not directly to a dataset given, but to its image in a processed "feature space" (such as the case in spectral clustering). The K-means algorithm, however, has been observed to have a deteriorating performance as K increases. In this talk, we will examine some clustering models from a subspace-matching viewpoint and promote a so-called K-indicators model. We apply a partial convex-relaxation scheme to this model and construct an "essentially deterministic" algorithm requiring no randomized initialization. We will present theoretical results to justify the proposed algorithm and give extensive numerical results to show its superior scalability over K-means as the number of clusters grows.

14:10

Shuzhong Zhang

joint work with Tianyi Lin, Bo Jiang

Optimization Methods for Learning Models: Adaptation, Acceleration, and Subsampling

In this talk we present a suite of algorithms for solving optimization models arising from applications in machine learning and statistics. Popular optimization methods for solving such problems include the high-order tensor approximation approach, which requires the knowledge on some problem parameters. To make such methods practical, one will need to find ways of implementation without such knowledge. We discuss methods that exhibit an accelerated iteration bound while maintaining the traits of being adaptive.

14:35

Bo Jiang

joint work with Haoyue Wang, Shuzhong Zhang

An Optimal High-Order Tensor Method for Convex Optimization

Optimization algorithms using up to the d-th order derivative information of the iterative solutions are called high-order tensor methods. In this talk, we propose a new high-order tensor algorithm for the general composite case, with the iteration complexity of $O(1/k^(3d+1)/2)$, which matches the lower bound for the d-th order tensor methods, hence is optimal. Our approach is based on the so-called Accelerated Hybrid Proximal Extragradient (A-HPE) framework.

Optimization Methods for Machine Learning and Data Science