Tue.1 10:30–11:45 | H 0104 | STO

Stochastic Approximation and Reinforcement Learning (3/3)

Chair: Alec Koppel | Organizer: Alec Koppel
10:30

Andrzej Ruszczynski

joint work with Saeed Ghadimi, Mengdi Wang

A Single Time-scale Stochastic Approximation Method for Nested Stochastic Optimization

We propose a new stochastic approximation algorithm to find an approximate stationary point of a constrained nested stochastic optimization problem. The algorithm maintains two auxiliary averaged sequences (filters) that estimate the gradient of the composite objective function and the inner function value. By using a special Lyapunov function, we show a sample complexity of $O(1/\epsilon^{2})$ for finding an $\epsilon$-approximate stationary point, thus outperforming all extant methods for nested stochastic approximation. We discuss applications to stochastic equilibria and learning in dynamical systems.
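
For orientation, here is a minimal sketch of the two-filter idea on a toy quadratic problem. All names, step sizes, and the problem instance are illustrative assumptions; the exact step-size coupling, projection, and correction terms of the method presented in the talk are not reproduced.

```python
import numpy as np

# Single time-scale sketch for nested problems
#   min_x  f(g(x)),  g(x) = E[G(x, xi)]
# using two averaged sequences ("filters"):
#   u_k tracks the inner value g(x_k)
#   z_k tracks the composite gradient J_g(x_k)^T grad_f(g(x_k))
# Simplified illustration only, not the exact algorithm from the talk.

rng = np.random.default_rng(0)
d, m = 5, 3
A = rng.standard_normal((m, d))

def G_sample(x):            # noisy inner map: G(x, xi) = A x + xi
    return A @ x + 0.1 * rng.standard_normal(m)

def J_sample(x):            # (noiseless here) Jacobian of G at x
    return A

def grad_f(y):              # outer function f(y) = 0.5 ||y||^2
    return y

x = rng.standard_normal(d)
u = G_sample(x)             # filter for the inner function value
z = np.zeros(d)             # filter for the composite gradient
tau, a, b = 0.05, 0.5, 0.5  # single time scale: all updates move at O(tau)

for k in range(2000):
    u = (1 - b * tau) * u + b * tau * G_sample(x)
    z = (1 - a * tau) * z + a * tau * (J_sample(x).T @ grad_f(u))
    x = x - tau * z         # a projection onto X would go here if constrained

print("final ||x||:", np.linalg.norm(x))  # near 0 for this toy problem
```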

10:55

Vivak Patel

Using Statistical Filters for Stochastic Optimization

Stochastic optimization methods fall into three general paradigms: the Monte Carlo (MC)/Sample Average Approximation (SAA) approach, the Bayesian Optimization (BO) approach, and the Stochastic Approximation (SA) approach. While these paradigms enjoy strong theoretical support, the resulting methods are not always practical, for well-documented reasons. In this talk, we introduce a fourth paradigm for solving such optimization problems using statistical filters.
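
The abstract does not specify the construction. One plausible reading, sketched below with purely hypothetical names and noise models, is to treat each noisy gradient as a measurement of the true gradient and smooth it with a simple Kalman-style filter before taking a descent step.

```python
import numpy as np

# Hedged illustration: a scalar Kalman-style filter tracking the gradient
# inside a stochastic gradient loop. The model and all constants below are
# assumptions for illustration, not the construction from the talk.

rng = np.random.default_rng(1)
d = 10
x = rng.standard_normal(d)

def noisy_grad(x):          # gradient of 0.5 ||x||^2 plus unit noise
    return x + rng.standard_normal(d)

g_hat = noisy_grad(x)       # filtered gradient estimate
P = 1.0                     # scalar estimate variance
Q, R = 0.1, 1.0             # process / measurement noise (assumed known)
step = 0.1

for k in range(500):
    P = P + Q                           # predict: gradient drifts as x moves
    K = P / (P + R)                     # Kalman gain
    g_hat = g_hat + K * (noisy_grad(x) - g_hat)  # fold in a new measurement
    P = (1 - K) * P
    x = x - step * g_hat                # descent step on the filtered gradient

print("final ||x||:", np.linalg.norm(x))
```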

11:20

Angeliki Kamoutsi

joint work with Tobias Sutter, John Lygeros

Randomized methods and PAC bounds for data-driven inverse stochastic optimal control

This work studies discrete-time Markov decision processes (MDPs) with continuous state and action spaces and addresses the inverse problem of inferring a cost function from observed optimal behavior. We propose a method that avoids repeatedly solving the forward problem while providing probabilistic performance guarantees on the quality of the recovered solution. Our approach is based on the LP formulation of MDPs, complementary slackness optimality conditions, recent developments in randomized convex optimization, and uniform finite-sample bounds from statistical learning theory.
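
For orientation, a sketch of the standard LP viewpoint the abstract invokes; the notation ($\gamma$, $P$, $\nu_0$, $\pi^\ast$) is assumed here and need not match the talk's.

```latex
% Forward LP for a discounted MDP: the optimal value function solves
\begin{align*}
  \max_{v}\ & \int_{S} v(s)\,\nu_0(\mathrm{d}s) \\
  \text{s.t.}\ & v(s) \le c(s,a) + \gamma \int_{S} v(s')\,P(\mathrm{d}s' \mid s,a)
      \quad \forall (s,a) \in S \times A.
\end{align*}
% Inverse problem (sketch): given observations from an optimal policy
% $\pi^\ast$, seek a cost $c$ (and a value $v$) making the constraint tight
% along the observed behavior, i.e., complementary slackness:
\begin{equation*}
  v(s) = c(s,\pi^\ast(s))
       + \gamma \int_{S} v(s')\,P(\mathrm{d}s' \mid s,\pi^\ast(s)).
\end{equation*}
```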