ORIE Colloquium
Description: Online marketplaces and recommendation systems rely on historical data to optimize content and user interactions. Moreover, the data generated from these interactions is fed back into the system and used to optimize future interactions. As this cycle continues, good performance requires algorithms capable of learning actively through sequential interactions, systematically experimenting to improve future performance, and balancing this experimentation with the desire to make decisions with the most immediate benefit. Thompson Sampling is a surprisingly simple and flexible Bayesian heuristic for handling this exploration-exploitation tradeoff in online decision problems. While this basic algorithmic technique can be traced back to 1933, the last five years have seen unprecedented growth in both the theoretical understanding of this method and commercial interest in it. In this talk, I will discuss our work on the design and analysis of Thompson Sampling-based algorithms for several classes of multi-armed bandit, online assortment selection, and reinforcement learning problems. We demonstrate that natural versions of the Thompson Sampling heuristic achieve near-optimal theoretical performance bounds for these problems, along with attractive empirical performance.
This talk is based on joint work with Vashist Avadhanula, Navin Goyal, Vineet Goyal, Randy Jia, and Assaf Zeevi.
Bio: Shipra Agrawal is an Assistant Professor in the Department of Industrial Engineering and Operations Research and the Data Science Institute at Columbia University. She received her Ph.D. from Stanford University in June 2011 and was a researcher at Microsoft Research India from July 2011 to August 2015. Her research spans several areas of optimization and machine learning, including online optimization under uncertainty, multi-armed bandits, online learning, and reinforcement learning. She is also interested in prediction markets, game theory, and mechanism design. Her application areas of interest include internet advertising, recommendation systems, revenue management, and resource allocation. Shipra serves as an associate editor for the journals Management Science (Optimization area) and Mathematics of Operations Research (Learning Theory area), and is a member of the ACM Future of Computing Academy. She is the recipient of a 2017 Google Faculty Research Award and a 2017 Amazon Research Award.