Abstract
Motivated by practical needs such as large-scale learning, we study the impact of adaptivity constraints on online learning and decision-making problems. Unlike traditional online learning problems, which allow full adaptivity at the per-time-step scale, our work investigates models where the learning strategy cannot change frequently, which enables the possibility of parallelization.
In this talk, I will focus on batch learning, a particular learning-with-limited-adaptivity model, and show that only O(log log T) batches are needed to achieve the optimal regret for the popular linear contextual bandit problem. Along the way, I will also introduce the distributional optimal design, a natural extension of the optimal experiment design in statistical learning, and present our statistically and computationally efficient learning algorithm for the distributional optimal design, which may be of independent interest.
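To give a concrete feel for why O(log log T) batches suffice, the batched-bandit literature commonly uses a geometric grid of batch endpoints t_k = T^(1 - 2^(-k)), which reaches horizon T after roughly log2(log2 T) doublings of the exponent. The sketch below (the function name `batch_grid` and the exact clamping are illustrative assumptions, not the talk's actual algorithm) computes such a grid:

```python
import math

def batch_grid(T):
    """Endpoints of the geometric batch grid t_k = T^(1 - 2^-k).

    The number of batches M is O(log log T); the final endpoint is
    clamped to the horizon T. Illustrative sketch only.
    """
    M = math.ceil(math.log2(math.log2(T))) + 1
    grid = [min(int(T ** (1 - 2.0 ** (-k))), T) for k in range(1, M)]
    grid.append(T)  # last batch always runs to the horizon
    return grid
```

For example, with T = 1,000,000 this grid has only 6 batches, so the learner updates its strategy 6 times rather than once per round.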
Time
2021-06-18 16:30-17:00
Speaker
Yuan Zhou, University of Illinois at Urbana-Champaign
Room
Guangdong Hotel Shanghai