CSIP Seminar: Beyond UCB: The Curious Case of Non-linear Ridge Bandits

Center for Signals and Information Processing (CSIP) Seminar

Date: Tuesday, April 11, 2023

Time: 3:00 p.m. - 4:00 p.m. ET

Location: Centergy Building 5126. Zoom link: https://gatech.zoom.us/j/99851266161

Speaker: Nived Rajaraman

Speaker's Title: Fourth-year Ph.D. student in the EECS Department at Berkeley

Seminar Title: Beyond UCB: The Curious Case of Non-linear Ridge Bandits

Abstract: There is a large volume of work on bandits and reinforcement learning when the reward/value function satisfies some form of linearity. But what happens if the reward is non-linear? Two curious phenomena arise for non-linear bandits: first, in addition to the "learning phase" with a standard regret, there is an "initialization phase" with a fixed sample cost determined by the nature of the reward function; second, achieving the smallest sample cost in the initialization phase requires new learning algorithms beyond traditional ones such as UCB.

For a special family of non-linear bandits whose reward takes the form of a "ridge" function built on a non-linear monotone function, we derive upper and lower bounds on the optimal fixed cost of learning and, in addition, on the entire "learning trajectory" via differential equations. In particular, we propose a two-stage exploration algorithm that first finds a good initialization and subsequently exploits local linearity in the learning phase. We prove that this algorithm is statistically optimal. In contrast, several classical and celebrated algorithms, such as UCB and algorithms relying on online/offline regression oracles, are proven to be suboptimal.

This talk is based on recent joint work with Yanjun Han, Jiantao Jiao, and Kannan Ramchandran: https://arxiv.org/abs/2302.06025.

Speaker Bio: Nived Rajaraman is a fourth-year Ph.D. student in the EECS Department at Berkeley, advised by Jiantao Jiao and Kannan Ramchandran. He received his undergraduate degree from IIT Madras in 2019. His research interests lie in reinforcement learning, online learning and bandits, and statistical machine learning and its interplay with non-convex optimization.

Contact Information

Kiran Kokilepersaud
kpk6@gatech.edu

This event is open to:

Invited audience

Faculty/Staff

Public

Undergraduate students

Georgia Institute of Technology