Time-Varying Gaussian Process Bandit Optimization with Experts: No-Regret in Logarithmically-Many Side Queries

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We study a time-varying Bayesian optimization problem with bandit feedback, where the reward function belongs to a Reproducing Kernel Hilbert Space (RKHS). We approach the problem via an upper-confidence-bound Gaussian Process algorithm, which has been proven to be no-regret in the stationary case. The time-varying case is more challenging, and no-regret results are in general out of reach in the standard setting. We therefore ask how many additional observations must be requested from an expert to regain the no-regret property. To do so, we model the presence of past observations via an uncertainty injection procedure and reframe the problem as a heteroscedastic Gaussian Process regression. In addition, to achieve a no-regret result, we discard long-outdated observations and replace them with updated (possibly very noisy) ones obtained by querying an external expert. By leveraging sparse inference and extending it to the heteroscedastic case, we secure a no-regret result in a challenging time-varying setting with only logarithmically many side queries per time step. Our method demonstrates that minimal additional information suffices to counteract temporal drift, ensuring efficient optimization despite time variation.
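As a rough illustration of the two ingredients named in the abstract (heteroscedastic Gaussian Process regression and an upper confidence bound rule), the sketch below inflates the noise variance of each past observation with its age and then picks the point with the highest upper confidence bound. It is not the paper's algorithm: the linear inflation rule `drift * age`, the RBF kernel, and all function and parameter names (`hetero_gp_posterior`, `ucb_choice`, `base_noise`, `drift`, `beta`) are assumptions made here for illustration, and the expert queries and sparse inference step are omitted.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def hetero_gp_posterior(x_obs, y_obs, noise_var, x_query, lengthscale=0.2):
    """GP posterior mean and variance with one noise variance per observation."""
    # Heteroscedastic noise enters as a diagonal matrix, not a scalar times I.
    K = rbf_kernel(x_obs, x_obs, lengthscale) + np.diag(noise_var)
    K_s = rbf_kernel(x_obs, x_query, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    # Prior variance is 1 for this kernel; clip tiny negative round-off.
    var = np.maximum(1.0 - np.sum(v ** 2, axis=0), 0.0)
    return mean, var


def ucb_choice(x_obs, y_obs, obs_times, t_now, x_grid,
               base_noise=0.01, drift=0.05, beta=2.0):
    """Pick the grid point maximizing mean + sqrt(beta * var).

    Assumed (hypothetical) uncertainty injection: an observation that is
    `age` steps old gets noise variance base_noise + drift * age.
    """
    age = t_now - obs_times
    noise_var = base_noise + drift * age
    mean, var = hetero_gp_posterior(x_obs, y_obs, noise_var, x_grid)
    ucb = mean + np.sqrt(beta * var)
    return x_grid[np.argmax(ucb)], mean, var
```

Under these assumptions, an old observation constrains the posterior less than a fresh one at the same location, so the rule tends to revisit regions whose information has gone stale.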

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Proceedings
Editors: Rita P. Ribeiro, Carlos Soares, João Gama, Bernhard Pfahringer, Nathalie Japkowicz, Pedro Larrañaga, Alípio M. Jorge, Pedro H. Abreu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 164-182
Number of pages: 19
ISBN (Print): 9783032060952
DOIs
Publication status: Published - 1 Jan 2026
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025 - Porto, Portugal
Duration: 15 Sept 2025 to 19 Sept 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16017 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025
Country/Territory: Portugal
City: Porto
Period: 15/09/25 to 19/09/25

Keywords

  • Bandit feedback
  • Gaussian Processes
  • Sparse inference
  • Time-varying optimization
  • Upper confidence bounds
