Online learning and blackwell approachability in quitting games

Research output: Contribution to journalConference articlepeer-review

Abstract

We consider the sequential decision problem known as regret minimization, or more precisely its generalization to the vectorial or multi-criteria setup called Blackwell approachability. We assume that Nature, the decision maker, or both, might have some quitting (or terminating) actions so that the stream of payoffs is constant whenever they are chosen. We call those environments “quitting games”. We characterize convex target sets C that are Blackwell approachable, in the sense that the decision maker has a policy ensuring that the expected average vector payoff converges to C at some given horizon known in advance. Moreover, we also compare these results to the cases where the horizon is not known and show that, unlike in standard online learning literature, the necessary or sufficient conditions for the anytime version of this problem are drastically different than those for the fixed horizon.

Original languageEnglish
Pages (from-to)941-942
Number of pages2
JournalJournal of Machine Learning Research
Volume49
Publication statusPublished - 6 Jun 2016
Externally publishedYes
Event29th Conference on Learning Theory, COLT 2016 - New York, United States
Duration: 23 Jun 201626 Jun 2016

Keywords

  • Absorbing Games
  • Blackwell Approachability
  • Markov Decision Process
  • Online Learning

Fingerprint

Dive into the research topics of 'Online learning and blackwell approachability in quitting games'. Together they form a unique fingerprint.

Cite this