Batched bandit problems

Research output: Contribution to journalArticlepeer-review

Abstract

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

Original languageEnglish
Pages (from-to)660-681
Number of pages22
JournalAnnals of Statistics
Volume44
Issue number2
DOIs
Publication statusPublished - 1 Apr 2016
Externally publishedYes

Keywords

  • Batches
  • Grouped clinical trials, sample size determination, switching cost
  • Multi-armed bandit problems
  • Multi-phase allocation
  • Regret bounds

Fingerprint

Dive into the research topics of 'Batched bandit problems'. Together they form a unique fingerprint.

Cite this