
From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits

Research output: Chapter in a book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

The stochastic multi-arm bandit problem has been extensively studied under standard assumptions on the arms' distributions (e.g., bounded with known support, exponential family, etc.). These assumptions are suitable for many real-world problems, but they sometimes require knowledge (of the tails, for instance) that may not be precisely accessible to the practitioner, raising the question of the robustness of bandit algorithms to model misspecification. In this paper we study a generic Dirichlet Sampling (DS) algorithm, based on pairwise comparisons of empirical indices computed with re-sampling of the arms' observations and a data-dependent exploration bonus. We show that different variants of this strategy achieve provably optimal regret guarantees when the distributions are bounded, and logarithmic regret for semi-bounded distributions with a mild quantile condition. We also show that a simple tuning achieves robustness with respect to a large class of unbounded distributions, at the cost of slightly worse than logarithmic asymptotic regret. We finally provide numerical experiments showing the merits of DS in a decision-making problem on synthetic agriculture data.
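The idea of comparing a re-sampled index against another arm's empirical mean can be illustrated with a short sketch. This is not the paper's algorithm: the function names, the fixed `bonus` value, and the choice of uniform Dirichlet(1, ..., 1) weights are assumptions made here for illustration, and the record above does not specify how the data-dependent bonus is computed.

```python
import random


def dirichlet_weights(n, rng):
    """Draw n weights from a Dirichlet(1, ..., 1) law (uniform on the simplex)."""
    # Normalized i.i.d. Exp(1) draws follow a Dirichlet(1, ..., 1) distribution.
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def resampled_index(observations, bonus, rng):
    """Dirichlet-weighted average of the observations plus one bonus point.

    `bonus` is a hypothetical placeholder for a data-dependent exploration
    bonus; it is passed in as a plain number in this sketch.
    """
    points = list(observations) + [bonus]
    weights = dirichlet_weights(len(points), rng)
    return sum(w * x for w, x in zip(weights, points))


def challenger_wins(challenger_obs, leader_obs, bonus, rng):
    """Pairwise comparison: re-sampled challenger index vs. the leader's mean."""
    leader_mean = sum(leader_obs) / len(leader_obs)
    return resampled_index(challenger_obs, bonus, rng) >= leader_mean


# Illustrative data only: a well-sampled arm against a rarely sampled one.
rng = random.Random(0)
leader = [0.6, 0.7, 0.5, 0.65, 0.7]
challenger = [0.4, 0.9]
print(challenger_wins(challenger, leader, bonus=1.0, rng=rng))
```

Because the weights are random, an arm with few observations and a large bonus point will sometimes win the comparison even when its empirical mean is lower, which is how a re-sampling scheme of this kind can induce exploration.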

Original language: English
Title: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Publisher: Neural information processing systems foundation
Pages: 14029-14041
Number of pages: 13
ISBN (Electronic): 9781713845393
Status: Published - 1 Jan 2021
Externally published: Yes
Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: 6 Dec 2021 to 14 Dec 2021

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 17
ISSN (Print): 1049-5258

Conference

Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
City: Virtual, Online
Period: 6/12/21 to 14/12/21
